YaCy Packs
YaCy Packs are ready-made index bundles that let you plug knowledge into any YaCy instance in minutes.
Think of them as collectible “data cartridges” you can mix, stack, or remove at will, with no hours of crawling
and no external dependencies.
These are the advantages of YaCy Packs:
- Instant knowledge: skip multi-day crawls.
- Mix-and-match: combine core + scroll for a dev-knowledge box.
- Privacy: all data ships offline; nothing phones home.
- Reproducible: same pack → same search experience on any node.
- Shareable: send a friend the pack; they drop it into DATA/PACKS/hold.
File naming (one glance tells the story)
YaCy Packs are files whose names should follow this structure:
YaCy-Pack_<category>-<tier>-<origin>_<slug>_<YYYYMMDD>.jsonl
The name of a pack carries metadata about its content. It is therefore not only a label but also a qualification of the size, the type of content, the amount of work done to create the pack, and more. Here are the details:
Part | Values & rules |
---|---|
category | one of: core, scroll, codex, gem, fiction, map, echo, spirit, vault |
tier | one of: common, uncommon, rare, epic, legendary |
origin | one of: web, synth, archive, sim, mirror, corpus, partial |
slug | description of the content; should be the collection name of the index |
YYYYMMDD | four-digit year + two-digit month + two-digit day (production date) |
Example: YaCy-Pack_scroll-common-web_top-1000_20250801.jsonl
The group <category>-<tier>-<origin> holds the metadata about the content, while <slug> is a free-text
field that the creator can use to describe the content of the pack. We encourage every creator to simply
use the name of the collection, which can be assigned during a web crawl, as the <slug>. We also encourage
every user of YaCy to set that collection name with the possible creation of a pack in mind.
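The naming rules above can be checked mechanically. The following is a minimal sketch, not part of YaCy itself: the function name `parse_pack_name` and the regular expression are illustrative assumptions, and the sketch assumes that a slug may contain any characters (including underscores) as long as the name ends in `_<YYYYMMDD>.jsonl`.

```python
import re
from datetime import datetime

# Allowed values, copied from the naming table above.
CATEGORIES = {"core", "scroll", "codex", "gem", "fiction",
              "map", "echo", "spirit", "vault"}
TIERS = {"common", "uncommon", "rare", "epic", "legendary"}
ORIGINS = {"web", "synth", "archive", "sim", "mirror", "corpus", "partial"}

# Hypothetical pattern for YaCy-Pack_<category>-<tier>-<origin>_<slug>_<YYYYMMDD>.jsonl
PACK_NAME = re.compile(
    r"^YaCy-Pack_(?P<category>[a-z]+)-(?P<tier>[a-z]+)-(?P<origin>[a-z]+)"
    r"_(?P<slug>.+)_(?P<date>\d{8})\.jsonl$"
)


def parse_pack_name(filename):
    """Split a pack file name into its parts; raise ValueError if it is invalid."""
    match = PACK_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a pack file name: {filename!r}")
    parts = match.groupdict()
    if parts["category"] not in CATEGORIES:
        raise ValueError(f"unknown category: {parts['category']!r}")
    if parts["tier"] not in TIERS:
        raise ValueError(f"unknown tier: {parts['tier']!r}")
    if parts["origin"] not in ORIGINS:
        raise ValueError(f"unknown origin: {parts['origin']!r}")
    # strptime raises ValueError for impossible dates such as 20250230.
    datetime.strptime(parts["date"], "%Y%m%d")
    return parts


print(parse_pack_name("YaCy-Pack_scroll-common-web_top-1000_20250801.jsonl"))
```

For the example file name, the function returns the category `scroll`, the tier `common`, the origin `web`, the slug `top-1000`, and the date `20250801`.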
Categories at a glance
The <category> part of the pack name makes it easy to understand what the content is about. We define
the following category names, which are available during pack generation:
Category | What’s inside |
---|---|
core | technical documentation, operating systems, computer hardware, open source and free software, manuals, protocol standards |
scroll | non-technical documents: knowledge, encyclopedia, linguistic corpora, dictionaries, translation memories, texts, non-fiction books, historical books |
codex | non-technical standards: industry standards, laws, rules, compliance |
gem | research, papers, university publications, science |
fiction | fictional documents: movies, stories, series, books (fiction, science-fiction) |
map | geological data, geolocation-data, earth/world information |
echo | micro-content (tweets, toots, short headlines, SMS corpora), podcasts, radio archives, audio lectures, spoken-word datasets, logs, incidents, telemetry |
spirit | related to non-textual data (possibly only metadata): art, music, game assets, creative-commons media (non-text culture loot) |
vault | sensitive data: secrets, leaks, non-public documents, security advisories |
Tiers – how hard was this loot to get?
Packs can be seen as 'loot' in games: they give the user a better tool. We want a classification that shows how much work has been done to create a pack. YaCy packs may be created outside of YaCy with other tools (such as a Wikipedia parser), which may be more appropriate for creating very large (multi-million-entry) index files. The tier therefore indicates the value that a pack has.
Tier | Rule of thumb |
---|---|
common | harvestable on a 10-year-old PC within a day |
uncommon | needs fast CPU / big pipe, or > 1 day |
rare | custom parsers or manual curation required |
epic | multi-week crawl or special infrastructure |
legendary | one-off craft or enterprise-level effort |
Life-cycle (where the pack sits)
YaCy Packs are simply JSON-list (.jsonl) files that can be imported. To import a pack, first write it
into the arrival location, the path DATA/PACKS/hold. From this location, the pack is loaded into the
index by moving it to DATA/PACKS/load. A monitoring process in YaCy then detects the file and writes
it into the YaCy index. After that writing is finished, the pack file is moved to DATA/PACKS/loaded.
The full directory tree for YaCy packs is:
DATA/PACKS/
├── hold/
├── load/
├── loaded/
└── unload/
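The step from hold to load is a plain file move, so it can be scripted. The sketch below is only an illustration under stated assumptions: `activate_pack` is a hypothetical helper, not a YaCy API, and it assumes that the directory names match the tree above and that `data_dir` points at the DATA directory of the YaCy instance.

```python
import shutil
from pathlib import Path


def activate_pack(data_dir, pack_name):
    """Move a pack from PACKS/hold to PACKS/load so that YaCy can pick it up.

    data_dir is assumed to be the DATA directory of a YaCy instance.
    Returns the new path of the pack file.
    """
    packs = Path(data_dir) / "PACKS"
    source = packs / "hold" / pack_name
    if not source.is_file():
        raise FileNotFoundError(f"pack not found in hold: {source}")
    load_dir = packs / "load"
    load_dir.mkdir(parents=True, exist_ok=True)
    target = load_dir / pack_name
    if target.exists():
        # Refuse to overwrite a pack that is already waiting to be loaded.
        raise FileExistsError(f"pack already in load: {target}")
    shutil.move(str(source), str(target))
    return target
```

A call such as `activate_pack("DATA", "YaCy-Pack_scroll-common-web_top-1000_20250801.jsonl")` moves that pack into DATA/PACKS/load; the import itself is still carried out by the monitoring process in YaCy.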