CONCEPT Cited by 1 source
Dewey Decimal Classification
Dewey Decimal Classification (DDC) is a widely used library classification system that organises knowledge into ten main classes (000–900), each refined through hierarchical decimal subdivisions. In the sysdesign-wiki corpus it appears as the categorical taxonomy that an LLM-driven document-classification pipeline emits for each page or document.
Stub page. Single-source instance so far. Included on the wiki because DDC is a real architectural choice (a standard public taxonomy with existing tooling, search vocabulary, and cross-archive interoperability), not because the wiki cares about library science.
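To make the "ten main classes with hierarchical decimal subdivisions" structure concrete, here is a minimal Python sketch that splits a DDC-shaped code into its levels. The function name `parse_ddc` and the returned field names are hypothetical, chosen for illustration; the sketch only checks the shape of the notation (three digits, optional decimal extension) and does not verify that a code exists in the published DDC schedules.

```python
import re

# Three digits, optionally followed by a decimal point and more digits.
DDC_PATTERN = re.compile(r"(\d{3})(?:\.(\d+))?")


def parse_ddc(code: str) -> dict:
    """Split a DDC-shaped code into main class, division, section, subdivision.

    Hypothetical helper: validates notation shape only, not schedule membership.
    """
    match = DDC_PATTERN.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a DDC-shaped code: {code!r}")
    base = match.group(1)
    decimals = match.group(2) or ""
    return {
        "main_class": base[0] + "00",   # first digit: one of the ten main classes
        "division": base[:2] + "0",     # second digit: division within the class
        "section": base,                # third digit: section within the division
        "subdivision": decimals,        # digits after the point refine the section
    }


print(parse_ddc("551.49"))
```

Each additional digit narrows the subject, which is why a code can be read left to right as a path from broad class to specific topic.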
Why DDC, architecturally
When you build a document-classification pipeline against a domain archive, you have to pick the output vocabulary:
- Custom domain taxonomy. Define your own categories. Highest precision; zero portability; new partners must learn your tags.
- Domain-specific standard taxonomy (e.g. for hydrogeology: USGS classifications). Mid-precision; portable within the field; not recognised outside it.
- Universal taxonomy (Dewey Decimal, Library of Congress Classification, Universal Decimal Classification). Lower precision per document; maximally portable; instantly queryable by any researcher who knows the system.
The MapAid groundwater pipeline picks the third option, a universal taxonomy in the form of Dewey Decimal codes, because the archive serves humanitarian researchers, university partners, and government agencies across multiple disciplines, and DDC is already a lingua franca for cross-discipline document discovery.
"The model examines each page image and returns: Dewey Decimal classification codes, the universal library classification system…" (Source: sources/2026-05-11-databricks-unlocking-the-archives)
The architectural lesson generalises: when the consumers of your classification span multiple organisations, prefer a public standard taxonomy over a custom one. The cost is per-document precision; the gain is discoverability with no new vocabulary for partners to learn.
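One way to see the discoverability gain: because DDC notation is hierarchical, a plain prefix match is enough to pull every document under a class, with no knowledge of the archive's internals. The sketch below assumes a hypothetical record shape (a dict with `id` and `ddc_codes` keys) and made-up sample documents; none of these names or values come from the source.

```python
def find_by_ddc_prefix(docs: list[dict], prefix: str) -> list[dict]:
    """Return documents with at least one DDC code starting with `prefix`.

    Hypothetical helper: "55" matches 550-559, "551.4" matches 551.4 and below.
    """
    return [
        doc for doc in docs
        if any(code.startswith(prefix) for code in doc["ddc_codes"])
    ]


# Illustrative records only; ids and code assignments are invented.
SAMPLE_DOCS = [
    {"id": "doc-a", "ddc_codes": ["551.49"]},
    {"id": "doc-b", "ddc_codes": ["333.91", "551.48"]},
    {"id": "doc-c", "ddc_codes": ["628.1"]},
]

print([d["id"] for d in find_by_ddc_prefix(SAMPLE_DOCS, "551")])  # ['doc-a', 'doc-b']
```

A custom taxonomy would need a published mapping before an outside researcher could write the equivalent query.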
Seen in
- sources/2026-05-11-databricks-unlocking-the-archives — emitted by the multimodal classification pass alongside Sudanese geographic tags and a water-relevance flag. The judge model scores each document's DDC assignment for accuracy.
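As a rough sketch of what one classified page might look like downstream, the record below bundles the three outputs named above (DDC codes, geographic tags, water-relevance flag) with a judge score. Every name here is an assumption made for illustration: the class `PageClassification`, its fields, the 0.0 to 1.0 score scale, and the 0.7 review threshold are hypothetical and are not described in the source.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PageClassification:
    """Hypothetical per-page output record; field names are illustrative."""
    page_id: str
    ddc_codes: list[str]
    geographic_tags: list[str] = field(default_factory=list)
    water_relevant: bool = False
    judge_score: Optional[float] = None  # assumed scale: 0.0 (wrong) to 1.0 (accurate)


def needs_review(record: PageClassification, threshold: float = 0.7) -> bool:
    """Flag records the judge has not scored, or scored below the threshold."""
    return record.judge_score is None or record.judge_score < threshold


record = PageClassification(
    page_id="page-001",          # invented identifier
    ddc_codes=["551.49"],        # illustrative code
    geographic_tags=["Khartoum"],
    water_relevant=True,
    judge_score=0.9,
)
print(needs_review(record))  # False
```

Keeping the judge score on the same record as the assignment it evaluates makes low-confidence classifications easy to route to human review.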