PATTERN

Five-questions knowledge extraction

Intent

Extract the tribal knowledge the AI coding agent actually needs per module by forcing five targeted questions that drive the depth of the extraction — rather than asking for "documentation" of the module and accepting a surface narrative.

The five questions

Meta's framework (Source: sources/2026-04-06-meta-how-meta-used-ai-to-map-tribal-knowledge-in-large-scale-data-pipelines):

| # | Question | What it extracts |
|---|----------|------------------|
| 1 | What does this module configure? | Surface purpose; module-level comments + docs |
| 2 | What are the common modification patterns? | Recent-commit rhythm; what engineers actually change |
| 3 | What are the non-obvious patterns that cause build failures? | Pure tribal knowledge; failure-oriented |
| 4 | What are the cross-module dependencies? | Subsystem-coupling invariants |
| 5 | What tribal knowledge is buried in code comments? | Inline gotchas, "DO NOT REMOVE" markers, TODOs with blast-radius warnings |
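Meta's write-up states the questions as prose; as a minimal sketch (the keys and structure here are illustrative, not Meta's actual implementation), they could be encoded as structured prompts for an analyst agent:

```python
# The five questions as structured prompts for a module analyst agent.
# Keys are illustrative labels, not part of Meta's published framework.
FIVE_QUESTIONS = [
    ("purpose",       "What does this module configure?"),
    ("modifications", "What are the common modification patterns?"),
    ("failures",      "What are the non-obvious patterns that cause build failures?"),
    ("dependencies",  "What are the cross-module dependencies?"),
    ("comments",      "What tribal knowledge is buried in code comments?"),
]

# Ordered surface-to-deep: each question's prompt can be prefixed with the
# answers to the earlier ones so context accumulates across the sequence.
for key, question in FIVE_QUESTIONS:
    print(f"{key}: {question}")
```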

Meta's explicit finding: Question 5 produced the deepest learnings — 50+ non-obvious patterns like hidden intermediate naming conventions and append-only deprecated-enum rules. "None of this had been written down before."

Why these five specifically

The questions are ordered from surface to deep, and from feature-oriented to failure-oriented:

  • Q1 (purpose) is the question most documentation answers; it sets context for the rest.
  • Q2 (modifications) reveals the typical agent task shape — agents are more likely to be adding-a-field than designing-a-new-module.
  • Q3 (failure patterns) is the first tribal-knowledge question and the one that produces silent-wrong-output mitigations.
  • Q4 (dependencies) forces cross-module invariants into view; these rarely live in any one file.
  • Q5 (comment-buried knowledge) closes the loop by literally re-reading the code to find annotations engineers left behind.

The shape is deliberately failure-first, not feature-first — the agent needs to know what to avoid more than what's available.

How to run it

Each module analyst agent:

  1. Reads all files in the module + recent commit history.
  2. Answers each of the five questions in order.
  3. Emits structured output → feeds the writer agents.

Run in parallel across modules — Meta runs 11 module analyst agents simultaneously in one session.
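The read → answer → emit loop and the parallel fan-out could be sketched as follows. This assumes a hypothetical `ask_llm` call and invented module names; Meta's actual agent harness is not described at this level of detail.

```python
# Sketch of the module-analyst fan-out: one analyst per module, run in parallel.
# ask_llm is a placeholder for a real model call; module names are illustrative.
from concurrent.futures import ThreadPoolExecutor

QUESTIONS = [
    "What does this module configure?",
    "What are the common modification patterns?",
    "What are the non-obvious patterns that cause build failures?",
    "What are the cross-module dependencies?",
    "What tribal knowledge is buried in code comments?",
]

def ask_llm(prompt: str) -> str:
    # Stand-in for an actual LLM call.
    return f"[answer to: {prompt[:40]}...]"

def analyze_module(module: str) -> dict:
    """One analyst agent: read the module + commits, answer the five questions in order."""
    context = f"files and recent commit history of {module}"  # stand-in for real reads
    answers = {q: ask_llm(f"Given {context}: {q}") for q in QUESTIONS}
    return {"module": module, "answers": answers}  # structured output for the writers

modules = ["ingestion", "transforms", "publishing"]  # illustrative
with ThreadPoolExecutor(max_workers=11) as pool:     # Meta ran 11 analysts at once
    reports = list(pool.map(analyze_module, modules))
```

Threads (rather than processes) suffice here because each analyst is I/O-bound, waiting on model and filesystem calls.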

Output shape

Answers map 1:1 to sections in the downstream compass-not-encyclopedia context file:

| Question | Context-file section |
|----------|----------------------|
| Q1 + Q2 | Quick Commands |
| Q2 + Q4 | Key Files |
| Q3 + Q5 | Non-Obvious Patterns (the highest-value section) |
| Q4 | See Also |

This alignment is not accidental — the five-questions framework is designed to feed the four-section file format.

Contrast with documentation approaches

| Approach | Primary question | Output shape |
|----------|------------------|--------------|
| Divio framework (tutorials / how-to / reference / explanation) | "How do I teach this?" | Four doc categories |
| Javadoc / RustDoc / pydoc | "What does this method do?" | API-surface docs |
| README-driven development | "How would someone adopt this?" | Adoption-oriented narratives |
| Five-questions framework (this) | "What breaks if the agent doesn't know this?" | Failure-oriented navigation files |

The five-questions framework is distinctive in orienting around failure modes — questions 3 and 5 both target knowledge whose absence produces silent wrong output.

Tradeoffs

  • Lossy by design — the framework skips architecture / design-history / aesthetic axes. Suitable for agent context, not for learning the system as a human.
  • Requires mature code + commits + comments — on a newly written module, Q5 returns empty. The framework is extraction, not generation.
  • Q3 is the hardest question — reliably identifying "non-obvious patterns that cause build failures" requires either an analyst with deep context or a large-context model that has read adjacent failure postmortems.

Applicable beyond Meta's case

  • Onboarding docs for new engineers — same failure-first shape, but answered by humans.
  • Migration guides — the five questions applied to the migrating codebase produce the pre-migration invariant list.
  • Runbooks for operators — Q3 ("what breaks") is the runbook's core; Q1 + Q2 set context.

Meta's fifth apply-it-yourself step specifically names the framework as reusable: "Use the 'five questions' framework. Have agents (or engineers) answer: what does it do, how do you modify it, what breaks, what depends on it, and what's undocumented?"
