PATTERN
Five-questions knowledge extraction¶
Intent¶
Extract the tribal knowledge an AI coding agent actually needs, module by module, by asking five targeted questions that force depth of extraction, rather than requesting "documentation" of the module and accepting a surface narrative.
The five questions¶
Meta's framework (Source: sources/2026-04-06-meta-how-meta-used-ai-to-map-tribal-knowledge-in-large-scale-data-pipelines):
| # | Question | What it extracts |
|---|---|---|
| 1 | What does this module configure? | Surface purpose; module-level comments + docs |
| 2 | What are the common modification patterns? | Recent-commit rhythm; what engineers actually change |
| 3 | What are the non-obvious patterns that cause build failures? | Pure tribal knowledge; failure-oriented |
| 4 | What are the cross-module dependencies? | Subsystem-coupling invariants |
| 5 | What tribal knowledge is buried in code comments? | Inline gotchas, "DO NOT REMOVE" markers, TODOs with blast-radius warnings |
Meta's explicit finding: Question 5 produced the deepest learnings — 50+ non-obvious patterns like hidden intermediate naming conventions and append-only deprecated-enum rules. "None of this had been written down before."
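The table above can be encoded as structured data so analyst agents iterate over it mechanically. This is an illustrative sketch, not Meta's implementation; the class and field names are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExtractionQuestion:
    number: int       # Q1..Q5, asked in order
    prompt: str       # the literal question posed to the analyst agent
    extracts: str     # what kind of knowledge the answer surfaces

FIVE_QUESTIONS = [
    ExtractionQuestion(1, "What does this module configure?",
                       "surface purpose"),
    ExtractionQuestion(2, "What are the common modification patterns?",
                       "recent-commit rhythm"),
    ExtractionQuestion(3, "What are the non-obvious patterns that cause build failures?",
                       "pure tribal knowledge"),
    ExtractionQuestion(4, "What are the cross-module dependencies?",
                       "subsystem-coupling invariants"),
    ExtractionQuestion(5, "What tribal knowledge is buried in code comments?",
                       "inline gotchas and blast-radius warnings"),
]
```

Keeping the questions as data rather than prose makes the surface-to-deep ordering explicit and lets the same list drive every module analyst.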
Why these five specifically¶
The questions are ordered from surface to deep, and from feature-oriented to failure-oriented:
- Q1 (purpose) is the question most documentation answers; it sets context for the rest.
- Q2 (modifications) reveals the typical agent task shape: agents are more likely to be adding-a-field than designing-a-new-module.
- Q3 (failure patterns) is the first tribal-knowledge question and the one that produces silent-wrong-output mitigations.
- Q4 (dependencies) forces cross-module invariants into view; these rarely live in any one file.
- Q5 (comment-buried knowledge) closes the loop by literally re-reading the code to find annotations engineers left behind.
The shape is deliberately failure-first, not feature-first — the agent needs to know what to avoid more than what's available.
How to run it¶
Each module analyst agent:
- Reads all files in the module + recent commit history.
- Answers each of the five questions in order.
- Emits structured output → feeds the writer agents.
Run in parallel across modules: Meta runs 11 module analyst agents simultaneously in one session.
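The fan-out above can be sketched as follows. The `read_sources` and `ask_agent` callables are assumptions standing in for the file/commit reader and the LLM call; the concurrency shape is illustrative, not Meta's actual orchestration code.

```python
from concurrent.futures import ThreadPoolExecutor

QUESTIONS = [
    "What does this module configure?",
    "What are the common modification patterns?",
    "What are the non-obvious patterns that cause build failures?",
    "What are the cross-module dependencies?",
    "What tribal knowledge is buried in code comments?",
]

def analyze_module(module_path, read_sources, ask_agent):
    """One module analyst: read files + recent history, answer Q1..Q5 in order."""
    context = read_sources(module_path)  # all module files + recent commit log
    return {
        f"Q{i}": ask_agent(f"{question}\n\n{context}")
        for i, question in enumerate(QUESTIONS, start=1)
    }

def analyze_all(modules, read_sources, ask_agent, max_agents=11):
    """Fan out one analyst per module; Meta runs 11 simultaneously."""
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        futures = {m: pool.submit(analyze_module, m, read_sources, ask_agent)
                   for m in modules}
        return {m: f.result() for m, f in futures.items()}
```

Each per-module result is the structured output that feeds the downstream writer agents.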
Output shape¶
Answers map 1:1 to sections in the downstream compass-not-encyclopedia context file:
| Question | Context-file section |
|---|---|
| Q1 + Q2 | Quick Commands |
| Q2 + Q4 | Key Files |
| Q3 + Q5 | Non-Obvious Patterns — the highest-value section |
| Q4 | See Also |
This alignment is not accidental — the five-questions framework is designed to feed the four-section file format.
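A minimal sketch of that 1:1 mapping, assembling the five answers into the four-section file. The section names come from the table above; the simple concatenation logic is an assumption for illustration.

```python
# Which answers feed which context-file section (from the mapping table).
SECTION_SOURCES = {
    "Quick Commands": ["Q1", "Q2"],
    "Key Files": ["Q2", "Q4"],
    "Non-Obvious Patterns": ["Q3", "Q5"],  # the highest-value section
    "See Also": ["Q4"],
}

def build_context_file(answers):
    """answers: dict mapping "Q1".."Q5" to the analyst's answer text."""
    parts = []
    for section, questions in SECTION_SOURCES.items():
        body = "\n".join(answers[q] for q in questions)
        parts.append(f"## {section}\n{body}")
    return "\n\n".join(parts)
```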
Contrast with documentation approaches¶
| Approach | Primary question | Output shape |
|---|---|---|
| Divio framework (tutorials / how-to / reference / explanation) | "How do I teach this?" | Four doc categories |
| Javadoc / RustDoc / pydoc | "What does this method do?" | API-surface docs |
| README-driven development | "How would someone adopt this?" | Adoption-oriented narratives |
| Five-questions framework (this) | "What breaks if the agent doesn't know this?" | Failure-oriented navigation files |
The five-questions framework is distinctive in orienting around failure modes — questions 3 and 5 both target knowledge whose absence produces silent wrong output.
Tradeoffs¶
- Lossy by design — the framework skips architecture / design-history / aesthetic axes. Suitable for agent context, not for learning the system as a human.
- Requires mature code + commits + comments — on a newly written module, Q5 returns empty. The framework is extraction, not generation.
- Q3 is the hardest question — reliably identifying "non-obvious patterns that cause build failures" requires either an analyst with deep context or a large-context model that has read adjacent failure postmortems.
Applicable beyond Meta's case¶
- Onboarding docs for new engineers — same failure-first shape, but answered by humans.
- Migration guides — the five questions applied to the migrating codebase produce the pre-migration invariant list.
- Runbooks for operators — Q3 ("what breaks") is the runbook's core; Q1 + Q2 set context.
Meta's fifth apply-it-yourself step specifically names the framework as reusable: "Use the 'five questions' framework. Have agents (or engineers) answer: what does it do, how do you modify it, what breaks, what depends on it, and what's undocumented?"
Seen in¶
- Meta AI Pre-Compute Engine (2026-04-06) — canonical wiki instance. 11 module analyst agents apply the five-questions framework to every module in a 4,100-file config-as-code data pipeline. 50+ non-obvious patterns surfaced, concentrated in Q5 answers. "None of this had been written down before." (Source: sources/2026-04-06-meta-how-meta-used-ai-to-map-tribal-knowledge-in-large-scale-data-pipelines.)
Related¶
- concepts/tribal-knowledge — what Q3 + Q5 target
- concepts/compass-not-encyclopedia — the output format these questions feed
- concepts/config-as-code-pipeline — the context where Q3 + Q4 are highest-yield
- patterns/precomputed-agent-context-files — the containing pattern
- patterns/multi-round-critic-quality-gate — the quality gate that validates the extracted content
- systems/meta-ai-precompute-engine — the canonical user