
PATTERN Cited by 2 sources

Director / Expert / Critic investigation loop

Intent

Run a long-running investigative agent task (security alert triage, root-cause analysis, deep research, incident review) as a three-persona agent team with a round-based loop:

  • Director — planner + progressor. Forms questions, decides investigation phases, produces the final report.
  • N Experts — domain specialists. Each owns a distinct toolset / data-source set; produces findings in response to the Director's question.
  • Critic — meta-reviewer. Audits Expert findings against a rubric, credibility-scores them, condenses into a timeline, feeds the condensed view back to the Director.

The loop shape: Director asks → Experts answer → Critic reviews → Director receives condensed timeline → Director asks again.

Canonicalised by Slack's Security Engineering team as the core architecture of Slack Spear, their security-investigation agent service (Source: sources/2025-12-01-slack-streamlining-security-investigations-with-agents).

Loop diagram

        ┌───────────────────────────────────────────────┐
        │                                               │
        │         Director  (plans + progresses)        │
        │              │                   ▲            │
        │              │ question          │ condensed  │
        │              │                   │ timeline   │
        │              ▼                   │            │
        │      ┌─────────────────┐         │            │
        │      │    Expert A     │─findings┤            │
        │      │    Expert B     │─findings┤            │
        │      │    Expert C     │─findings┤            │
        │      │    Expert D     │─findings┤            │
        │      └─────────────────┘         │            │
        │              │                   │            │
        │              ▼                   │            │
        │         ┌─────────┐ scored       │            │
        │         │ Critic  │─findings─────┘            │
        │         └─────────┘ + timeline                │
        │                                               │
        └───────────────────────────────────────────────┘

Why three personas?

Director vs Expert: separate planning from execution

Single-persona agents that try to plan the investigation and execute individual queries simultaneously conflate two different cognitive loads in one prompt. Tool-selection errors correlate with tool-inventory growth; planning errors correlate with context crowding. Separating the personas gives each a small prompt and a focused task.

See patterns/specialized-agent-decomposition for the general specialisation argument.

Critic: the hallucination check

Without a Critic, the Director consumes Expert findings directly — including hallucinated tool calls, mis-read data, and plausible-but-wrong inferences. The Critic's job is to catch these before they shape the Director's next question.

Slack's canonical emergent-behaviour disclosure: the Expert missed a credential exposure in a process-ancestry chain; the Critic flagged it; the Director pivoted the investigation. "What is notable about this result is that the expert did not raise the credential exposure in its findings; the Critic noticed it as part of its meta-analysis of the expert's work." (Source: sources/2025-12-01-slack-streamlining-security-investigations-with-agents)

Why not just critic-fixer rounds?

Multi-round critic-fixer loops (as in patterns/multi-round-critic-quality-gate) have the Critic gate the Expert's output until it passes. The Director / Expert / Critic loop is higher-level: the Critic augments the Expert output (with credibility scores + analysis) rather than gating it, and the Director decides what to do with the augmented view — progress the investigation, pivot, or conclude. The third persona is the architectural reason this shape isn't just a drafter-evaluator retry loop.

Mechanism

1. Per-task model invocations (no mega-prompt)

Each persona's task is a separate model invocation with its own structured-output schema. Control flow lives in application code, not in prompt bullets. See patterns/one-model-invocation-per-task and concepts/prompt-is-not-control.
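Slack does not publish implementation code; the control-flow idea can be sketched as follows, with hypothetical names (`run_round`, `Finding`, and the stubbed expert/critic callables are all illustrative, not Slack's API):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Finding:
    expert: str
    text: str

@dataclass
class RoundResult:
    question: str
    findings: List[Finding]
    timeline: str

def run_round(question: str,
              experts: Dict[str, Callable[[str], str]],
              critic: Callable[[List[Finding]], str]) -> RoundResult:
    # In a real system each expert call and the critic call is one
    # model invocation with its own structured-output schema. The
    # sequencing lives here, in application code, not in prompt text.
    findings = [Finding(name, ask(question)) for name, ask in experts.items()]
    timeline = critic(findings)
    return RoundResult(question, findings, timeline)
```

The Director would call `run_round` once per round, feed the returned timeline into its next planning invocation, and decide whether to ask again, pivot, or conclude.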

2. Director's journaling tool

Slack discloses that the Director "uses a journaling tool for planning and organizing the investigation as it progresses." The journal is a first-class artifact the Director reads + writes across rounds, carrying planning state without re-deriving it from scratch each round.

3. Critic's condensation step

The Critic produces two outputs per round:

  • Per-finding annotations with credibility scores against a defined rubric.
  • A condensed investigation timeline assembled from the highest-credibility findings, merged with the running timeline.

Only the condensed timeline flows upward to the Director. This is the top half of the knowledge pyramid — high-tier cognition operating on pre-digested input, not raw data. See concepts/knowledge-pyramid-model-tiering.
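A sketch of the two-output shape, under stated assumptions (the rubric is abstracted into a `score` callable, and the credibility threshold is an invented parameter, not a disclosed Slack value):

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Annotation:
    finding: str
    credibility: float  # rubric score in [0, 1]; rubric itself assumed
    note: str

def critic_round(findings: List[str],
                 score: Callable[[str], Tuple[float, str]],
                 running_timeline: List[str],
                 threshold: float = 0.7) -> Tuple[List[Annotation], List[str]]:
    """Two outputs per round: per-finding annotations, and a condensed
    timeline merged from the high-credibility findings. Only the
    timeline is passed upward to the Director."""
    annotations = [Annotation(f, *score(f)) for f in findings]
    keep = [a.finding for a in annotations if a.credibility >= threshold]
    return annotations, running_timeline + keep
```

The annotations stay at the Critic tier (for audit and for the Expert feedback path); the Director only ever sees the merged timeline.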

4. Phase-gated progression

The Director decides phase transitions explicitly (discovery → trace → conclude) via a meta-phase invocation. Phase affects which Expert(s) are queried + what model parameters are used. See patterns/phase-gated-investigation-progression.
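A sketch of the gate, assuming the disclosed three phases; the routing table contents (expert names, parameters) are illustrative placeholders, not Slack's configuration:

```python
from enum import Enum

class Phase(Enum):
    DISCOVERY = "discovery"
    TRACE = "trace"
    CONCLUDE = "conclude"

# Hypothetical per-phase routing: which Experts get queried and with
# what model parameters. Values here are invented for illustration.
PHASE_CONFIG = {
    Phase.DISCOVERY: {"experts": ["access", "cloud", "code", "threat"], "temperature": 0.7},
    Phase.TRACE:     {"experts": ["access", "code"],                    "temperature": 0.3},
    Phase.CONCLUDE:  {"experts": [],                                    "temperature": 0.0},
}

def next_phase(current: Phase, director_decision: str) -> Phase:
    """The Director's meta-phase invocation returns an explicit decision;
    application code applies it as a one-way gated transition."""
    order = [Phase.DISCOVERY, Phase.TRACE, Phase.CONCLUDE]
    if director_decision == "advance" and current != Phase.CONCLUDE:
        return order[order.index(current) + 1]
    return current
```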

5. Model tiering (knowledge pyramid)

  • Experts: cheap models (tool-call-heavy, token-intensive but cognitively shallow work).
  • Critic: mid-tier models (reasoning-dense condensation).
  • Director: top-tier models (strategic decisions on already-condensed input).

See concepts/knowledge-pyramid-model-tiering.
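The tiering reduces to a static persona-to-tier binding in application code. A minimal sketch (tier labels are generic; Slack does not disclose model names):

```python
# Each persona binds to a cost tier matched to its cognitive load.
MODEL_TIERS = {
    "expert":   {"tier": "cheap", "reason": "tool-call-heavy, cognitively shallow"},
    "critic":   {"tier": "mid",   "reason": "reasoning-dense condensation"},
    "director": {"tier": "top",   "reason": "strategy on pre-condensed input"},
}

def model_tier_for(persona: str) -> str:
    return MODEL_TIERS[persona]["tier"]
```

Because the binding is explicit, a model upgrade at one tier is a one-line change, but (per the tradeoffs below) it can shift the tier balance and force re-tuning of the other two personas.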

When to reach for it

  • Task has a natural plan/execute/review decomposition. Security investigations, incident root-cause, compliance audits, deep research agents all fit.
  • Task requires cross-referencing multiple domains. The Expert layer is how you get "did this cross-reference between access logs and source code check out?" without cramming every data source into one prompt.
  • Hallucinations are costly. The Critic's weakly-adversarial stance is the architectural defence. If your task has low hallucination tolerance (security, legal, medical), the Critic pays for itself.
  • Supervision over collection is the goal. Slack's framing: "we're switching to supervising investigation teams, rather than doing the laborious work of gathering evidence." The Director/Critic structure produces an auditable narrative a human can supervise; bare tool-calling agents produce an event log a human has to reconstruct.

When not to reach for it

  • Task is single-domain. If there's only one data source, only one toolset, the Expert layer collapses to a single agent and the specialisation payoff disappears.
  • Task is short. A single-round Q+A doesn't need phase progression or a Critic; the orchestration overhead isn't earned.
  • Tolerance for hallucination is high. A brainstorming assistant doesn't need a Critic; the stakes don't justify the extra model call.
  • Tool inventory is small. If all tools fit comfortably in one prompt, specialisation is premature.

Composes with

Contrasts

  • vs. patterns/coordinator-sub-reviewer-orchestration (Cloudflare AI code review) — closest architectural sibling. Cloudflare consolidates middle+apex into a single coordinator agent with a judge pass inside it; Slack separates them into Critic + Director. The Slack shape is higher-level (Director decides what to do with the Critic's output); the Cloudflare shape is simpler (coordinator is both judge and writer).
  • vs. patterns/multi-round-critic-quality-gate (Meta tribal knowledge) — Meta's multi-round shape has writer-critic-fixer loops with rounds measured by scoring deltas. Slack's shape has rounds measured by investigation progression, not score delta. Meta's shape is artifact production; Slack's is investigation execution.
  • vs. patterns/drafter-evaluator-refinement-loop (Lyft localization) — Lyft pairs drafter + evaluator in a single retry loop. Slack adds the Director-above-evaluator third layer that handles progress, not just retry.
  • vs. patterns/planner-coder-verifier-router-loop — same three-layer shape at the code-task altitude. The Planner/Coder/Verifier mapping to Director/Expert/Critic is nearly direct, but Slack's loop operates on security data rather than code, and the Critic's scoring is credibility-weighted rather than pass/fail.

Tradeoffs

  • Three model tiers to manage — tier changes require coordinated re-tuning across all three personas; individual model upgrades can shift the tier balance.
  • Critic latency — the Critic runs between every Expert→Director hop, adding model-call latency on every round. Mitigated by running Critic on a mid-tier model.
  • Rubric maintenance — the Critic's rubric becomes its own artifact to maintain, test, and calibrate.
  • Director-Critic collusion risk — if Director and Critic run on the same model family, correlated blind spots persist. Cross-family is safer when feasible.

Seen in

  • systems/slack-spear — canonical first wiki instance. 1 Director + 4 Experts (Access, Cloud, Code, Threat) + 1 Critic. Director plans + progresses phases + writes final report; Experts produce findings from their domain tool surfaces; Critic scores + condenses + detects Expert blind spots. Canonical worked example: Critic caught a credential exposure the Expert missed, Director pivoted the investigation. (Source: sources/2025-12-01-slack-streamlining-security-investigations-with-agents) Second post specifies the context plumbing underneath the loop — the three channels (Director's Journal + Critic's Review + Critic's Timeline) that carry all inter-invocation state in place of raw message history. Critic's role specified as two separate tasks (Review + Timeline), with the Review using a four-tool introspection suite and the Timeline producing a narrative-coherence-scored chronology. Disclosed operational number: 170,000 reviewed findings with 25.8% sub-plausibility rate. (Source: sources/2026-04-13-slack-managing-context-in-long-run-agentic-applications)