CONCEPT Cited by 1 source

Confidence-tiered routing

Definition

Confidence-tiered routing is the discipline of slicing AI decision confidence into three (or more) bands, each with a distinct downstream treatment, rather than applying a single approve-vs-escalate threshold. The canonical shape, disclosed verbatim in the IBM + AWS KYC architecture:

Confidence     | Treatment
> 95 %         | Automatic approval
75 % – 95 %    | Additional verification (more sub-agent calls, richer tool invocations)
< 75 %         | Human review with comprehensive context

(Source: sources/2026-04-23-aws-modernizing-kyc-with-aws-serverless-solutions-and-agentic-ai.)
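
As a minimal sketch, the three bands above can be written as a single routing function. The names `Treatment` and `route` are illustrative, not from the source, and the handling of scores exactly at 75 % or 95 % is an assumption: the source gives "> 95 %" and "< 75 %", so this sketch places both boundary values in the middle band.

```python
from enum import Enum


class Treatment(Enum):
    AUTO_APPROVE = "automatic approval"
    ADDITIONAL_VERIFICATION = "additional verification"
    HUMAN_REVIEW = "human review with comprehensive context"


# Thresholds from the KYC split, expressed as fractions.
AUTO_APPROVE_ABOVE = 0.95
HUMAN_REVIEW_BELOW = 0.75


def route(confidence: float) -> Treatment:
    """Map a confidence score in [0, 1] to one of the three treatments."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence > AUTO_APPROVE_ABOVE:
        return Treatment.AUTO_APPROVE
    if confidence < HUMAN_REVIEW_BELOW:
        return Treatment.HUMAN_REVIEW
    # Assumption: both boundary values (0.75 and 0.95) land in the middle band.
    return Treatment.ADDITIONAL_VERIFICATION
```

Rejecting out-of-range scores up front keeps a malformed confidence value from silently falling into a band.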

Why three tiers, not two

A binary approve / escalate-to-human gate has two failure modes:

  • Too-aggressive auto-approve lets medium-confidence cases through without either second-look verification or human judgment. Bad in regulated domains.
  • Too-conservative escalation dumps borderline cases onto human reviewers, who then spend most of their time on medium-quality decisions. That dilutes the compliance specialist's attention budget and causes the truly hard cases to be missed.

The middle tier, "additional verification", is the mechanism that keeps medium-confidence cases out of the human-review queue while refusing to ship them through unvalidated. Practically, that means:

  • Running a second specialised sub-agent the Supervisor initially skipped (e.g. the Fraud Detection sub-agent adds a behavioural-similarity check for a case the Document Analysis sub-agent flagged as borderline).
  • Fetching additional Knowledge Base evidence with jurisdiction-specific context.
  • Re-running the Identity Verification sub-agent against additional vendor APIs.

All of this still fits within the sub-5-minute budget because the additional verification itself runs in parallel / asynchronously on AgentCore.
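
The parallel, budget-bounded shape of the middle tier can be sketched with plain `asyncio`. This is an illustration only: `run_check` is a hypothetical stand-in for a sub-agent or vendor call, the scores and delays are invented, and nothing here uses the actual AgentCore API, which the source does not show.

```python
import asyncio


async def run_check(name: str, delay_s: float, score: float) -> tuple[str, float]:
    """Hypothetical stand-in for one extra verification call."""
    await asyncio.sleep(delay_s)  # simulates network / model latency
    return name, score


async def additional_verification(budget_s: float) -> dict[str, float]:
    """Run all extra checks concurrently; raise a timeout error if the budget is exceeded."""
    checks = [
        run_check("fraud_detection", 0.02, 0.91),
        run_check("knowledge_base_lookup", 0.03, 0.88),
        run_check("identity_vendor_b", 0.01, 0.97),
    ]
    # gather() runs the checks concurrently; wait_for() enforces the overall budget.
    results = await asyncio.wait_for(asyncio.gather(*checks), timeout=budget_s)
    return dict(results)


if __name__ == "__main__":
    # A real deployment would pass something like 300 s (the sub-5-minute budget).
    print(asyncio.run(additional_verification(budget_s=1.0)))
```

Because the checks run concurrently, total latency tracks the slowest check rather than the sum, which is what lets extra verification fit inside a fixed budget.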

Relationship to existing patterns

Confidence-tiered routing is the concept that both patterns/confidence-thresholded-ai-output and patterns/low-confidence-to-human-review operationalise:

  • Confidence-thresholded-AI-output is the binary case of this concept: silence-when-unsure, single threshold.
  • Low-confidence-to-human-review is also two-tier, routing only the bottom band to humans.
  • Confidence-tiered routing generalises to N bands with N distinct downstream treatments, of which human review is one.

The 3-tier KYC split is the richest disclosed version on the wiki. Meta's RCA-system uses a 2-tier version (patterns/confidence-thresholded-ai-output); Instacart's catalog extraction uses a 2-tier version (patterns/low-confidence-to-human-review).
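
The generalisation from two tiers to N can be sketched as one helper that takes N-1 ascending cutoffs and N treatments. `make_router` and the treatment strings are hypothetical names. The sketch assigns a score that equals a cutoff to the band above it, so a score of exactly 0.95 is auto-approved here, whereas the KYC table requires "> 95 %"; the source does not say how boundary values are handled.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def make_router(cutoffs: Sequence[float], treatments: Sequence[str]) -> Callable[[float], str]:
    """Build a router for N bands from N-1 ascending cutoffs.

    A score equal to a cutoff falls into the band above it.
    """
    if len(treatments) != len(cutoffs) + 1:
        raise ValueError("need exactly one more treatment than cutoffs")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cutoffs must be ascending")

    def route(confidence: float) -> str:
        # bisect_right counts how many cutoffs are <= confidence.
        return treatments[bisect_right(cutoffs, confidence)]

    return route


# Two-tier: the binary special case with a single threshold.
binary = make_router([0.75], ["human_review", "auto_approve"])

# Three-tier: a KYC-style split.
three_tier = make_router(
    [0.75, 0.95],
    ["human_review", "additional_verification", "auto_approve"],
)
```

Seen this way, both two-tier patterns are the same helper with one cutoff; they differ only in what the two treatments are.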

Design hazards

  • Threshold calibration is ongoing. Thresholds are product policy, not model output — they have to be tuned against labelled ground-truth and the business cost of false-accept vs false-reject. Meta's confidence-thresholded-output post is explicit about this; the KYC post just asserts 95/75 without discussing calibration.
  • Band drift. If the foundation model behind a sub-agent is upgraded, the meaning of its confidence score can change. Threshold-based routing becomes fragile unless it is re-validated on every model revision.
  • Context payload grows as the band drops. The lower the confidence band, the more context the downstream consumer (another sub-agent, a human reviewer) needs; "comprehensive context" in the human-review case isn't free. That context packing is what systems/agentcore-memory is for in the KYC architecture.
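
To make the calibration point concrete, here is a minimal sketch of tuning the auto-approve threshold against labelled ground truth. The data, the function names, and the tolerated false-accept rate are all invented for illustration; neither cited post describes its calibration procedure at this level. Re-running the same check after a model upgrade is one way to catch band drift.

```python
from typing import Iterable, Optional, Sequence, Tuple

Labelled = Sequence[Tuple[float, bool]]  # (confidence, was the case actually OK?)


def false_accept_rate(labelled: Labelled, threshold: float) -> float:
    """Share of auto-approved cases (score > threshold) that were actually bad."""
    approved = [ok for score, ok in labelled if score > threshold]
    if not approved:
        return 0.0  # nothing approved, so nothing falsely approved
    return sum(1 for ok in approved if not ok) / len(approved)


def pick_auto_approve_threshold(
    labelled: Labelled, candidates: Iterable[float], max_far: float
) -> Optional[float]:
    """Lowest candidate threshold whose false-accept rate is within max_far."""
    for threshold in sorted(candidates):
        if false_accept_rate(labelled, threshold) <= max_far:
            return threshold
    return None  # no candidate is safe enough


# Invented ground-truth sample: (model confidence, human-verified outcome).
LABELLED = [
    (0.99, True), (0.97, True), (0.96, True), (0.93, False),
    (0.90, True), (0.85, True), (0.80, False), (0.70, False),
]
```

With this sample, a strict false-accept tolerance pushes the threshold up, while a looser tolerance admits a lower threshold and a larger auto-approved share. That trade-off is the business decision the thresholds encode.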

Caveats

  • No disclosure of threshold source. The KYC post asserts the >95 / 75–95 / <75 split as a design choice, with no stated calibration corpus, no per-jurisdiction breakdown, and no before/after regulatory-outcome data. Treat the numbers as reference-architecture rather than tuned production values.

  • Confidence is composed, not atomic. In a five-sub-agent system, each sub-agent emits its own confidence; the Supervisor has to roll these up. The post doesn't disclose the composition rule (min? weighted average? lowest sub-agent wins?).
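
Why the undisclosed composition rule matters can be shown with a small sketch. Both rules below are candidate options, not the post's actual method, and the scores and equal weights are invented. The same sub-agent outputs land in different bands depending on the rule.

```python
from typing import Dict


def compose_min(scores: Dict[str, float]) -> float:
    """Lowest sub-agent wins: one doubtful sub-agent drags the case down."""
    return min(scores.values())


def compose_weighted(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of sub-agent scores."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


def band(confidence: float) -> str:
    """Three-band split; boundary handling is an assumption."""
    if confidence > 0.95:
        return "auto_approve"
    if confidence < 0.75:
        return "human_review"
    return "additional_verification"


# Invented scores for three of the sub-agents.
SCORES = {
    "document_analysis": 0.98,
    "identity_verification": 0.97,
    "fraud_detection": 0.72,
}
EQUAL_WEIGHTS = {name: 1.0 for name in SCORES}

min_band = band(compose_min(SCORES))                      # driven by 0.72
avg_band = band(compose_weighted(SCORES, EQUAL_WEIGHTS))  # driven by the mean, about 0.89
```

Here the minimum rule sends the case to human review, while the equal-weight average keeps it in additional verification, so the 95/75 thresholds cannot be interpreted without knowing the roll-up rule.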

Seen in

Last updated · 476 distilled / 1,218 read