CONCEPT Cited by 1 source
Confidence-tiered routing¶
Definition¶
Confidence-tiered routing is the discipline of slicing AI decision confidence into three (or more) bands, each with a distinct downstream treatment, rather than a single approve-vs-escalate threshold. Canonical shape, disclosed verbatim in the IBM + AWS KYC architecture:
| Confidence | Treatment |
|---|---|
| > 95 % | Automatic approval |
| 75 % – 95 % | Additional verification (more sub-agent calls, richer tool invocations) |
| < 75 % | Human review with comprehensive context |
(Source: sources/2026-04-23-aws-modernizing-kyc-with-aws-serverless-solutions-and-agentic-ai.)
Why three tiers, not two¶
A binary approve / escalate-to-human gate has two failure modes:
- Too-aggressive auto-approve lets medium-confidence cases through without either second-look verification or human judgment. Bad in regulated domains.
- Too-conservative escalation dumps borderline cases onto human reviewers who then spend most of their time on medium- quality decisions, diluting the compliance-specialist's attention budget and missing the truly hard cases.
The middle tier — "additional verification" — is the mechanism that keeps the medium-confidence band out of the human-review queue while refusing to ship them through un-validated. Practically, that means:
- Running a second specialised sub-agent the Supervisor initially skipped (e.g. the Fraud Detection sub-agent adds a behavioural- similarity check for a case the Document Analysis sub-agent flagged as borderline).
- Fetching additional Knowledge Base evidence with jurisdiction- specific context.
- Re-running the Identity Verification sub-agent against additional vendor APIs.
All still within the sub-5-minute budget because the additional verification itself runs parallel / async on AgentCore.
Relationship to existing patterns¶
Confidence-tiered routing is the concept that both patterns/confidence-thresholded-ai-output and patterns/low-confidence-to-human-review operationalise:
- Confidence-thresholded-AI-output is the binary case of this concept — silence-when-unsure, single threshold.
- Low-confidence-to-human-review also two-tier — routing only the bottom band to humans.
- Confidence-tiered routing generalises to N bands with N distinct downstream treatments, of which human review is one.
The 3-tier KYC split is the richest disclosed version on the wiki. Meta's RCA-system uses a 2-tier version (patterns/confidence-thresholded-ai-output); Instacart's catalog extraction uses a 2-tier version (patterns/low-confidence-to-human-review).
Design hazards¶
- Threshold calibration is ongoing. Thresholds are product policy, not model output — they have to be tuned against labelled ground-truth and the business cost of false-accept vs false-reject. Meta's confidence-thresholded-output post is explicit about this; the KYC post just asserts 95/75 without discussing calibration.
- Band drift. If the foundation model behind the sub-agent is upgraded, the meaning of its confidence score changes. Threshold-based routing becomes fragile if it's not re-validated on model rev.
- Context payload grows with band. The lower the confidence band, the more context the downstream consumer (another sub- agent, a human reviewer) needs — "comprehensive context" in the human-review case isn't free. That context packing is what systems/agentcore-memory is for in the KYC architecture.
Caveats¶
- No disclosure of threshold source. The KYC post asserts the
95 / 75–95 / <75 split as a design choice, with no stated calibration corpus, no per-jurisdiction breakdown, and no before/after regulatory-outcome data. Treat the numbers as reference-architecture rather than tuned production values.
- Confidence is composed, not atomic. In a five-sub-agent system, each sub-agent emits its own confidence; the Supervisor has to roll these up. The post doesn't disclose the composition rule (min? weighted average? lowest sub-agent wins?).
Seen in¶
- sources/2026-04-23-aws-modernizing-kyc-with-aws-serverless-solutions-and-agentic-ai — canonical >95 / 75-95 / <75 three-band disclosure.