Envelope-and-verify¶
Pattern¶
For high-stakes agentic AI systems, the architecture Byron Cook prescribes is a three-part discipline:
- Specify the envelope. Write down — formally, often using concepts/temporal-logic-specification — the capabilities and restrictions that bound acceptable agent behavior. "The banking agent may never share information between consumer and investment wings." "A refund may only follow a verified authentication in the same session."
- Mechanically restrict the agent to the envelope. A runtime like systems/bedrock-agentcore enforces the envelope — the agent cannot take actions outside it, not merely should not. Capability restriction is code-level, not guidance-level.
- Reason about the composition of agent behaviors. Use concepts/automated-reasoning to prove global invariants over the system of agents operating inside their envelopes. "Across all agents in this deployment, no credential is ever logged" / "the total committed liability never exceeds the risk cap" / "every user request is eventually answered."
We need to specify the envelopes in which the agents can operate, use a system like Bedrock AgentCore to restrict the agents to those envelopes, and then reason about the composition of their behavior to ensure that bad things don't happen and good things eventually do happen. — Byron Cook (Source: sources/2026-02-17-allthingsdistributed-byron-cook-automated-reasoning-trust-ai)
Why envelopes (and not just post-hoc filtering)¶
patterns/post-inference-verification filters agent outputs against a spec. Envelope-and-verify is stricter and complementary: it constrains the action space the agent can reach at all.
- Post-inference catches a bad output before it leaves the system.
- Envelope prevents the agent from taking the bad action in the first place.
Most production-grade agentic systems want both: envelope as the primary capability boundary; post-inference as the backstop on anything the envelope permitted.
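The two gates can be sketched side by side. This is a minimal illustration, not any real runtime's API: the names `Envelope`, `EnvelopeViolation`, and `post_inference_filter` are all hypothetical.

```python
# Hypothetical sketch of the two complementary gates:
# - an envelope enforced BEFORE an action runs (capability boundary),
# - a post-inference filter applied AFTER, as the backstop.

class EnvelopeViolation(Exception):
    pass

class Envelope:
    """Capability bound: the set of tool calls the agent can reach at all."""
    def __init__(self, allowed_tools):
        self.allowed_tools = set(allowed_tools)

    def check(self, tool_name):
        # Code-level restriction: an out-of-envelope call is unreachable,
        # not merely discouraged by prompt guidance.
        if tool_name not in self.allowed_tools:
            raise EnvelopeViolation(f"{tool_name} is outside the envelope")

def post_inference_filter(output: str) -> bool:
    """Backstop: check an output the envelope permitted before it leaves."""
    return "credential" not in output.lower()

envelope = Envelope({"lookup_balance", "send_statement"})

envelope.check("lookup_balance")        # permitted: inside the envelope
try:
    envelope.check("execute_trade")     # blocked before it can ever run
except EnvelopeViolation as e:
    print(e)
```

The design point is ordering: the envelope check sits in front of the action, so a violating step never executes; the filter only sees outputs the envelope already allowed.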
Composition reasoning: the non-obvious part¶
Individual-agent envelopes are the easy bit. The hard part is reasoning about what happens when multiple agents operate in parallel and interact:
- Agents may be composed sequentially, in parallel, or hierarchically (supervisor-worker).
- An invariant the global system needs may not be derivable from any single agent's envelope — it's an emergent property of how the envelopes fit together.
- Properties of interest are almost always temporal — safety ("bad thing never happens") and liveness ("good thing eventually happens").
This is precisely where concepts/automated-reasoning does its best work: given envelope specs for each agent and a composition structure, prove (or produce a counterexample against) the global invariant.
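A toy sketch of that composition check, far simpler than real automated-reasoning tooling: two agents, each confined to its own envelope of permitted liability increments, composed in parallel. The numbers, names, and the global invariant are illustrative assumptions; the point is that exploring the joint state space either validates the invariant or surfaces a counterexample neither envelope rules out alone.

```python
# Toy composition check: exhaustively explore the product state space of two
# enveloped agents and test a GLOBAL safety invariant ("total committed
# liability never exceeds the risk cap") that no single envelope guarantees.
from collections import deque
from itertools import product

RISK_CAP = 100

# Each agent's envelope: the liability increments it may commit per step.
ENVELOPE_A = [0, 30]   # hypothetical bound on agent A
ENVELOPE_B = [0, 40]   # hypothetical bound on agent B

def reachable_states(max_steps=3):
    """BFS over joint states (liability_a, liability_b) within the envelopes."""
    seen = {(0, 0)}
    frontier = deque([((0, 0), 0)])
    while frontier:
        (a, b), depth = frontier.popleft()
        if depth == max_steps:
            continue
        for da, db in product(ENVELOPE_A, ENVELOPE_B):
            nxt = (a + da, b + db)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen

# Counterexamples to the global invariant, if any exist.
violations = sorted(s for s in reachable_states() if sum(s) > RISK_CAP)
print(violations[:3])
```

Here each agent individually stays within a per-agent bound, yet the composition reaches joint states above the cap: exactly the "emergent property of how the envelopes fit together" that composition reasoning exists to catch.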
The poetry vs banking stakes heuristic¶
Cook's framing: not every AI system needs envelope-and-verify. The cost is real (spec-authoring, runtime enforcement overhead, proof budget).
- Poetry generation: no envelope needed. Generate freely.
- City zoning chatbot: some envelope ("only answer about the permits I know"); post-inference verification helpful.
- Agentic banking / healthcare / regulatory: full envelope-and-verify mandatory.
Match the pattern to the stakes. "Every solution to a problem should be as simple as possible, but no simpler." (Einstein, via Cook.)
Specifying the envelope¶
Envelopes are usually temporal specs — see concepts/temporal-logic-specification. Common shapes:
- Capability bounds. The agent may call tools T1, T2, T3 but not T4. (Trivially static.)
- Temporal bounds. The agent may only call `execute_trade` after `verify_user` has succeeded in the same session. (Past-time temporal.)
- Information-flow bounds. Data from context A never flows to context B. (The consumer/investment-wing example.)
- Liveness bounds. Every user request must eventually produce a response. (Future-time temporal.)
- Epistemic bounds. The agent may only issue a refund when it knows the user is the account holder. (Knowledge logic.)
The spec language needs all of these. Cook predicts customers will discover and demand this menagerie as spec-driven tools mature.
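The past-time temporal bound above can be enforced by a small runtime monitor. This is a hand-written sketch under assumed event names (`verify_user`, `execute_trade`); real spec-driven systems would compile such monitors from the temporal-logic formula itself.

```python
# Minimal runtime monitor for a past-time temporal envelope bound:
# "execute_trade may only follow a successful verify_user in the same session."

class SessionMonitor:
    def __init__(self):
        self.verified = {}  # session_id -> has verify_user succeeded?

    def observe(self, session_id, event, ok=True):
        # Record the history the past-time formula quantifies over.
        if event == "verify_user" and ok:
            self.verified[session_id] = True

    def permits(self, session_id, action):
        # Past-time check: the session history must already contain
        # a successful verify_user before execute_trade is reachable.
        if action == "execute_trade":
            return self.verified.get(session_id, False)
        return True

m = SessionMonitor()
print(m.permits("s1", "execute_trade"))  # False: no verification yet
m.observe("s1", "verify_user", ok=True)
print(m.permits("s1", "execute_trade"))  # True: history now satisfies the bound
print(m.permits("s2", "execute_trade"))  # False: scoped per session
```

Capability bounds need no state at all; temporal bounds like this one need per-session history, which is why the enforcement substrate has to mediate every event, not just the final action.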
AgentCore as the enforcement substrate¶
systems/bedrock-agentcore is AWS's runtime for this pattern:
- Agents are deployed into AgentCore with their envelope spec.
- AgentCore mediates all tool calls, outputs, and state transitions against the envelope.
- Actions outside the envelope are rejected — the agent cannot reach them regardless of its prompt or reasoning trace.
The combination with systems/bedrock-guardrails-automated-reasoning-checks gives the full stack: AgentCore enforces the capability envelope; Guardrails reasoning checks verify output correctness against the spec.
Reasoning scales with care¶
The fundamental complexity floor of concepts/automated-reasoning (NP-complete → undecidable) still applies. Practical strategies:
- Coarse-grained guardrails over fine-grained per-action proofs. (Cook cites this as the S3 patterns/durability-review lesson — prefer broad mechanisms that kill whole classes of risk.)
- Distributed proof search (`mallobSAT`, distributed SMT).
- LLM-guided proof search when algorithmic search stalls.
Related patterns¶
- patterns/durability-review — same philosophical shape at S3: separate risks from countermeasures, enumerate, prefer coarse guardrails. Envelope-and-verify generalizes this to agent-behavior safety.
- patterns/executable-specification — ShardStore's same-language-spec approach; the development-time counterpart to runtime envelope enforcement.
- patterns/post-inference-verification — output-gating counterpart.
Seen in¶
- sources/2026-02-17-allthingsdistributed-byron-cook-automated-reasoning-trust-ai — Cook names the three-part discipline (specify envelope → AgentCore enforces → reason about composition) as the architecture for agentic banking and other high-stakes domains; anchors it with the poetry-vs-banking stakes heuristic.