
CONCEPT Cited by 1 source

Per-transaction human approval for agent spend

Definition

Per-transaction human approval for agent spend is a safety posture in which an AI agent cannot complete a payment without the human user explicitly approving that specific spend before the credential (card, token, or other payment primitive) is released to the agent. Approval is a hard gate, not a notification layer: without approval, no credential is issued and no charge can occur.

The posture was canonicalised by Stripe's 2026-04-29 launch of Link's wallet for agents:

"Your agent provides context on the transaction, so the person can understand and approve the request. In both card and machine-native flows, the payment credential can be scoped with controls like amount, currency, and merchant. … Today, each request requires the person's review before the credential is shared with your agent."

Properties

  1. Default-on. No opt-in required; the approval step is the launch default for every spend.
  2. Hard gate, not observability. Approval precedes credential issuance. Contrast with post-hoc notification patterns where the charge completes first and the user sees it later. Here, a non-response simply leaves the agent without a credential.
  3. Per-transaction, not per-session. Even within an active agent session, each separate spend requires its own approval. The user cannot approve "everything in this session" at launch.
  4. Structured context payload. The approval surface shows the agent-provided context string plus machine-readable fields (merchant name, URL, amount), so the human has enough information to approve without a round trip to the agent.
  5. Asynchronous approval surface. Approvals can be delivered via web (link.com) or the Link iOS / Android apps — not only in the agent session itself. The agent blocks until approval or rejection.
  6. Scope fields bound into the eventually issued credential. Approval and credential issuance are tied together: approving "$35 at Powdur" issues a card or SPT scoped to that specific transaction, and the credential cannot be used outside that scope even if the agent somehow retains it.
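
The properties above can be sketched as a small in-memory gate. This is an illustrative model only, not Stripe's or Link's API: `ApprovalGate`, `SpendRequest`, `ScopedCredential`, and every method name are hypothetical, and the merchant URL is a placeholder. It shows the one invariant that matters: no credential object exists until a human decision of "approved" has been recorded for that specific request.

```python
import secrets
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass(frozen=True)
class SpendRequest:
    merchant: str
    merchant_url: str
    amount_minor: int  # amount in minor units, e.g. cents
    currency: str
    context: str       # agent-supplied explanation shown to the human


@dataclass(frozen=True)
class ScopedCredential:
    token: str
    merchant: str
    amount_minor: int
    currency: str


class ApprovalGate:
    """Hypothetical hard gate: one human decision per spend request."""

    def __init__(self) -> None:
        self._requests: dict[str, SpendRequest] = {}
        self._status: dict[str, Status] = {}
        self._issued: dict[str, ScopedCredential] = {}

    def submit(self, request: SpendRequest) -> str:
        # Each spend gets its own request id; there is no session-wide approval.
        request_id = secrets.token_hex(8)
        self._requests[request_id] = request
        self._status[request_id] = Status.PENDING
        return request_id

    def decide(self, request_id: str, approved: bool) -> None:
        # Called from the human's approval surface, never by the agent.
        if self._status[request_id] is not Status.PENDING:
            raise ValueError("request already decided")
        self._status[request_id] = Status.APPROVED if approved else Status.REJECTED

    def credential_for(self, request_id: str) -> Optional[ScopedCredential]:
        # Pending or rejected requests yield nothing: the gate precedes issuance.
        if self._status[request_id] is not Status.APPROVED:
            return None
        if request_id not in self._issued:
            req = self._requests[request_id]
            # The credential copies its scope from the approved request.
            self._issued[request_id] = ScopedCredential(
                secrets.token_urlsafe(16), req.merchant, req.amount_minor, req.currency
            )
        return self._issued[request_id]


gate = ApprovalGate()
rid = gate.submit(
    SpendRequest("Powdur", "https://powdur.example", 3500, "usd", "Ski wax refill")
)
before = gate.credential_for(rid)  # None while the request is pending
gate.decide(rid, approved=True)
after = gate.credential_for(rid)   # scoped to Powdur, 3500 minor units, usd
```

A real system would also need request expiry and authenticated decisions; those are left out because the source does not describe them.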

Composition with scoped-credential enforcement

Per-transaction approval is one of two defence layers. The credential scope is the second:

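| Layer                    | Enforces                       | Prevents                         |
|--------------------------|--------------------------------|----------------------------------|
| Per-transaction approval | "This spend is ok with me"     | Unauthorised transactions        |
| Credential scope         | "Amount / currency / merchant" | Misuse of an approved credential |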

Both layers together make agent autonomy structurally bounded rather than trust-based — an agent cannot spend outside what the human approved and cannot re-use the issued credential at a different merchant or for a different amount.
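
The second layer can be illustrated with a minimal scope check run at charge time. This is a sketch under stated assumptions, not the enforcement logic of any real payment network: `CredentialScope`, `ChargeAttempt`, and `scope_violations` are hypothetical names, and exact-amount matching is an assumption taken from the "different amount" wording above (a real implementation might instead treat the approved amount as a ceiling).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CredentialScope:
    merchant: str
    amount_minor: int  # approved amount in minor units, e.g. cents
    currency: str


@dataclass(frozen=True)
class ChargeAttempt:
    merchant: str
    amount_minor: int
    currency: str


def scope_violations(scope: CredentialScope, attempt: ChargeAttempt) -> list[str]:
    """Return the scope fields the attempt violates; an empty list means in scope."""
    violations = []
    if attempt.merchant != scope.merchant:
        violations.append("merchant")
    if attempt.currency.lower() != scope.currency.lower():
        violations.append("currency")
    # Assumption: the amount must match exactly, not merely stay under a limit.
    if attempt.amount_minor != scope.amount_minor:
        violations.append("amount")
    return violations


approved = CredentialScope("Powdur", 3500, "usd")
in_scope = scope_violations(approved, ChargeAttempt("Powdur", 3500, "usd"))
wrong_merchant = scope_violations(approved, ChargeAttempt("OtherShop", 3500, "usd"))
wrong_amount = scope_violations(approved, ChargeAttempt("Powdur", 5000, "usd"))
```

Returning the list of violated fields, rather than a bare boolean, makes a declined charge explainable in logs without revealing anything about the credential itself.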

Trade-offs

Upsides

  • Structurally bounded blast radius per incident. A hallucinating or prompt-injected agent can request a spend, but cannot execute one without human approval.
  • User trust bootstrap. Consumers can enable agents without first auditing the agent's behavior. Every spend is human-sighted.
  • Regulatory alignment. Matches existing strong customer authentication (SCA) flows, in which the cardholder confirms card-not-present transactions.

Downsides

  • Approval latency tax. Every transaction requires a human in the loop; the agent's decision cadence is bounded by human response time.
  • Approval fatigue. If an agent makes many small purchases, the consumer experiences notification overload — particularly acute for use cases with many per-task spends.
  • Undermines low-friction agent autonomy. A goal of agentic commerce is to "let agents make timely, relevant recommendations" with fast close; per-transaction approval re-introduces the checkout friction the protocol aims to remove.
  • Single-user assumption. Approval is a single-human action; shared-account / family-account / corporate-card semantics (who approves?) are unresolved at launch.

Stated roadmap (2026-04-29)

Stripe explicitly names the mitigation axis: "We're planning on expanding these controls to let people set spending limits, and choose when agents can act without additional approval." The evolution:

  • Launch — every spend requires approval.
  • Phase 2 — cap-based auto-approval: spends under a user-set threshold proceed without per-transaction approval; spends above the threshold retain the hard gate.
  • Phase 3 (implied) — autonomy tiers: some agents / merchants / contexts are pre-approved for unattended operation within a cap.

The Phase 2 shape is the agent payment budget cap primitive applied at the consumer-wallet altitude rather than at the agent-provisioning altitude, where Stripe Projects already ships with a default cap of $100 per month per provider.
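
The launch and Phase 2 behaviours differ only in one policy decision, which can be sketched as below. This is a hypothetical model of the stated roadmap, not a shipped API: `ApprovalPolicy` and `requires_human_approval` are invented names, and treating a spend exactly equal to the threshold as still requiring approval is a conservative assumption, since the roadmap only speaks of spends "under" and "above" it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApprovalPolicy:
    # None reproduces launch behaviour: every spend needs approval.
    auto_approve_below_minor: Optional[int] = None


def requires_human_approval(policy: ApprovalPolicy, amount_minor: int) -> bool:
    """Decide whether a spend must pass the per-transaction hard gate."""
    if amount_minor <= 0:
        raise ValueError("amount must be positive")
    if policy.auto_approve_below_minor is None:
        return True
    # Assumption: a spend equal to the threshold still goes to the human.
    return amount_minor >= policy.auto_approve_below_minor


launch = ApprovalPolicy()
phase_two = ApprovalPolicy(auto_approve_below_minor=2000)  # user-set threshold

launch_small = requires_human_approval(launch, 500)        # True
phase_two_small = requires_human_approval(phase_two, 500)  # False
phase_two_large = requires_human_approval(phase_two, 3500) # True
```

Making "no threshold" the default keeps the launch posture as the fallback: a user who never configures anything stays on the hard gate.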

Distinction from sibling patterns

  • Budget caps (concepts/agent-payment-budget-cap). Caps bound the aggregate agent spend over a period; per-transaction approval bounds the individual spend decision. Both can co-exist: a cap with per-transaction approval for over-cap spends is the likely Phase-2 shape.
  • Post-hoc transaction alerts. Credit-card fraud alerts notify the user after a charge; per-transaction approval blocks before the credential exists. The timing is reversed, and so is the blast radius: an unapproved spend never becomes a charge that has to be disputed afterwards.
  • SCA (3DS). Card-network strong customer authentication requires the cardholder to confirm the transaction — but at the card-network-authorisation altitude, not the credential-issuance altitude. 3DS confirms a card the merchant has already been told about; per-transaction approval decides whether the merchant ever sees a card at all.

Distinction from one-sided human-in-the-loop patterns

The wiki's pre-existing patterns/human-in-the-loop-quality-sampling describes human review of model outputs for quality. This concept is different: the human reviews an agent's declared intent to spend, not an output the agent generated. The information the human reviews (merchant, amount, context) is agent-supplied and must be treated as potentially adversarial (prompt injection in the context field, URL spoofing in the merchant field).
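
One way an approval surface might treat agent-supplied fields as untrusted is sketched below. Nothing here reflects how Link or Stripe actually handles these fields, which is not disclosed: `sanitize_context`, `is_expected_merchant_url`, the 280-character limit, and the idea of an independently known `expected_domain` for the merchant are all assumptions made for illustration.

```python
import unicodedata
from urllib.parse import urlsplit


def sanitize_context(text: str, max_len: int = 280) -> str:
    """Neutralise control and format characters, collapse whitespace, truncate."""
    # Unicode categories starting with "C" cover control characters and
    # invisible format characters such as bidirectional overrides.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch for ch in text
    )
    return " ".join(cleaned.split())[:max_len]


def is_expected_merchant_url(url: str, expected_domain: str) -> bool:
    """Accept only https URLs whose host is the expected domain or a subdomain."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    host = (parts.hostname or "").lower()
    expected = expected_domain.lower()
    return parts.scheme == "https" and (
        host == expected or host.endswith("." + expected)
    )


shown = sanitize_context("Ski wax refill\n\u202eIGNORE PREVIOUS TEXT")
genuine = is_expected_merchant_url("https://powdur.example/checkout", "powdur.example")
lookalike = is_expected_merchant_url("https://powdur.example.evil.test/", "powdur.example")
userinfo_trick = is_expected_merchant_url("https://powdur.example@evil.test/", "powdur.example")
```

Sanitising makes the string safe to display but does not make its claims true; the human still has to judge whether the stated purpose is plausible for the merchant and amount shown.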

What's not disclosed

  • Default TTL for unanswered approvals — how long an approval request stays pending before auto-rejecting.
  • Multi-channel delivery semantics — if the user is signed in on multiple devices, where is the approval delivered? Is it racing across surfaces?
  • Merchant-URL validation — how the approval UI protects against an agent sending a legitimate-looking merchant-url that's actually malicious.
  • context-field adversarial handling — whether the approval UI sanitises or flags suspicious context strings (e.g. LLM-injection attempts embedded in the human-readable field).
  • Approval revocability — can a consumer rescind an approval after it's granted but before the credential is used?

Seen in
