Authorization decision caching¶
What it is¶
Authorization decision caching stores the Allow/Deny outcome of an authorization check for later reuse, so subsequent requests with the same inputs skip the policy-engine round trip. Most production fine-grained-authorization architectures depend on it: Amazon Verified Permissions (AVP), Cedar, or OPA evaluation is "millisecond-level" by itself, but authorization sits on the hot path of every API call, and sub-millisecond latency is only reachable via caching.
Typical two-level shape¶
The canonical shape, used by Convera:
- API-Gateway-level authorizer-decision cache. Given a token, the IAM policy returned by the Lambda authorizer is cached, keyed by the token (or token + route), for a configurable TTL. Subsequent requests from the same principal with the same token hit the cache: no authorizer invocation, no AVP call, no Cognito validation.
- Application-level token cache. The client app caches the Cognito JWT itself so repeat logins don't re-hit Cognito.
Together: "sub-millisecond response times while reducing operational costs and maintaining security controls." (Source: sources/2026-02-05-aws-convera-verified-permissions-fine-grained-authorization)
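The gateway-level half of this shape can be sketched as a TTL-bounded, token-keyed cache in front of a full evaluation call. A minimal sketch, assuming a `evaluate` callable standing in for the AVP round trip (names and structure are illustrative, not API Gateway's internals):

```python
import time


class DecisionCache:
    """Token-keyed authorization decision cache with a fixed TTL (sketch)."""

    def __init__(self, ttl_seconds: float, evaluate):
        self.ttl = ttl_seconds
        self.evaluate = evaluate  # fallthrough: full policy evaluation (e.g. an AVP call)
        self._entries = {}        # token -> (decision, expiry)

    def check(self, token: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(token)
        if entry is not None and entry[1] > now:
            return entry[0]                     # hit: no policy-engine round trip
        decision = self.evaluate(token)         # miss: full evaluation, then cache
        self._entries[token] = (decision, now + self.ttl)
        return decision
```

A second call with the same token inside the TTL returns the cached decision without invoking `evaluate` at all, which is exactly where the latency win comes from.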
The cache-key design¶
Cache key granularity determines what you're actually caching:
- Token-keyed. One entry per token. Coarse. Decision changes when the token changes. Used in Convera.
- (Token, route)-keyed. Per-endpoint decision. Finer. Needed when the same token might have different decisions on different endpoints and you want each evaluated independently.
- (Principal, resource, action)-keyed. Finest. Requires the authorizer to extract resource identifiers from the request URL or body before the lookup. Most invalidation work, most fidelity.
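The three granularities above are just different key compositions. A sketch, hashing the joined parts (an assumption for illustration, not the API Gateway key format) so raw tokens never become cache keys:

```python
import hashlib


def cache_key(*parts: str) -> str:
    """Join the chosen dimensions and hash them into a fixed-size key."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()


token = "example-jwt"  # hypothetical Cognito JWT string

coarse = cache_key(token)                                 # token-keyed
per_route = cache_key(token, "GET /payments")             # (token, route)-keyed
finest = cache_key("user:alice", "payment:42", "view")    # (principal, resource, action)-keyed
```

Each extra dimension shrinks the blast radius of a stale entry and raises the miss rate; the key constructor is where that trade-off is made explicit.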
Cache-invalidation becomes policy-propagation floor¶
A subtle but load-bearing property: the cache TTL becomes the minimum time between a policy change and enforcement. If the TTL is 5 minutes and infosec has just tightened a policy, it may take up to 5 minutes before every cached decision reflects the new policy.
This interacts with concepts/token-enrichment: enriched attributes are pinned for the token lifetime, and cached decisions are pinned for the cache TTL. Layered caches → layered staleness.
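The layered-staleness arithmetic can be made concrete. Assuming attributes are baked into the token at issuance and decisions are cached per token, the worst-case window between a change and enforcement is the sum of the two lifetimes (a sketch; the TTL values are examples, not Convera's settings):

```python
def worst_case_staleness(decision_ttl_s: int, token_ttl_s: int = 0) -> int:
    """Worst case: an attribute changes just after a token is issued, the stale
    token lives for token_ttl_s, and a decision evaluated at the last moment of
    the token's life is then cached for another decision_ttl_s."""
    return token_ttl_s + decision_ttl_s


# e.g. a 1-hour Cognito token plus a 5-minute authorizer cache:
worst_case_staleness(decision_ttl_s=300, token_ttl_s=3600)  # 3900 s = 65 min
```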
What must NOT be cached¶
- Decisions that depend on time-bound context. "Allow only during business hours" — the decision is wrong 10 minutes after the boundary. Either don't cache these, or include the time bucket in the cache key.
- Decisions that depend on live counters. "Allow only under the user's daily transaction limit" — stale cache under-reports usage.
- Decisions shortly after a known policy change. If the invalidation signal exists, flush on change; if not, accept TTL staleness.
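For the time-bound case above, one option is to fold a coarse time bucket into the cache key, so entries naturally stop matching at bucket boundaries. A sketch, with an illustrative 10-minute bucket:

```python
def time_bucketed_key(token: str, now_s: float, bucket_s: int = 600) -> str:
    """Key that changes every bucket_s seconds, bounding how long a
    time-dependent decision (e.g. business-hours-only) can be served stale."""
    return f"{token}|t{int(now_s // bucket_s)}"
```

The bucket size is the staleness bound at the rule's boundary: with 600-second buckets, a business-hours Allow can outlive the cutoff by at most 10 minutes.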
Caveats¶
- Security ≠ latency. The cache is a latency tool, not a security tool. A cache that caches "Allow" past a revocation is a security incident. TTLs must be tuned against the policy-change frequency and the severity of late propagation.
- Cache-miss behavior. On miss, fall through to full evaluation. Sizing a cache-miss spike (e.g., after a mass token refresh) is capacity planning for the authorizer.
- Multi-tenant isolation. Cache keys in a multi-tenant system must include the tenant dimension; a cross-tenant cache hit is a data-leakage bug.
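The tenant dimension is easiest to enforce by routing every key through one constructor that refuses to build a key without it (a sketch; the identifiers are hypothetical):

```python
def tenant_scoped_key(tenant_id: str, *parts: str) -> str:
    """Prefix every cache key with the tenant so an entry written for one
    tenant can never satisfy a lookup from another."""
    if not tenant_id:
        raise ValueError("cache keys must carry a tenant dimension")
    return ":".join((tenant_id, *parts))
```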
Seen in¶
- sources/2026-02-05-aws-convera-verified-permissions-fine-grained-authorization — Convera's two-level cache (API Gateway authorizer-decision + application-level Cognito token) delivers submillisecond latency over AVP's millisecond-level decisions.
Related¶
- concepts/fine-grained-authorization — the per-request overhead caching exists to absorb.
- concepts/token-enrichment — the complementary hot-path optimization; both push cost off the per-request path.
- systems/amazon-api-gateway — built-in authorizer-decision cache.
- patterns/lambda-authorizer — the compute the cache fronts.