AWS 2026-02-05 Tier 1

AWS: How Convera built fine-grained API authorization with Amazon Verified Permissions

Summary

AWS Architecture Blog post by the Amazon Verified Permissions and Convera teams on Convera's adoption of Amazon Verified Permissions (AVP, the managed Cedar policy engine) as the substrate for fine-grained API authorization across four usage modes at a cross-border payments platform handling billions in annual volume. The core architectural move is a Lambda authorizer in front of API Gateway that evaluates Cedar policies against an Amazon Cognito-issued JWT whose claims have been enriched at issue time by a pre-token-generation Lambda hook; the resulting decisions are cached at the API Gateway layer, which is where the reported submillisecond authorization latency comes from. The same architecture is reused across (1) end-customer UI + API access, (2) internal customer-service apps federated from Okta, (3) machine-to-machine service calls, and (4) multi-tenant SaaS isolation via a per-tenant policy store pattern, where the authorizer looks up a tenant's policy-store-id from a DynamoDB mapping and the tenant_id is propagated to backend Kubernetes pods and RDS for second-pass zero-trust re-verification at the data layer. Policy governance is owned by Convera's infosec team via a regulated IAM role; Cedar policy changes flow through a DynamoDB Streams pipeline that continuously syncs changes into AVP. A two-level cache (API Gateway authorizer-decision cache plus app-level Cognito-token cache) delivers submillisecond response times while reducing operational cost. Reported outcomes: thousands of authorization requests per second with submillisecond latency and a ~60% reduction in time spent on access-management tasks.
Marketing-leaning AWS Architecture Blog format — architectural signal is dense (the policy-store-per-tenant tradeoff enumeration, the zero-trust re-verification step, the DynamoDB-Streams policy sync, the ID-vs-access-token separation, the multi-mode reuse of one authorizer shape) but with no latency distribution, no cost baseline, no Cedar policy volume, no incident postmortem, and no discussion of policy-store resource-quota limits.

Key takeaways

  1. One Lambda-authorizer-in-front-of-API-Gateway shape serves four usage modes without architectural divergence — customer UI + API access, internal customer-service apps, service-to-service machine-to-machine, and multi-tenant SaaS isolation all share the same JWT validation + Cedar policy evaluation flow, differing only in Cognito pool (user vs m2m vs tenant-specific), claim enrichment source, and whether policy-store lookup is static or tenant-indexed. Architectural reuse is explicitly named: "Convera didn't need to rebuild their authorization infrastructure, so they can use the same performance optimizations, security controls, and operational processes across both customer and internal user access patterns." (Source: sources/2026-02-05-aws-convera-verified-permissions-fine-grained-authorization)

  2. The pre-token-generation Lambda hook pushes authorization-relevant attributes into the JWT at issue time so the authorizer never needs a second round-trip. A pre-token-generation hook in Cognito queries RDS (customer flow) or DynamoDB (internal flow) or a user-tenant-mapping table (multi-tenant flow) and mints an enriched token carrying roles / attributes / tenant_id. The Lambda authorizer then evaluates Cedar policies against token claims only — no auxiliary lookup on the hot path.

  3. Multi-tenant isolation uses a per-tenant policy store (not tenant-scoped policies in one shared store). Convera explicitly chose patterns/per-tenant-policy-store for four reasons: low-effort per-tenant policy isolation, per-tenant schema + template customization, low-effort onboarding/offboarding, and per-tenant policy-store resource quotas. The authorizer looks up the tenant_id → policy-store-id mapping from DynamoDB per request.

  4. The backend services re-verify authorization against Verified Permissions even after API-Gateway passes the request — explicit zero-trust re-verification at the data layer. The article states: "Backend services receive the tenant_id and validate with Verified Permissions again (for zero-trust policy), creates a tenant context, and forwards to Amazon RDS. Amazon RDS is configured to accept only requests with specific tenant context and returns data specific to the requested tenant_id." Second-pass auth prevents a compromised upstream from leaking across tenants.

  5. Policy governance flows through a DynamoDB Streams pipeline. Cedar policies are stored as data, changes captured by DynamoDB Streams and continuously synced into Verified Permissions. Policy authorship is gated by a strictly-regulated IAM role owned by Convera's infosec team. This separates policy-editing authority (IAM-bounded, infosec-owned) from the read path (authorizer-bounded, app-owned), implementing concepts/policy-as-data with a clear write-side vs read-side split.

  6. Submillisecond latency comes from a two-level cache, not from AVP alone. API Gateway caches the authorizer's IAM decision per token + route, and the application caches Cognito tokens. AVP provides the decision; the two caches provide the latency. The article is explicit: "This multi-level caching approach successfully delivers sub-millisecond response times while reducing operational costs and maintaining security controls."

  7. The UI-level policy MUST be re-evaluated at the API level. The article calls out a subtle but load-bearing correctness property: the same Cedar policy that gates showing a "Transfer" button in the UI must also gate the actual transfer API call — otherwise a malicious client can skip the UI check. UI evaluation is a UX primitive; API evaluation is the enforcement. Same policy, two call sites, one source of truth.

  8. ID token vs access token separation matters for internal-IdP-federated users. In the internal customer-service flow, users authenticate through Okta via the Convera Connect app, and Cognito issues both an ID token and an access token. Cognito's pre-token-generation hook customizes the access token with attributes from DynamoDB. Only the access token is evaluated by the authorizer: the identity assertion (ID token) is federated through Okta, while the authorization grant (access token) is issued by Cognito.

  9. Machine-to-machine reuse is a direct generalization, not a new architecture. Service-to-service auth uses Cognito's OAuth client credentials flow: each client service is registered in Convera's client-config system, AVP creates a dedicated per-service policy store, services obtain access tokens carrying service identifier / tier / allowed operations / rate limits. Same Lambda authorizer, same AVP evaluation, same decision cache — the "principal" is now a service identity, the context is "service identity + requested operation + resource context + environmental factors."
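Takeaway 5 (policies stored as data, synced into AVP through DynamoDB Streams) can be sketched as a stream-triggered Lambda. This is a minimal sketch, not Convera's pipeline: the item attributes `policy_store_id`, `statement`, and `avp_policy_id` are hypothetical, and it assumes the stream is configured to carry both new and old images. The AVP calls used are `create_policy`, `update_policy`, and `delete_policy` from the boto3 `verifiedpermissions` client.

```python
from typing import Any


def plan_sync(record: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Translate one DynamoDB Streams record into (AVP method name, kwargs)."""
    change = record["dynamodb"]
    if record["eventName"] == "REMOVE":
        old = change["OldImage"]  # only present if the stream view includes old images
        return "delete_policy", {
            "policyStoreId": old["policy_store_id"]["S"],  # hypothetical attribute
            "policyId": old["avp_policy_id"]["S"],         # hypothetical attribute
        }
    new = change["NewImage"]
    kwargs: dict[str, Any] = {
        "policyStoreId": new["policy_store_id"]["S"],
        "definition": {"static": {"statement": new["statement"]["S"]}},
    }
    if record["eventName"] == "INSERT":
        return "create_policy", kwargs
    kwargs["policyId"] = new["avp_policy_id"]["S"]  # MODIFY targets an existing policy
    return "update_policy", kwargs


def handler(event: dict[str, Any], context: Any = None, avp: Any = None) -> dict[str, int]:
    """Lambda entry point; `avp` is injectable so the logic can run without AWS."""
    if avp is None:
        import boto3  # imported lazily so the sketch runs without boto3 installed
        avp = boto3.client("verifiedpermissions")
    for record in event["Records"]:
        method, kwargs = plan_sync(record)
        getattr(avp, method)(**kwargs)
    return {"synced": len(event["Records"])}
```

The sketch omits writing the AVP-generated policy ID back to the table after `create_policy`, and it does not handle AVP's restrictions on which parts of a static policy `update_policy` may change; a real pipeline would need both, plus retry and dead-letter handling.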

Architecture: the four flows

(1) Customer UI + API authorization

Client → Cognito (login)
       → Cognito pre-token-generation Lambda → RDS (fetch roles)
       → Enriched JWT (with roles) → Client
       → API request w/ JWT → API Gateway
       → Lambda authorizer (validate JWT, extract claims)
       → AVP (evaluate Cedar policies)
       → Lambda authorizer generates IAM Allow/Deny
       → API Gateway caches IAM policy, forwards or rejects
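
The last three steps of this flow (validate the JWT, get a decision, return an IAM Allow/Deny) can be sketched as an API Gateway TOKEN authorizer. This is an illustrative sketch under assumptions: `verify_jwt` and `is_allowed` are hypothetical injected callables standing in for signature verification against the Cognito JWKS and for the AVP call, and denying on any error is a choice made here, since the post does not describe failure behaviour.

```python
from typing import Any, Callable

Claims = dict[str, Any]


def iam_policy(principal_id: str, effect: str, method_arn: str) -> dict[str, Any]:
    """Build the response shape API Gateway expects from a Lambda authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,  # "Allow" or "Deny"
                "Resource": method_arn,
            }],
        },
    }


def authorize(
    event: dict[str, Any],
    verify_jwt: Callable[[str], Claims],        # hypothetical: raises on an invalid token
    is_allowed: Callable[[Claims, str], bool],  # hypothetical: wraps the AVP decision
) -> dict[str, Any]:
    method_arn = event["methodArn"]
    token = event.get("authorizationToken", "").removeprefix("Bearer ").strip()
    try:
        claims = verify_jwt(token)
        effect = "Allow" if is_allowed(claims, method_arn) else "Deny"
        principal_id = str(claims.get("sub", "unknown"))
    except Exception:
        # Assumption: fail closed when validation or the policy engine errors out.
        effect, principal_id = "Deny", "unknown"
    return iam_policy(principal_id, effect, method_arn)
```

API Gateway can then cache the returned policy document for the authorizer's configured TTL, which is the first of the two cache levels.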

Two-level cache: API Gateway authorization-decision cache + app-level Cognito-token cache.
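
The post does not say how the app-level Cognito-token cache is built. One plausible reading, sketched below, is a small in-process cache that reuses a token until shortly before it expires. The class name `TokenCache`, the `fetch` callable, and the 30-second safety margin are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Reuse a token until it is within `skew_seconds` of expiry, then refetch."""

    def __init__(
        self,
        fetch: Callable[[], tuple[str, float]],  # hypothetical: returns (token, ttl_seconds)
        skew_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._skew = skew_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            token, ttl = self._fetch()
            self._token = token
            self._expires_at = now + ttl
        return self._token
```

The safety margin keeps the application from presenting a token that expires while the request is in flight; the right value depends on request latency and clock drift.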

(2) Internal customer-service app (Okta-federated)

Identical flow, substituting:

  • IdP: Okta (via Convera Connect App) instead of direct Cognito login.
  • Token enrichment source: DynamoDB instead of RDS.
  • Policy store: different Cedar policies (view-only to basic customer profiles, narrow-field edit for contact preferences, block on sensitive financial data) but same architecture.
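
The access-token enrichment in this flow can be sketched as a Cognito pre-token-generation trigger. The sketch assumes the V2 trigger event, which is the version that allows access-token claims to be customized; `lookup_attributes` is a hypothetical callable standing in for the DynamoDB read, and the claim names are illustrative only.

```python
from typing import Any, Callable


def enrich_access_token(
    event: dict[str, Any],
    lookup_attributes: Callable[[str], dict[str, Any]],  # hypothetical DynamoDB lookup
) -> dict[str, Any]:
    """Add authorization attributes to the access token at issue time."""
    attrs = lookup_attributes(event["userName"])
    event["response"] = {
        "claimsAndScopeOverrideDetails": {
            "accessTokenGeneration": {
                # Values are stringified here to keep the claim shape simple.
                "claimsToAddOrOverride": {k: str(v) for k, v in attrs.items()},
            }
        }
    }
    return event  # Cognito expects the (modified) event to be returned
```

Because the attributes travel inside the token, the authorizer can evaluate policies from claims alone, with no lookup on the request path.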

(3) Service-to-service (machine-to-machine)

Service A → Cognito m2m user pool (client credentials)
          → Access token (w/ service identifier, tier, allowed ops, rate limits)
          → API Gateway → Lambda authorizer
          → AVP (evaluate Cedar policies over service identity, operation, resource, env)
          → Decision cached → Forward to Service B

Per-service policy store in AVP. Same cache layer.
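
The first hop of this flow is a standard OAuth 2.0 client credentials request against the Cognito token endpoint. The sketch below only builds the request so it can be inspected without network access; the domain, client ID, secret, and scope values are placeholders, not values from the post.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    domain: str, client_id: str, client_secret: str, scope: str
) -> urllib.request.Request:
    """Build a client-credentials request for a Cognito user pool domain."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode()
    return urllib.request.Request(
        url=f"https://{domain}/oauth2/token",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Sending the request with `urllib.request.urlopen(req)` returns a JSON body containing `access_token`, `expires_in`, and `token_type`; Service A would then present the access token to API Gateway as a bearer token.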

(4) Multi-tenant SaaS (patterns/per-tenant-policy-store)

Tenant user → Tenant-specific Cognito pool (w/ custom tenant_id attr)
            → Pre-token-generation Lambda → DynamoDB (user → tenant_id mapping)
            → JWT carries tenant_id claim
            → API Gateway → Lambda authorizer
            → DynamoDB lookup: tenant_id → policy-store-id
            → AVP.IsAuthorized(policy_store_id=..., principal, action, resource, context)
            → If Allow:
                → API Gateway forwards request to backend Kubernetes pods
                  w/ tenant_id in custom header
                → Backend re-validates with AVP (zero-trust)
                → Creates tenant context, forwards to RDS
                → RDS (configured to accept only tenant-scoped requests)
                  returns tenant-specific data

The second AVP evaluation in the backend is patterns/zero-trust-re-verification. RDS-side tenant-context enforcement is the last line of data-layer defense.
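
The tenant-indexed part of this flow (resolve tenant_id to a policy-store ID, then ask AVP) can be sketched as one function that both the authorizer and the backend could call. Assumptions are labeled in the code: `lookup_policy_store` is a hypothetical stand-in for the DynamoDB mapping read, the `User` and `Resource` entity type names are invented for illustration, and denying when no mapping exists is a choice made here rather than something the post states.

```python
from typing import Any, Callable, Optional


def authorize_tenant_request(
    claims: dict[str, Any],
    action_id: str,
    resource_id: str,
    lookup_policy_store: Callable[[str], Optional[str]],  # hypothetical DynamoDB lookup
    avp: Any,  # a boto3 "verifiedpermissions" client, or a test double
) -> bool:
    tenant_id = claims.get("tenant_id")
    if not tenant_id:
        return False  # assumption: no tenant claim means deny
    store_id = lookup_policy_store(tenant_id)
    if store_id is None:
        return False  # assumption: unknown tenant means deny
    response = avp.is_authorized(
        policyStoreId=store_id,
        # Entity type names below are hypothetical; only the namespace is from the post.
        principal={"entityType": "convera_connect_authz::User", "entityId": claims["sub"]},
        action={"actionType": "convera_connect_authz::Action", "actionId": action_id},
        resource={"entityType": "convera_connect_authz::Resource", "entityId": resource_id},
        context={"contextMap": {"tenant_id": {"string": tenant_id}}},
    )
    return response["decision"] == "ALLOW"
```

Because each tenant has its own policy store, a request evaluated with the wrong tenant_id is checked against a different set of policies entirely, which is what makes the backend's second call a meaningful check rather than a repeat of the first.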

Cedar policy examples

Three representative shapes from the post:

// UI/API co-evaluated — gate the "Transfer" button AND the transfer API
permit (
    principal,
    action in [MyApp::Action::"ViewTransferButton"],
    resource
) when {
    principal.role == "PAYMENT_INITIATOR" &&
    resource.accountType == "BUSINESS" &&
    resource.status == "ACTIVE"
};

// Multi-tenant: group membership drives principal scope
permit (
    principal in
        convera_connect_authz::userGroup::"ConveraConnect-PAYEE_MGMT",
    action in [convera_connect_authz::Action::"PUT /customer/user/{id}"],
    resource
);

// Multi-tenant: role-set membership (Cedar's .contains is a set operation) + method/path match
permit (
    principal,
    action in [convera_connect_authz::Action::"EDIT"],
    resource
)
when {
    principal.role.contains("UPDATE_USER_STATUS") &&
    resource.type == "PUT" &&
    resource.path == "/customers/user"
};

Cedar as a language is not introduced in this post — it is assumed as the policy surface. See systems/cedar.
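
To show how the attributes referenced in the first policy (`principal.role`, `resource.accountType`, `resource.status`) would reach the policy engine, here is a sketch of the request an authorizer might pass to AVP's `IsAuthorized` API. The entity type names `MyApp::User` and `MyApp::Account` are assumptions; the post shows only the action's namespace.

```python
from typing import Any


def transfer_button_request(
    policy_store_id: str,
    user_id: str,
    role: str,
    account_id: str,
    account_type: str,
    status: str,
) -> dict[str, Any]:
    """Build keyword arguments for the boto3 verifiedpermissions is_authorized call."""
    principal = {"entityType": "MyApp::User", "entityId": user_id}       # hypothetical type
    resource = {"entityType": "MyApp::Account", "entityId": account_id}  # hypothetical type
    return {
        "policyStoreId": policy_store_id,
        "principal": principal,
        "action": {"actionType": "MyApp::Action", "actionId": "ViewTransferButton"},
        "resource": resource,
        # Attributes the `when` clause reads are supplied as entity data.
        "entities": {"entityList": [
            {"identifier": principal, "attributes": {"role": {"string": role}}},
            {"identifier": resource, "attributes": {
                "accountType": {"string": account_type},
                "status": {"string": status},
            }},
        ]},
    }
```

A caller would pass the result as `avp.is_authorized(**request)` and compare the returned `decision` with `"ALLOW"`. For the third policy, `principal.role` would have to be supplied as a set (`{"set": [{"string": "UPDATE_USER_STATUS"}]}`) for `.contains` to apply.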

Operational numbers

  • Thousands of authorization requests per second (reported; distribution not given).
  • Submillisecond latency (with the two-level cache; AVP alone is described only as "millisecond-level").
  • ~60% reduction in time spent on access-management tasks (reported business-productivity metric, methodology not shown).
  • 403 (HTTP Forbidden) on deny.
  • No cold-start / tail-latency distribution given.
  • No policy-volume or per-tenant-policy-count number given.

Why Verified Permissions (not build-in-house)

Per the article, Convera explored building an in-house access-control system and rejected it because policy management, real-time authorization, logging, and auditing would require significant engineering effort on an ongoing basis, not just upfront. Verified Permissions was selected for:

  • Direct integration with Cognito + API Gateway (the existing AuthN + ingress substrate).
  • Cedar policy language flexibility for complex rules.
  • Multi-attribute evaluation (roles, transaction amounts, geographic locations).
  • Millisecond-level decisions.

Caveats

  • Marketing-leaning AWS Architecture Blog format. Written jointly with Convera but reads as a reference-architecture writeup, not an incident-debugging retrospective. Numbers are reported without distribution or baseline.
  • No latency distribution. "Submillisecond" is the only number given; no p50/p90/p99, no cold-start behaviour, no authorizer-cache hit ratio.
  • No policy-volume numbers. How many Cedar policies per store, how many stores in the per-tenant model, how quickly DynamoDB Streams propagate policy changes — all unstated.
  • Policy-store resource quotas are named as a benefit of per-tenant-policy-store but never enumerated. The actual quota values and what happens at quota are not discussed.
  • No failure-mode discussion. What happens when AVP is unreachable? What does the authorizer default to? Is the cache fail-open or fail-closed? Not addressed.
  • "Custom tenant_id" is a Cognito convention, not a Verified Permissions construct. The mapping is application-level via DynamoDB; AVP does not natively know about tenants beyond the policy-store boundary.

Raw article

raw/aws/2026-02-05-how-convera-built-fine-grained-api-authorization-with-amazon-60f3494d.md
