SYSTEM Cited by 1 source

Zalando OIDC Identity Provider

What it is

The Zalando OIDC Identity Provider is Zalando's customer-facing OpenID Connect (OIDC) identity provider, operated by the Customer Authentication Experience team. It issues JWTs for customer authentication and publishes its signing-key material as a JWK Set (JWKS) at the well-known URI accounts.zalando.com/.well-known/jwk_uris.

It is the first canonical wiki instance of a customer-identity OIDC IdP disclosing its full signing-key rotation mechanism in public prose (Source: sources/2025-01-20-zalando-json-web-keys-jwk-rotating-cryptographic-keys-at-zalando).

Why it's load-bearing

Customer JWT verification across Zalando's service fleet depends on every verifier being able to resolve the kid header of any incoming token to a trusted public key. The IdP's JWKS endpoint is the federation trust anchor for the customer-identity graph — compromise of the signing private key would let an attacker "forge fake tokens … [that] could then be used to impersonate users and access sensitive data. Essentially, all tokens signed with the leaked key would become untrustworthy." This single property makes the IdP signing key a tier-4 entry in the long-lived-key risk ladder and forces regular, automated rotation as the structural defence.

Architecture

The JWKS endpoint

  • Surface: https://accounts.zalando.com/.well-known/jwk_uris.
  • Contents: a JWK Set, i.e. a JSON object whose keys member is an array of JWKs (RFC 7517), one per key in the IdP's current active-plus-grace-plus-retired set.
  • Per-entry fields: standard RFC 7517 JWK parameters (kty, kid, use: "sig", alg, plus public key material n/e for RSA or x/y/crv for EC).
  • Cache control: the post emphasises that "cache control headers matter!" The HTTP response advertises a TTL that governs client-side cache lifetime, and every downstream caching layer (OIDC library refresh minimums, CDN cache, intermediate proxy) must be accounted for in the rotation grace period. See concepts/cache-control-aware-grace-period for the full analysis.
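
The verifier-side view of this endpoint can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Zalando's implementation: the function names, the sample kid values, and the placeholder key material are all hypothetical, and summing the TTLs of stacked cache layers is just one conservative way to bound staleness (the post does not say how the layers are combined).

```python
import re
from typing import Optional, Sequence


def find_key(jwks: dict, kid: str) -> Optional[dict]:
    """Return the JWK whose kid matches, or None if it is not published."""
    for jwk in jwks.get("keys", []):
        if jwk.get("kid") == kid:
            return jwk
    return None


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract the max-age directive from a Cache-Control header value."""
    match = re.search(r"(?:^|[,\s])max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def conservative_grace_seconds(layer_ttls: Sequence[int]) -> int:
    """Assumed upper bound on staleness for stacked caches: the sum of TTLs."""
    return sum(layer_ttls)


# Hypothetical JWK Set; kid values and key material are placeholders.
SAMPLE_JWKS = {
    "keys": [
        {"kty": "RSA", "kid": "key-2025-01", "use": "sig", "alg": "RS256",
         "n": "placeholder", "e": "AQAB"},
        {"kty": "EC", "kid": "key-2025-02", "use": "sig", "alg": "ES256",
         "crv": "P-256", "x": "placeholder", "y": "placeholder"},
    ]
}

if __name__ == "__main__":
    print(find_key(SAMPLE_JWKS, "key-2025-02")["alg"])
    print(max_age_seconds("public, max-age=300"))
    # Illustrative TTLs only: client cache, proxy, CDN.
    print(conservative_grace_seconds([300, 60, 600]))
```

A kid that is missing from the cached set is the failure the grace period exists to prevent: a verifier holding a stale copy would reject tokens signed by a key it has not yet seen.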

Signing-key lifecycle

The IdP runs an automated loop implementing the six-phase rotation lifecycle over its JWKS endpoint:

  1. Generate — create a new key pair.
  2. Publish — expose the new public key on the JWKS endpoint; it is visible to clients but not yet signing.
  3. Grace — wait for all caching layers to refresh. Load-bearing knob: see concepts/cache-control-aware-grace-period.
  4. Activate — elect the new key as the active signing key.
  5. Retire — the previous key stops signing but stays published to verify outstanding tokens.
  6. Drop — remove the retired public key once no outstanding token could still be valid; drop time governed by concepts/retirement-plus-lifespan-plus-buffer-formula: drop_time = retirement_time + max_token_lifespan + safety_buffer.
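
The drop-time formula in step 6 is plain date arithmetic. The sketch below works one example. The durations are illustrative assumptions (the source discloses no real values), and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def drop_time(retirement_time: datetime,
              max_token_lifespan: timedelta,
              safety_buffer: timedelta) -> datetime:
    """drop_time = retirement_time + max_token_lifespan + safety_buffer."""
    return retirement_time + max_token_lifespan + safety_buffer


def safe_to_drop(now: datetime, retirement_time: datetime,
                 max_token_lifespan: timedelta,
                 safety_buffer: timedelta) -> bool:
    """True once no token signed by the retired key can still be valid."""
    return now >= drop_time(retirement_time, max_token_lifespan, safety_buffer)


# Illustrative values only: a key retired at noon UTC, tokens that live
# at most one hour, and a fifteen-minute safety buffer.
retired_at = datetime(2025, 1, 20, 12, 0, tzinfo=timezone.utc)
lifespan = timedelta(hours=1)
buffer = timedelta(minutes=15)

if __name__ == "__main__":
    print(drop_time(retired_at, lifespan, buffer).isoformat())
    # 2025-01-20T13:15:00+00:00
```

The last token the retiring key could have signed was issued just before retirement_time, so it expires no later than retirement_time + max_token_lifespan. The buffer absorbs clock skew and similar slack.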

The system-level pattern encoding this loop is patterns/phased-automated-jwk-rotation.
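
One way to read the six phases is as an ordered state machine for a single key, where each phase determines whether the key appears on the JWKS endpoint and whether it signs new tokens. The sketch below encodes that reading. The Phase enum and helper names are hypothetical and only mirror the list above; they are not taken from the IdP's code.

```python
from enum import IntEnum


class Phase(IntEnum):
    GENERATE = 1
    PUBLISH = 2
    GRACE = 3
    ACTIVATE = 4
    RETIRE = 5
    DROP = 6


def is_published(phase: Phase) -> bool:
    """The public key is on the JWKS endpoint from Publish through Retire."""
    return Phase.PUBLISH <= phase <= Phase.RETIRE


def is_signing(phase: Phase) -> bool:
    """Only the active key signs new tokens."""
    return phase == Phase.ACTIVATE


def advance(phase: Phase) -> Phase:
    """Move to the next phase; phases are never skipped or reordered."""
    if phase is Phase.DROP:
        raise ValueError("a dropped key has no next phase")
    return Phase(phase + 1)


if __name__ == "__main__":
    for phase in Phase:
        print(f"{phase.name:<9} published={is_published(phase)} signing={is_signing(phase)}")
```

The ordering guarantees that a key is always published before it signs and stays published after it stops signing, which is what keeps planned rotations invisible to verifiers.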

Four design principles (verbatim from the 2025-01-20 post)

  1. Automation: "New keys are generated and old keys are retired automatically, eliminating manual intervention and ensuring consistency."
  2. Scheduled Rotation: "Keys are rotated on a regular basis to minimize the window of vulnerability."
  3. Secure Key Management: "Our keys are securely stored and managed using industry best practices to protect them from unauthorized access."
  4. Seamless Rotation: "Planned rotations are transparent to clients and do not result in any kind of access revocation or token invalidation."

Principles 1 + 4 are what the lifecycle mechanism operationalises; the six-phase ordering is the minimum viable shape that makes "seamless" possible.

JWT issuance

  • Every JWT issued by the IdP carries a kid header naming the signing key. This is what makes the drop-time formula a pure calculation — verifiers (and the IdP itself) can answer "which key signed this token?" from the token header alone, without state-keeping.
  • The IdP controls exp - iat on issuance, so max_token_lifespan is a publisher-controlled parameter, not a measurement. Combined with kid-in-header, this means "when is it safe to drop retired key K?" is computable at retirement time.

Scope (what's publicly known)

The 2025-01-20 article discloses the mechanism — lifecycle + gates + formula + principles — but no operational numbers:

  • No JWKS max-age value.
  • No rotation cadence.
  • No absolute grace-period duration.
  • No max_token_lifespan value.
  • No traffic / rps / fleet-size framing for the JWKS endpoint.
  • No HSM / KMS specifics for private-key storage.
  • No emergency-rotation procedure.
  • No multi-region IdP topology disclosure.

The implementation is therefore known at the architectural-shape altitude but not at the capacity-planning altitude.

Relation to Zalando's broader identity stack

Zalando's customer identity surface is distinct from its service-to-service authorization surface:

  • Customer identity (this system) is OIDC/JWT-based, issuing tokens to mobile apps, web clients, and third-party integrations that need to act on a customer's behalf.
  • Service-to-service authorization runs through OPA embedded in Skipper — see the sources/2024-12-05-zalando-open-policy-agent-in-skipper-ingress deep-dive. OPA policies can consume JWT claims issued by this IdP as authorization input, but the IdP's signing-key rotation is structurally upstream of the OPA layer.

The two systems share the "platform team owns the how, app teams opt in" ownership split (concepts/platform-team-vs-application-team-split) but operate on different primitives — the IdP owns token issuance + public-key distribution + rotation, the authorization layer owns policy evaluation + decision logging.
