ZALANDO 2021-06-30

Zalando — How we use Kotlin for backend services at Zalando

Summary

Zalando Engineering post (2021-06-30) announcing that the Zalando Tech Radar has moved Kotlin from TRIAL to ADOPT, making it the 3rd JVM language supported at platform tier alongside Java and Scala. The ADOPT promotion is the outcome of a multi-year adoption curve (100+ new applications written in Kotlin in a year), driven by the Zalando Kotlin Guild (250+ members, ~10 core). The post's load-bearing content is not the language promotion itself but the backend-service stack blueprint the Guild published alongside it — the canonical default choices for a new Kotlin backend service at Zalando: Spring Boot on Kubernetes, Gradle build, Ktlint lint, OpenAPI + Zally linter for API-first, Skipper / Fabric Gateway for AuthN/AuthZ, Lettuce for Redis, JPA-or-jOOQ for RDBMS, OpenTracing with systems/opentracing-toolbox for tracing.

Key takeaways

  1. Tech Radar as language-lifecycle governance — Zalando uses a published Tech Radar (opensource.zalando.com/tech-radar) with rings (ASSESS → TRIAL → ADOPT → HOLD) to gate which technologies get central-platform support, infrastructure integration, and template projects. ADOPT status is the signal that a technology is ready for production default. Kotlin moved TRIAL → ADOPT in 2021; the post operationalises that promotion by publishing the default stack. See concepts/tech-radar-language-governance.

  2. Community-driven Guild drives the promotion: the ADOPT promotion was not a top-down platform decision. The Kotlin Guild (250+ members, ~10 core) built the documentation, coding standards, reference projects, and service templates that ADOPT status requires. Inputs: usage-frequency telemetry across the company, expert interviews, external benchmarks, and a company-wide Engineering Community survey on the final recommendations.

  3. Repository-template nudges > policy enforcement — Zalando's internal developer tooling creates new services from a template project carrying out-of-the-box configuration and platform integrations. Teams adapt it, but the template is the carrot that drives consistency without mandates — see patterns/template-project-nudges-consistency.

  4. API-first with OpenAPI + Zally MUST checks in CI: APIs are defined in OpenAPI format (Swagger), listed centrally in an API portal, and linted by Zally (Zalando's open-source API guidelines linter). Many teams, though not necessarily all, configure CI so that the build fails unless every Zally MUST validation (the rules the Zalando API guidelines mark as mandatory) passes. This is how the API-first principle is applied throughout service development.

  5. Skipper filters as the auth enforcement point: AuthN/AuthZ is handled in Skipper filters at the Kubernetes ingress, via one of three options: (a) a direct Skipper filter (oauthTokeninfoAnyScope, etc.), (b) the Route Groups CRD, or (c) Fabric Gateway. The rationale is stated directly: Skipper "is designed to handle a large number of requests and is less likely to be misconfigured than for example Spring security". In other words, a single choke-point gateway is argued to be harder to misconfigure than per-service auth libraries.

  6. Spring Boot as the default, Ktor as a possible successor: Spring Boot is chosen for Kotlin backend services at Zalando for three reasons: a large adoption base, an official Spring-Kotlin integration guide, and compatibility with multiple application servers and with reactive programming (WebFlux). Ktor is flagged as "growing adoption, predicted to gain popularity, possibly with GraalVM". This suggests Ktor sits in an earlier ring (ASSESS or TRIAL) on the internal radar, although the post does not state its ring; Spring Boot stays the default for now.

  7. Gradle over Maven for Kotlin stacks: Zalando standardises on Gradle for Kotlin services for three stated reasons: customisability, build performance, and first-class Kotlin support (build scripts can be written in Kotlin via the Kotlin DSL). Gradle is also the build tool used by the Kotlin compiler itself and by major projects in the Kotlin ecosystem, including Spring Boot.

  8. Library slate (Ktlint / kotlin-logging / Lettuce / JPA / jOOQ) as platform-default choices: for each backend concern the Guild names a default library rather than leaving it to per-team taste:

     • Lint: Ktlint (follows the official Kotlin coding conventions, integrates cleanly with Gradle).
     • Logging: kotlin-logging (automatic logger naming from the class, lazy message evaluation, built on slf4j).
     • Redis: Lettuce (thread-safe, reactive, bundled into spring-boot-starter-data-redis).
     • RDBMS: spring-boot-starter-data-jpa for ORM-default work; jOOQ advised when transactions get complex (can layer on JPA; supports Postgres JSON types natively).

  9. OpenTracing as the tracing substrate, opentracing-toolbox as the integration glue: Zalando invested in OpenTracing platform-wide (see sources/2020-10-07-zalando-how-zalando-prepares-for-cyber-week); this post names opentracing-toolbox as the recommended library, with a dedicated opentracing-kotlin submodule for Spring Boot integration, justified by the role of traces in cross-service linking and automated alerting.
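
The two kotlin-logging properties named in item 8 (logger name taken from the class, lazy message evaluation) can be sketched in plain Kotlin. This is a minimal stand-in, not kotlin-logging's implementation: SketchLogger, Level, and OrderService are hypothetical names invented for the illustration. In kotlin-logging itself the call shape is similar (`KotlinLogging.logger {}` and `logger.debug { ... }`).

```kotlin
// Hypothetical stand-in logger, written only to illustrate the idea.
enum class Level { DEBUG, INFO }

class SketchLogger(val name: String, private val threshold: Level) {
    // Captured output, kept in memory so the behaviour is easy to inspect.
    val lines = mutableListOf<String>()

    // Messages are passed as lambdas, so the string is built only if the level is enabled.
    fun debug(message: () -> String) = log(Level.DEBUG, message)
    fun info(message: () -> String) = log(Level.INFO, message)

    private fun log(level: Level, message: () -> String) {
        if (level >= threshold) lines += "$level $name - ${message()}"
    }
}

class OrderService {
    // The logger name is derived from the class, so it cannot drift from the class name.
    val logger = SketchLogger(OrderService::class.java.simpleName, Level.INFO)

    // Counts how often the costly description is actually computed.
    var expensiveCalls = 0

    private fun describeCart(): String {
        expensiveCalls++
        return "3 items"
    }

    fun checkout() {
        logger.debug { "cart=${describeCart()}" } // DEBUG is below the threshold: lambda never runs
        logger.info { "checkout started" }
    }
}
```

With the threshold set to INFO, calling `OrderService().checkout()` records a single INFO line and never invokes `describeCart()`, which is the cost that lazy evaluation avoids.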

Stack extracted

| Concern | Default |
|---|---|
| Language | Kotlin (ADOPT) |
| Framework | Spring Boot (systems/spring-boot) |
| Orchestration | Kubernetes (systems/kubernetes) |
| Build | Gradle (Kotlin DSL) |
| Lint | Ktlint |
| Logging | kotlin-logging on slf4j |
| API contract | OpenAPI / Swagger |
| API lint | Zally (systems/zally), MUST gates in CI |
| Ingress / AuthN/AuthZ | Skipper (systems/skipper-proxy) + Fabric Gateway (systems/fabric-gateway-zalando) |
| Redis | Lettuce |
| RDBMS | Spring Data JPA; jOOQ for complex transactions |
| Tracing | OpenTracing (systems/opentracing) + opentracing-toolbox (systems/opentracing-toolbox) |
| Bootstrap | Template project in internal developer tooling |
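
As a rough illustration of how several of these defaults could come together in one build script, here is a sketch of a `build.gradle.kts`. It is not Zalando's template project: the plugin and library versions are assumptions chosen to be contemporary with the post, the PostgreSQL driver is an assumed database choice, and the Zally, Skipper, and opentracing-toolbox wiring is left out because the source gives no coordinates or configuration for them.

```kotlin
// build.gradle.kts: illustrative sketch only; all versions below are assumptions.
plugins {
    kotlin("jvm") version "1.5.20"
    kotlin("plugin.spring") version "1.5.20"   // opens Spring-annotated classes for proxying
    kotlin("plugin.jpa") version "1.5.20"      // generates no-arg constructors for JPA entities
    id("org.springframework.boot") version "2.5.2"
    id("io.spring.dependency-management") version "1.0.11.RELEASE"
    id("org.jlleitschuh.gradle.ktlint") version "10.1.0" // adds ktlintCheck / ktlintFormat tasks
}

repositories {
    mavenCentral()
}

dependencies {
    implementation("org.springframework.boot:spring-boot-starter-web")
    // Lettuce is the Redis client this starter pulls in by default.
    implementation("org.springframework.boot:spring-boot-starter-data-redis")
    implementation("org.springframework.boot:spring-boot-starter-data-jpa")
    implementation("org.jetbrains.kotlin:kotlin-reflect")
    implementation("com.fasterxml.jackson.module:jackson-module-kotlin")
    implementation("io.github.microutils:kotlin-logging-jvm:2.0.11")
    runtimeOnly("org.postgresql:postgresql") // assumed database; version managed by Spring Boot
    testImplementation("org.springframework.boot:spring-boot-starter-test")
}

tasks.withType<org.jetbrains.kotlin.gradle.tasks.KotlinCompile> {
    kotlinOptions {
        jvmTarget = "11"
        // Treat Spring's nullability annotations as strict Kotlin types.
        freeCompilerArgs = listOf("-Xjsr305=strict")
    }
}

tasks.withType<Test> {
    useJUnitPlatform()
}
```

Because the script is ordinary Kotlin, it gets the same IDE completion and type checking as service code, which is part of the "build scripts can be written in Kotlin" argument for Gradle.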

Caveats / what is NOT in this post

  • No production numbers on Kotlin performance, memory footprint, startup time, or Kotlin-vs-Java benchmarks inside Zalando. The ADOPT promotion is justified on developer experience (Stack Overflow 2020 survey wanted/dreaded rankings, type inference, null safety, data classes) and on the internal adoption signal (100+ new apps/year), not on runtime metrics.
  • No Ktor production case study — Ktor is flagged as upcoming but no existing service is named.
  • No treatment of Kotlin coroutines vs reactive/WebFlux for async — a notable omission given both are live patterns in the Spring-Kotlin ecosystem.
  • No GraalVM native-image production numbers — referenced as a future path for Ktor + GraalVM but no current adoption reported.
  • "Most services are not directly serving customer traffic" — explicit scoping caveat: the stack recommendations target backend/internal services, not the hot-path customer-facing edge. Hot-path services may legitimately deviate.
  • No treatment of Scala-vs-Kotlin trade-offs despite both being ADOPT-ring JVM languages at Zalando. Implicit message: Kotlin is the default for new backend services; Scala presumably retained for analytics / Spark contexts.
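
The developer-experience features cited in the first caveat (type inference, null safety, data classes) are easier to judge with a small example. The sketch below is generic Kotlin rather than code from the post; Customer and contactLabel are hypothetical names.

```kotlin
// Data class: equals, hashCode, toString, and copy are generated by the compiler.
data class Customer(val id: Long, val email: String?)

fun contactLabel(customer: Customer): String {
    // Type inference: `domain` is inferred as String? without an annotation.
    // Null safety: `?.` is required because `email` is declared nullable.
    val domain = customer.email?.substringAfter('@')
    // The Elvis operator supplies a fallback when the value is null.
    return domain ?: "no email on file"
}
```

Here `contactLabel(Customer(1, "jane@example.com"))` yields `"example.com"`, and a customer with a null email yields the fallback string. Dereferencing `customer.email` without `?.` would be rejected at compile time.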
