Zalando: Launching the Engineering Blog
Summary
Henning Jacobs (Zalando, 2020-06-30) documents the technical setup behind
engineering.zalando.com, Zalando's re-launched engineering blog. The
architectural substance is a concrete "reuse what you have" pattern for
serving a static site with a custom domain and SSL: rather than reach for
CloudFront (the textbook AWS answer for an S3-hosted website), Zalando
reuses the Skipper Kubernetes Ingress proxy already running in front of
140+ Kubernetes clusters, pairing it with External DNS (automatic DNS
records) and the Zalando-incubator Kubernetes Ingress Controller for AWS
(automatic ALB and ACM certificate). A single Skipper route annotation,
`* -> compress() -> setDynamicBackendUrl("http://<BUCKET>.s3-website.<REGION>.amazonaws.com") -> <dynamic>`,
wires the Ingress to the S3 website endpoint (not the REST endpoint),
adds the response compression that the S3 website endpoint does not
provide, and fronts it all with the team's existing HTTP/2 and ACM TLS
termination stack. The rest of the post is a pragmatic tour of the
publishing workflow: Pelican as the Python-templated static site
generator, PR-based content review with preview URLs on the Zalando CD
platform, and a pre-commit Python linter that enforces required
frontmatter keys, correct year/month folder placement, and an explicit
tag allowlist to prevent tag drift (e.g. "Postgres" vs "PostgreSQL").
Load-test numbers: p99 60ms, p50 17ms at 50 req/s; Google PageSpeed
Insights desktop score 100/100.
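To make the route in the summary concrete, here is a minimal Python sketch that assembles the route expression and wraps it in an Ingress annotation. The route expression itself is taken from the text above. The annotation key `zalando.org/skipper-routes`, the route id `blog_s3`, and the bucket and region values are assumptions for illustration and are not confirmed by this summary.

```python
def s3_website_host(bucket: str, region: str) -> str:
    """Host of the S3 website endpoint, in the dotted form quoted above."""
    return f"{bucket}.s3-website.{region}.amazonaws.com"


def skipper_route(bucket: str, region: str) -> str:
    """Match every request, compress the response, proxy to the S3 website endpoint."""
    backend = f"http://{s3_website_host(bucket, region)}"
    return f'* -> compress() -> setDynamicBackendUrl("{backend}") -> <dynamic>'


def ingress_annotations(route_id: str, bucket: str, region: str) -> dict[str, str]:
    """Annotations for the Ingress resource that carries the custom route."""
    # ASSUMPTION: the annotation key and the route id are illustrative;
    # check the Skipper Ingress documentation for the exact key.
    return {
        "zalando.org/skipper-routes": f"{route_id}: {skipper_route(bucket, region)};"
    }


if __name__ == "__main__":
    # Hypothetical bucket name and region.
    for key, value in ingress_annotations(
        "blog_s3", "example-blog-bucket", "eu-central-1"
    ).items():
        print(f"{key}: {value}")
```

Keeping the route in one small function makes the only environment-specific parts, the bucket and region, explicit parameters.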
Key takeaways
- Skipper as static-site proxy, where a single Ingress annotation replaces CloudFront: Zalando's Skipper route `* -> compress() -> setDynamicBackendUrl("http://<BUCKET>.s3-website.<REGION>.amazonaws.com") -> <dynamic>` proxies all requests to the S3 website endpoint, applies response compression at the proxy (S3's website endpoint does not compress responses), and is served behind TLS terminated with an ACM cert provisioned by the Kubernetes Ingress Controller for AWS. "I decided to not use CloudFront as all the required infrastructure for domain+SSL is already in place." This is a textbook static site via ingress proxy to S3 website: reuse existing platform infrastructure rather than onboarding a purpose-built service when the platform already solves TLS, DNS, HTTP/2, and observability.
- S3 website endpoint vs S3 REST endpoint: the `WebsiteConfiguration` property on the `AWS::S3::Bucket` CloudFormation resource enables the URL `http://<BUCKET>.s3-website.<REGION>.amazonaws.com`, which performs index-document rewrites (`index.html` on `/`) and serves custom error pages, but only speaks HTTP (no SSL), so it cannot serve a custom domain over HTTPS by itself. Fronting it with a proxy (CloudFront or, in this case, Skipper) adds TLS and the custom domain.
- Pelican chosen over other SSGs for Jinja templating and Python plugins: given the need to customize (author titles, categorized frontmatter, allowlisted tags), the criterion was "a familiar programming language for templating and for plugins. The static site generator should generate plain HTML and not contain unnecessary features we won't use." StaticGen was used as the comparison board. Pelican is paired with [PostCSS + Tailwind CSS](<../systems/postcss-tailwind.md>) for styling.
- Git-based content workflow (PR → preview URL → merge → publish): Git-as-CMS replaces the previous CMS (which "only a limited number of people had access to" and lacked a review workflow). A `make new` bootstrap script scaffolds a new post; opening a PR triggers the Zalando Continuous Delivery Platform to build (`make html`) and publish a preview under an authenticated URL; merging triggers a re-deploy to the live S3 bucket. The CD Platform has a built-in feature to upload a directory (`output/`) to an S3 bucket, so no custom deploy glue is required.
- `pre-commit` Python linter enforces content invariants. Rules encoded: (1) required meta keys must be present (title, summary, author names), (2) blog-post markdown files must live in the right `year/month` folder, (3) article tags must come from an explicit allowlist to avoid synonymous duplicates (e.g., "Postgres" vs "PostgreSQL"). "Zalando's CI/CD system automatically lints all files by executing `make lint`." The linter runs with `poetry run` so it has access to Pelican as a dependency.
- Tag allowlist as a curation primitive: more a content-quality convention than architecture, but worth naming. Allowing any freeform tag causes synonym explosion and breaks tag-based navigation; restricting tags to an allowlist forces authors to reuse existing terms and makes adding a new term an explicit, reviewed change.
- Performance numbers (p99 60ms at 50 req/s, PageSpeed 100/100 desktop): `vegeta attack -duration=60s` against `engineering.zalando.com` at 50 req/s reports `min 12.418ms, mean 19.751ms, 50 17.049ms, 90 25.05ms, 95 38.382ms, 99 59.958ms, max 244.094ms`, 100% success, all 200s. Google PageSpeed Insights (Lighthouse) reports 100/100 for desktop. The combination of static files, proxy-side compression, and HTTP/2 is enough to reach these results.
- 140+ Kubernetes clusters as the deployment denominator: Zalando had Skipper running in front of every cluster (stated in the post). The blog is just one more `Ingress` resource in one of them. This is the reuse lever: if you have to maintain 140+ clusters anyway, the marginal cost of adding a blog Ingress is close to zero, whereas standing up a new CloudFront distribution has real operational overhead (separate certs, separate monitoring, separate WAF). The architecturally honest observation is that infrastructure-reuse decisions are a function of what is already in place and instrumented.
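The three linter rules listed above can be sketched in a few lines of standard-library Python. This is not Zalando's linter (which, per the post, uses Pelican as a dependency); it is an illustration under stated assumptions: posts start with Pelican-style `Key: value` metadata lines, the `Date` key determines the expected `year/month` folder, and the key names and tag allowlist shown here are hypothetical.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# ASSUMPTION: key names and the allowlist contents are hypothetical.
REQUIRED_KEYS = {"title", "summary", "authors"}
ALLOWED_TAGS = {"PostgreSQL", "Kubernetes", "Python"}

META_LINE = re.compile(r"^([A-Za-z_]+):\s*(.*)$")


def parse_meta(text: str) -> dict[str, str]:
    """Collect leading 'Key: value' lines, stopping at the first other line."""
    meta: dict[str, str] = {}
    for line in text.splitlines():
        match = META_LINE.match(line)
        if not match:
            break
        meta[match.group(1).lower()] = match.group(2).strip()
    return meta


def lint_post(path: str, text: str) -> list[str]:
    """Return a list of human-readable violations for one post."""
    errors: list[str] = []
    meta = parse_meta(text)

    # Rule 1: required meta keys must be present.
    for key in sorted(REQUIRED_KEYS - set(meta)):
        errors.append(f"{path}: missing required meta key '{key}'")

    # Rule 2: the file must live in the year/month folder matching its date.
    if "date" in meta:
        try:
            published = date.fromisoformat(meta["date"][:10])
        except ValueError:
            errors.append(f"{path}: unparseable date '{meta['date']}'")
        else:
            expected = (f"{published.year:04d}", f"{published.month:02d}")
            folder = PurePosixPath(path).parent.parts[-2:]
            if folder != expected:
                errors.append(
                    f"{path}: expected folder {'/'.join(expected)}, "
                    f"found {'/'.join(folder)}"
                )

    # Rule 3: every tag must come from the allowlist.
    for tag in (t.strip() for t in meta.get("tags", "").split(",")):
        if tag and tag not in ALLOWED_TAGS:
            errors.append(f"{path}: tag '{tag}' is not in the allowlist")

    return errors


if __name__ == "__main__":
    sample = "Title: Hello\nSummary: s\nAuthors: A\nDate: 2020-06-30\nTags: Postgres\n\nBody"
    for problem in lint_post("content/posts/2020/07/hello.md", sample):
        print(problem)
```

Returning a list of messages rather than raising on the first problem lets a `make lint` style wrapper report every violation in one run and exit non-zero if the list is non-empty.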
Operational numbers
- Fleet: 140+ Kubernetes clusters at Zalando, all with Skipper as the default Ingress proxy.
- Load test: 50 req/s × 60s = 3000 requests, 100% success, all 200 status codes.
- Latency: min 12.4ms · mean 19.8ms · p50 17.0ms · p90 25.1ms · p95 38.4ms · p99 60.0ms · max 244.1ms.
- Page weight: mean 17.1 KB per response (compressed).
- PageSpeed Insights (Lighthouse desktop): 100/100.
- TLS: TLSv1.2, `ECDHE-RSA-AES128-GCM-SHA256`, ACM-issued cert.
- Compression: `content-encoding: deflate` reported in curl output, supplied by Skipper's `compress()` filter (the S3 website endpoint does not compress).
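As a quick sanity check on the load-test figures above, the following sketch recomputes the request count and derives rough bandwidth numbers. Only the rate, duration, and mean response size come from the list; the throughput and total-transfer values are derived estimates, not measurements from the post, and the conversion assumes 1 MB = 1000 KB.

```python
# Figures taken from the list above.
RATE_PER_SECOND = 50
DURATION_SECONDS = 60
MEAN_RESPONSE_KB = 17.1

# Derived values (estimates, not reported in the source).
total_requests = RATE_PER_SECOND * DURATION_SECONDS
throughput_kb_per_s = RATE_PER_SECOND * MEAN_RESPONSE_KB
total_transfer_mb = total_requests * MEAN_RESPONSE_KB / 1000  # assumes 1 MB = 1000 KB

print(f"requests:   {total_requests}")
print(f"throughput: {throughput_kb_per_s:.1f} KB/s")
print(f"transfer:   {total_transfer_mb:.1f} MB")
```

At under 1 MB/s of response traffic, the test exercises latency far more than bandwidth.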
Systems and concepts surfaced
- Systems: Skipper (Kubernetes Ingress proxy, Zalando-authored); Pelican (Python static site generator); External DNS (Kubernetes → DNS provider sync); Kubernetes Ingress Controller for AWS (Zalando-incubator ALB/ACM automation); AWS S3 with `WebsiteConfiguration`; AWS ALB; ACM; CloudFormation; pre-commit.
- Concept (Git-based content workflow): treat the blog as a git repository; PR for review, preview URL from CD, merge to publish. Applies far beyond blogs: documentation sites, runbooks, and config repos all fit.
- Pattern (static site via ingress proxy to S3 website): a Kubernetes-native alternative to CloudFront + S3 for shops that already run an ingress platform with TLS/DNS automation. Trades CDN edge caching for reuse of the existing observability/auth/WAF stack.
Caveats
- No CDN means no edge caching: all traffic hits Skipper pods in the operator's clusters and round-trips to the `s3-website` endpoint in the bucket's region. That's fine at blog-level scale (the load test ran at 50 req/s) but would be a poor choice for a globally-distributed high-traffic property. CloudFront + S3 is still the right answer when edge caching actually matters.
- S3 website endpoint is public: the Skipper route forwards directly to the S3 website URL, which means the bucket is `AccessControl: PublicRead`. This is intentional for a public blog but wouldn't work for gated content; for private static sites, CloudFront with OAC (Origin Access Control) or S3 presigned URLs are the right primitives.
- Tier-2 meta-post: this is a launch/infrastructure-for-our-blog post, not a deep internals piece. The architectural takeaways are narrow but concrete. The bulk of the post is publishing workflow, Pelican customization, and Jinja templates, which is out of scope for the sysdesign wiki's usual depth.
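The public-bucket caveat is easier to see with the bucket definition written out. The sketch below builds a CloudFormation template (as JSON) for an `AWS::S3::Bucket` with `WebsiteConfiguration` and `AccessControl: PublicRead`, the two properties named in this page. The logical ID `BlogBucket`, the bucket name, and the `error.html` error document are hypothetical; the post's actual template is not reproduced here.

```python
import json


def website_bucket_template(
    bucket_name: str,
    index_document: str = "index.html",
    error_document: str = "error.html",  # hypothetical file name
) -> dict:
    """CloudFormation template for a publicly readable S3 website bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "BlogBucket": {  # hypothetical logical ID
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketName": bucket_name,
                    # Public read access: required because the proxy forwards
                    # unauthenticated requests to the website endpoint.
                    "AccessControl": "PublicRead",
                    # Enables the s3-website endpoint with index and error pages.
                    "WebsiteConfiguration": {
                        "IndexDocument": index_document,
                        "ErrorDocument": error_document,
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(website_bucket_template("example-blog-bucket"), indent=2))
```

Anything placed in such a bucket is readable by anyone who knows the endpoint URL, which is why this setup suits a public blog and not gated content.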
Source
- Original: https://engineering.zalando.com/posts/2020/07/launching-the-engineering-blog.html
- Raw markdown: `raw/zalando/2020-06-30-launching-the-engineering-blog-73265da9.md`