Zalando — Maps with PostgreSQL and PostGIS¶
Summary¶
Nikolai Averkiev (Zalando Postgres Operator team, 2021-12-01)
shows how to serve geospatial data from
PostgreSQL to a browser map using three components: a
PostGIS-enabled database, a lightweight
pg_tileserv HTTP server that turns
PostGIS tables and SQL functions into
Mapbox Vector Tiles on demand, and
a Leaflet.js front-end with an
OpenStreetMap basemap. The whole
stack runs on Kubernetes, with the
database provisioned by Zalando's
Postgres Operator — one
short YAML manifest declares a 2-replica Postgres 14 cluster
with the PostGIS extension pre-installed in a named schema
via preparedDatabases. The article's architectural claim:
"these days PostGIS can take over most of the middleware's
job and produce map tiles for you" — collapsing what used to
be a bespoke tile-pipeline service into the database itself.
A second half walks through a dynamic hexagon-grid heatmap
built as a single PostGIS SQL function that reads the
Web-Mercator tile envelope, generates a hex grid via
ST_HexagonGrid, spatially joins against a Eurostat 1 km²
population grid, and returns an ST_AsMVT-encoded blob — a
worked instance of the
database-as-tile-server pattern where the rendering
pipeline is a SQL query.
Key takeaways¶
-
PostGIS replaces the tile-pipeline middleware. Until recently, serving map tiles from a spatial database required a bespoke middleware that built the tile grid, generalised geometry per zoom level, and served pre-rendered tiles. PostGIS can now produce Mapbox Vector Tiles directly via
ST_AsMVT; all that's left is a lightweight HTTP shim between the browser and the database. "You only need a lightweight server between the frontend that takes in requests from the map and sends queries to your spatial database to produce the tiles you want." (Source: this article.) This is the database-as-tile-server pattern — systems/pg-tileserv from Crunchy Data is named as such a shim. -
Tile URLs are
{BASE_URL}/tileserver/schema.table/{z}/{x}/{y}.pbf.Zis the zoom level;X/Yare tile coordinates. systems/leaflet'sVectorGridclass consumes these Protocol-Buffer-encoded tiles. Extra thematic columns beyond the geometry are packed into the MVT so the browser can restyle layers without another database round-trip. -
SQL-function tile layers as well as table layers. systems/pg-tileserv exposes not only tables but also
(z integer, x integer, y integer)-signatured PL/SQL functions that returnbytea(the raw PBF blob). The worked example generates a hexagon grid over the tile envelope viaST_HexagonGrid,LATERAL-joins it to Eurostat's 1 km² population grid transformed to WGS84, and aggregates per hex. Thesum(p.popcount) * 0.5line is flagged by the author as "oversimplified, of course" — naïve any-intersection double-counts; a centroid-point join would be more precise. (Source: this article.) -
Zoom-dependent heatmap granularity falls out for free. Because the grid is generated per-tile from the tile envelope with cell size
(ST_XMax − ST_XMin) / pow(2, step), hexagons shrink as the user zooms in. The map legend adapts its scale to the current zoom; coarse cells avoid "overwhelming the observer when the full picture is shown", fine cells guide to points of interest when zoomed in. (Source: this article.) Canonical example of zoom-dependent rendering driven by database-side computation rather than client-side filtering or pre-rendered multi-zoom tile sets. -
The full Postgres-on-Kubernetes deploy is one YAML manifest. The Postgres Operator takes a
Postgresqlcustom resource (2 replicas, Postgres 14, 10 GiB volume) with apreparedDatabasesfield that declaratively requests a database, a schema, a set of default roles (reader / writer / owner), and a list of extensions to install into a given schema (extensions: postgis: geo). The operator does the rest — StatefulSet, Services, Secrets, Patroni HA, extension installation. The image underneath is systems/spilo, Zalando's Patroni+Postgres container which ships PostGIS pre-compiled. -
pg_tileserv runs as a single Deployment, one replica, behind the Zalando oauth2-proxy. Configuration via TOML mounted from a
ConfigMap;BasePath = "/tileserver/"overridden so the service can live at/tileserverbehind the platform's auth layer.DATABASE_URLpoints at the operator-managed Service (acid-geo:5432);PGPASSWORDreads from the operator-createdmap_db_reader_usersecret. (Source: this article.) Reuses two platform-primitive pieces (auth proxy + operator-managed secrets) to keep the shim deployment-manifest ~20 lines. -
Static Leaflet app is just nginx on Ubuntu. The frontend is served from a second Deployment — a plain nginx container delivering static HTML + Leaflet JS. Divided architecture: data-plane tiles from pg_tileserv → Postgres, presentation-plane HTML/JS from nginx.
-
ogr2ogr (GDAL) is the canonical importer. Eurostat's NUTS polygons (2021) and 1 km² population grid (Geostat 2018) loaded via
ogr2ogrcommand-line. Named as the go-to tool for arbitrary geodata formats — the import-side equivalent of pg_tileserv on the export side.
Systems extracted¶
- systems/postgis — the PostgreSQL geospatial extension.
"The open-source database system PostgreSQL is used by
many teams and it offers a geospatial component called
PostGIS… widely accepted in the industry as the de facto
standard to manage geospatial data." Provides the spatial
types (
geometry,geography), spatial indexes, and the tile-production functionsST_AsMVT,ST_AsMVTGeom,ST_TileEnvelope,ST_HexagonGrid,ST_Transform,ST_Centroid. - systems/pg-tileserv — Crunchy Data's lightweight HTTP
vector-tile server for PostGIS. Exposes tables and PL/SQL
functions as
/{schema}.{object}/{z}/{x}/{y}.pbfendpoints. Configuration via TOML; deployment as a single container. - systems/leaflet — the JavaScript map library used for
the browser UI.
VectorGridclass consumes.pbftiles. - systems/openstreetmap — the basemap layer source — Zalando's "wiki-style free alternative to commercial map providers".
- systems/zalando-postgres-operator — Zalando's
open-source Kubernetes operator for Postgres. Creates
StatefulSet + Services + Secrets from a
PostgresqlCR; this article is the first wiki instance of itspreparedDatabasesfeature (declarative database + schema + roles + extensions in the CR). - systems/spilo — the Patroni+Postgres Docker image the operator uses; ships with PostGIS pre-installed.
- systems/kubernetes — the deployment substrate.
Concepts extracted¶
- concepts/vector-tiles — Mapbox Vector Tile (MVT) format; the tile grid plus per-tile Protocol Buffer blobs streamed on demand as the user pans/zooms. First wiki canonical page.
Patterns extracted¶
- patterns/database-as-tile-server-middleware-replacement — the collapse of a bespoke tile-pipeline middleware service into the spatial database itself, with only a thin HTTP shim in front. Generalises beyond PostGIS: any database with per-tile geometry generation functions can eat its tile server. First wiki canonical pattern.
- patterns/postgres-extension-over-fork — extended: PostGIS is the canonical third-party instance of this pattern (Aurora DSQL was the first). PostGIS has been developed as a Postgres extension since 2001, not as a Postgres fork.
Operational numbers and architectural disclosures¶
- Cluster shape:
numberOfInstances: 2, Postgres 14,volume.size: 10Giin thePostgresqlmanifest. Sized for the geodata demo, not production traffic. - pg_tileserv Deployment:
replicas: 1, podcontainerPort: 7800/TCP, no resources block shown ("leaving out the resources section in this example"). - Dataset: European NUTS polygons (2021 Eurostat release) — "over 100,000 polygons" on top of a basemap would slow map navigation without tile-splitting. Population grid: Geostat 2018 1 km² cells.
- Tile URL template:
${BASE_URL}/tileserver/geo.boundaries_europe/{z}/{x}/{y}.pbf—geoschema,boundaries_europetable, concrete wire format Zalando runs. - Hex-grid sizing formula:
(ST_XMax(b.geom) - ST_XMin(b.geom)) / pow(2, step)withstepdefault 4 — controls heatmap granularity per tile. Coarser at small zoom, finer at large zoom. - preparedDatabases schema/role set: operator creates
database
map_db, schemageo(fromschemas: geo: {}), extensionpostgispinned to schemageo, and default users (reader / writer / owner) whendefaultUsers: true.
Caveats and limits¶
- The pg_tileserv demo replica count is 1 — "A single
instance" is what the manifest shows. Production would
need
replicas: N+ pod-disruption budgets + the usual Deployment hardening. The article is an architecture walk-through, not a hardened production recipe. - Naïve spatial join oversimplifies population counts:
the
sum(p.popcount) * 0.5approximation flagged by the author double-counts along hex boundaries. Fix: join againstST_Centroid-derived points so each grid cell lands in exactly one hex. - No performance numbers disclosed. No QPS / latency /
percentile data on
ST_HexagonGrid+ spatial-join per tile. In practice the LATERAL join overST_HexagonGridcan become the bottleneck at small zoom levels (large tile envelopes → many hexes → larger population-grid intersections); article doesn't cover materialising a pre-computed hex grid as a table, nor tile caching in front of pg_tileserv. - No multi-region / replication / CDN layer discussed.
Deployment is single-cluster, single-replica pg_tileserv,
single-pod nginx for the Leaflet app. Tile caching
(e.g. a CDN in front of
/tileserver/…for cacheable table-layer tiles) is the obvious next step not taken here. - Auth model:
pg_tileservitself exposes all configured schema objects by default; Zalando's oauth2-proxy is the only auth layer. Fine-grained per-layer authorisation is not shown. - Not a production post: no incident retrospective, no SLO, no peak-load scenario. The piece is an architecture-tutorial by the Postgres Operator team canonicalising a stack choice, paired with a recruiting CTA for Zalando Data Engineer roles at the end.
Zalando axis placement¶
This is the tenth canonical Zalando axis on the wiki:
Postgres-on-Kubernetes / geospatial stack. It extends the
existing Postgres-on-Kubernetes / kernel-level latency
axis (opened 2020-06-23 with the PgBouncer-on-Kubernetes
post) from connection-pooling at kernel altitude to
extension-driven application-altitude use cases — the
same Postgres Operator team, now demonstrating
preparedDatabases + PostGIS as a productivity-level
developer experience on top of the same Kubernetes
substrate. Connects to the broader
Postgres extension
over fork pattern anchored on the Aurora DSQL source
from AllThingsDistributed.
Source¶
- Original: https://engineering.zalando.com/posts/2021/12/maps-with-postgresql-and-postgis.html
- Raw markdown:
raw/zalando/2021-12-01-maps-with-postgresql-and-postgis-9167896f.md
Related¶
- companies/zalando — the publisher.
- systems/postgresql · systems/postgis · systems/pg-tileserv · systems/leaflet · systems/openstreetmap · systems/zalando-postgres-operator · systems/spilo · systems/kubernetes.
- concepts/vector-tiles — the format and the zoom-dependent rendering discipline this article canonicalises.
- patterns/database-as-tile-server-middleware-replacement — the architectural claim that collapses the tile pipeline into Postgres.
- patterns/postgres-extension-over-fork — PostGIS is the canonical third-party instance.
- sources/2020-06-23-zalando-pgbouncer-on-kubernetes-minimal-latency — the prior Zalando Postgres-on-Kubernetes post from the same team, kernel-latency altitude complement.