PATTERN Cited by 1 source
Fleet-wide methodology via CLI¶
What it is¶
Rather than documenting an operational methodology as a wiki page, a runbook, or a vendor dashboard, the organisation packages the methodology as an executable command-line utility that reads the relevant observability APIs, applies the methodology's thresholds, and emits a single consolidated report. The CLI becomes the methodology's distribution channel: every team runs the same utility and gets the same signals, eliminating per-team drift in what "healthy" means.
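The core loop of such a utility can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the code of any particular utility: the `SignalSource` interface, the `Threshold` and `Finding` types, the signal name `cpu_percent`, and the limit of 60 are all hypothetical placeholders. The point is that thresholds and evaluation logic live in one place, and the metrics backend sits behind an interface.

```go
package main

import "fmt"

// Threshold is one rule of the methodology. The names and limits used in
// this sketch are hypothetical, not values from any real methodology.
type Threshold struct {
	Signal string
	Max    float64
}

// SignalSource abstracts the observability API the utility reads from.
type SignalSource interface {
	Latest(instance, signal string) (float64, error)
}

// Finding is one row of the consolidated report.
type Finding struct {
	Instance string
	Signal   string
	Value    float64
	Healthy  bool
}

// Evaluate applies every threshold to every instance and returns one report.
func Evaluate(src SignalSource, instances []string, rules []Threshold) ([]Finding, error) {
	var report []Finding
	for _, inst := range instances {
		for _, r := range rules {
			v, err := src.Latest(inst, r.Signal)
			if err != nil {
				return nil, fmt.Errorf("%s/%s: %w", inst, r.Signal, err)
			}
			report = append(report, Finding{
				Instance: inst,
				Signal:   r.Signal,
				Value:    v,
				Healthy:  v <= r.Max,
			})
		}
	}
	return report, nil
}

// staticSource is an in-memory stand-in for a real metrics API.
type staticSource map[string]float64

func (s staticSource) Latest(instance, signal string) (float64, error) {
	v, ok := s[instance+"/"+signal]
	if !ok {
		return 0, fmt.Errorf("no data")
	}
	return v, nil
}

func main() {
	src := staticSource{"db-a/cpu_percent": 35, "db-b/cpu_percent": 92}
	rules := []Threshold{{Signal: "cpu_percent", Max: 60}}

	report, err := Evaluate(src, []string{"db-a", "db-b"}, rules)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range report {
		fmt.Printf("%s %s=%.0f healthy=%t\n", f.Instance, f.Signal, f.Value, f.Healthy)
	}
}
```

Because every team runs the same `Evaluate` with the same rule set, two teams looking at the same instance cannot disagree about whether it is healthy.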
Canonical example¶
Zalando's systems/rds-health — an open-source Go CLI that automates the 12-golden-signals methodology (concepts/golden-signals-rds) across an entire AWS RDS fleet.
(Source: sources/2024-02-19-zalando-twelve-golden-signals.)
The utility's own description makes the pattern explicit: "a frontend for AWS APIs that simply automates analysis of discussed golden signals across your accounts and regions" with features:
- Show configuration of all AWS RDS instances and clusters.
- Check health of all AWS RDS deployments.
- Conduct capacity planning for RDS deployments.
AWS already ships the underlying observability surface (CloudWatch, Performance Insights). The gap rds-health fills is a single-invocation fleet-wide evaluation of a specific methodology — not raw data, not per-instance deep dives.
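One way to picture "single-invocation, fleet-wide" is a fan-out over account/region pairs whose results are merged into one ordered report. The sketch below is an assumption about how such a scan could be structured, not a description of how rds-health is implemented; `Target`, `Result`, `CheckFunc`, and `ScanFleet` are hypothetical names, and the per-target check is stubbed out rather than calling any AWS API.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Target identifies one account/region pair to scan.
type Target struct {
	Account string
	Region  string
}

// Result is the per-target summary that ends up in the consolidated report.
type Result struct {
	Target    Target
	Unhealthy int
	Err       error
}

// CheckFunc evaluates the methodology for one target (hypothetical hook).
type CheckFunc func(Target) (unhealthy int, err error)

// ScanFleet runs check for every target concurrently and returns the results
// sorted by account, then region, so the report is stable between runs.
func ScanFleet(targets []Target, check CheckFunc) []Result {
	results := make([]Result, len(targets))
	var wg sync.WaitGroup
	for i, t := range targets {
		wg.Add(1)
		go func(i int, t Target) {
			defer wg.Done()
			n, err := check(t)
			// Each goroutine writes to its own slot, so no lock is needed.
			results[i] = Result{Target: t, Unhealthy: n, Err: err}
		}(i, t)
	}
	wg.Wait()

	sort.Slice(results, func(a, b int) bool {
		if results[a].Target.Account != results[b].Target.Account {
			return results[a].Target.Account < results[b].Target.Account
		}
		return results[a].Target.Region < results[b].Target.Region
	})
	return results
}

func main() {
	targets := []Target{
		{Account: "222222222222", Region: "eu-west-1"},
		{Account: "111111111111", Region: "eu-central-1"},
	}
	// Stub check: a real utility would query the observability APIs here.
	stub := func(t Target) (int, error) { return 0, nil }

	for _, r := range ScanFleet(targets, stub) {
		fmt.Printf("%s %s unhealthy=%d err=%v\n",
			r.Target.Account, r.Target.Region, r.Unhealthy, r.Err)
	}
}
```

Keeping a failed target in the report with its error, rather than aborting the whole run, means one inaccessible account does not hide the state of the rest of the fleet.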
Why CLI, not dashboard / not wiki page¶
- Wiki page problem: methodology-as-documentation is read once, bookmarked, and forgotten. Teams fork thresholds in their own scripts. Drift is inevitable.
- Dashboard problem: per-team-created dashboards have the same fragmentation problem. A "canonical" dashboard built by a platform team can be copied and customised, losing the standardisation.
- CLI is executable truth: the methodology's thresholds, signal names, and evaluation logic are all in one repository. When the methodology updates, the CLI updates, and every team picks it up by upgrading. The version of the methodology in use is observable (--version). Customisation happens by fork or configuration, not by silent drift.
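The "observable version, explicit customisation" idea can be sketched with Go's standard `flag` package. Everything named here is an assumption for illustration: the utility name `fleetcheck`, the version string, the `cpu_percent` default of 60, and the `--cpu-max` override flag are hypothetical and are not taken from rds-health. The sketch stamps the methodology version on every report and prints any override, so a deviation from the defaults is visible in the output rather than buried in a team's private script.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// MethodologyVersion is a hypothetical version string baked in at build time.
const MethodologyVersion = "1.0.0"

// defaultCPUMax is a hypothetical default threshold for this sketch.
const defaultCPUMax = 60.0

// render parses the arguments and returns the report header it would print.
func render(args []string) (string, error) {
	var out strings.Builder

	fs := flag.NewFlagSet("fleetcheck", flag.ContinueOnError)
	fs.SetOutput(&out) // usage and parse errors go into the same output
	showVersion := fs.Bool("version", false, "print the methodology version and exit")
	cpuMax := fs.Float64("cpu-max", defaultCPUMax, "override the cpu_percent threshold")
	if err := fs.Parse(args); err != nil {
		return out.String(), err
	}

	// The version line is always emitted, so every report is traceable.
	fmt.Fprintf(&out, "methodology %s\n", MethodologyVersion)
	if *showVersion {
		return out.String(), nil
	}

	// Overrides are allowed, but they are announced rather than silent.
	if *cpuMax != defaultCPUMax {
		fmt.Fprintf(&out, "override: cpu_percent max %g (default %g)\n", *cpuMax, defaultCPUMax)
	}
	return out.String(), nil
}

func main() {
	out, err := render(os.Args[1:])
	fmt.Print(out)
	if err != nil {
		os.Exit(2)
	}
}
```

Invoked with `--version`, the sketch prints only the methodology line; invoked with `--cpu-max 75`, it prints the methodology line followed by the override notice.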
This is the operational analogue of the "tests as executable specification" insight from application code — executable artefacts don't drift the way prose does.
Open-sourcing the methodology distributes it¶
Zalando released rds-health publicly on GitHub rather than keeping it internal. This mirrors their approach with Postgres Operator and Skipper: the CLI becomes a reference implementation that other organisations can adopt, contribute back to, and evolve together. The methodology itself travels further as an OSS utility than as a blog post or internal wiki.
When to apply the pattern¶
Indicators the pattern fits:
- Fleet scale — the number of instances/services/workloads being monitored is large enough that per-instance care doesn't scale.
- Heterogeneous teams — the people running the workloads are many and organisationally dispersed.
- Existing observability surface — the raw signals are already collected by a platform (CloudWatch, Datadog, Prometheus); the gap is evaluation, not collection.
- Calibrated thresholds — the org has enough production incidents to have derived thresholds from empirical experience, not first principles.
Indicators it doesn't fit: small fleet (per-instance dashboards work); centralised ops team (the distribution problem doesn't exist); immature thresholds (codifying guesses is worse than leaving them to judgement).
Relationship to adjacent patterns¶
- patterns/dogfood-as-adoption-proof — Zalando's pattern of internal adoption proving a utility before external promotion. rds-health was presumably used internally before open-sourcing.
- patterns/unified-sre-team-over-federated — a structural sibling at the organisational altitude: one SRE team across many product teams mirrors one utility across many database owners.
Seen in¶
- sources/2024-02-19-zalando-twelve-golden-signals — canonical instance: Zalando packages the 12-golden-signals methodology as rds-health, distributed as OSS at github.com/zalando/rds-health.
Related¶
- concepts/database-fleet-standardisation — the motivating problem
- concepts/golden-signals-rds — the specific methodology being packaged
- systems/rds-health — the instance of the pattern
- systems/aws-performance-insights — the API the utility queries
- patterns/dogfood-as-adoption-proof · patterns/unified-sre-team-over-federated
- companies/zalando