PATTERN Cited by 1 source
jsonschema-validated config at commit and CI
Problem
A YAML or JSON configuration file with a misspelled field name or an unknown key often causes silent semantic bugs: the consuming application looks for `metadata`, finds only `metadpata`, treats the field as missing, and proceeds with a default. That default is frequently "no entries" or "empty set", which on destructive code paths collapses into "every entry" (see supertool collapse-to-all).
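The failure mode can be reproduced in a few lines. The sketch below is illustrative only: `load_targets`, `stacks_to_delete`, the `targets` field, and the stack names are hypothetical and not taken from Zalando's tooling, and the collapse from "empty filter" to "everything" is modeled in the simplest possible way.

```python
ALL_STACKS = ["stack-a", "stack-b", "stack-c"]  # hypothetical inventory


def load_targets(config: dict) -> list[str]:
    """Read the target filter; a missing key silently becomes an empty list."""
    return config.get("metadata", {}).get("targets", [])


def stacks_to_delete(targets: list[str]) -> list[str]:
    """An empty filter is interpreted as 'no restriction', i.e. every stack."""
    if not targets:
        return list(ALL_STACKS)
    return [stack for stack in ALL_STACKS if stack in targets]


good = {"metadata": {"targets": ["stack-a"]}}
typo = {"metadpata": {"targets": ["stack-a"]}}  # one stray letter

print(stacks_to_delete(load_targets(good)))  # ['stack-a']
print(stacks_to_delete(load_targets(typo)))  # ['stack-a', 'stack-b', 'stack-c']
```

Nothing in this code raises or logs: the misspelled key is simply never read, which is why the error has to be caught before the consumer runs.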
A single enforcement point does not provide enough redundancy:
- IDE-only validation: developers without the plugin, or who ignore the red squiggle, bypass it entirely.
- Pre-commit only: `git commit --no-verify` bypasses it; direct pushes from web UIs bypass it.
- CI only: PRs that are merged before CI runs, emergency hot-fix paths, or CI that's temporarily disabled bypass it.
Any one enforcement point can be bypassed; typos slip through.
Solution
Declare the config's shape once as a JSON Schema, then enforce the schema at three redundant points:
- Developer IDE — schema-driven autocompletion and live validation as the developer types. Catches typos at the earliest possible moment.
- Local pre-commit hook — refuses to produce a commit with invalid config. Catches what the IDE missed (or what developers dismissed).
- CI pipeline — the same schema check runs server-side on every push. Catches what pre-commit was bypassed on.
The load-bearing design choice is that all three enforcement points use the same JSON Schema file. JSON Schema is portable and language-agnostic, with validator implementations available in every mainstream language, so the three checks cannot drift apart.
Zalando's canonical instance
From the 2024-01 metadpata postmortem:
"We have set up jsonschema validation for all our configuration files. All these checks run both locally (thanks to pre-commit hooks) and in the CI/CD pipelines. We also did some small quality of life improvements to enable autocompletion and schema validation in our local IDEs, which mitigates the possibility of typos and errors and is simple to set up:
```
# yaml-language-server: $schema=schema/config_schema.json
(your config)
```
" (sources/2024-01-22-zalando-tale-of-metadpata-the-revenge-of-the-supertools)
Three enforcement points, one schema (schema/config_schema.json):
- IDE: the `# yaml-language-server:` comment at the top of each YAML file points the Red Hat YAML Language Server at the schema; any LSP-aware editor (VSCode, Neovim with coc/lsp, IntelliJ, etc.) lights up with autocompletion and validation.
- Pre-commit: a pre-commit hook runs the schema check against staged YAML.
- CI/CD: the same check runs in the pipeline, where client-side `--no-verify` flags cannot circumvent it.
Mechanism
```
# Schema lives in the repo
schema/
  config_schema.json    # JSON Schema with additionalProperties: false
                        # at the right levels

# Every YAML config points at the schema
configs/
  account-a.yaml        # has "# yaml-language-server: $schema=..."
  account-b.yaml
  ...
```

```yaml
# .pre-commit-config.yaml invokes a schema checker
repos:
  - repo: https://github.com/adrienverge/yamllint
    rev: <tag>          # pin to a released yamllint tag
    hooks:
      - id: yamllint
  - repo: local
    hooks:
      - id: jsonschema-check
        name: validate configs against schema
        entry: python scripts/check_schema.py
        language: system  # assumes python and the script's dependencies are installed
        files: ^configs/.*\.yaml$
```

```yaml
# CI runs the same command
# .github/workflows/validate.yml
- name: validate configs
  run: python scripts/check_schema.py configs/
```
The same script runs locally (via pre-commit) and in CI. IDE validation is handled natively by the YAML Language Server via the schema comment.
Why it would have prevented metadpata
The `metadpata` typo is exactly the failure shape this pattern covers: a YAML key that doesn't match any declared property, which `additionalProperties: false` would reject. All three enforcement points would have caught it:
- IDE: red squiggle under `metadpata` as soon as it's typed.
- Pre-commit: `git commit` refused with "property `metadpata` not permitted".
- CI: pipeline fails; PR cannot merge.
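To make the rejection concrete, the sketch below implements only the one rule that matters here: reporting keys that a schema with `additionalProperties: false` does not declare. It is a deliberately tiny stand-in for a real JSON Schema validator, not a replacement for one. The schema contents (including the `owner` field) and the `unknown_keys` helper are assumptions made for illustration, since the real `config_schema.json` is not shown in the source.

```python
SCHEMA = {  # hypothetical shape, for illustration only
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "metadata": {
            "type": "object",
            "additionalProperties": False,
            "properties": {"owner": {"type": "string"}},
        },
    },
}


def unknown_keys(instance, schema, path="") -> list[str]:
    """List keys rejected by additionalProperties: false, at any nesting level."""
    if not isinstance(instance, dict):
        return []
    declared = schema.get("properties", {})
    found = []
    for key, value in instance.items():
        where = f"{path}/{key}"
        if key in declared:
            # Known key: descend using the sub-schema declared for it.
            found.extend(unknown_keys(value, declared[key], where))
        elif schema.get("additionalProperties") is False:
            found.append(where)
    return found


print(unknown_keys({"metadpata": {"owner": "team-a"}}, SCHEMA))  # ['/metadpata']
print(unknown_keys({"metadata": {"ownr": "team-a"}}, SCHEMA))    # ['/metadata/ownr']
```

The keyword is not inherited by nested objects: removing `additionalProperties: false` from either level of this schema makes the corresponding call return an empty list.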
Zalando names this directly: the change "mitigates the possibility of typos and errors."
Prerequisites
- Schema actually written and maintained. A validator with no schema catches only YAML syntax errors.
- `additionalProperties: false` at the relevant object levels in the schema, to catch unknown keys. Without it, typos create new silent fields.
- Every config file has the `# yaml-language-server: $schema=…` comment (team convention), or editor config maps schemas to file globs.
- Schema kept in the same repo so edits to the schema and configs stay atomic.
- Schema kept current with code. If the consumer accepts a field the schema doesn't declare, the schema blocks valid configs.
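The comment convention is easy to forget on new files, and a config without it loses IDE validation without any visible error. One way to keep the convention honest is a small check that could run alongside the schema validation. This is a sketch and not part of the source article: `missing_modeline` is a hypothetical helper, and it assumes the team convention is that the comment is the first non-empty line of each file.

```python
import re
from pathlib import Path

# Matches e.g. "# yaml-language-server: $schema=schema/config_schema.json"
MODELINE = re.compile(r"^#\s*yaml-language-server:\s*\$schema=\S+")


def missing_modeline(paths: list[Path]) -> list[Path]:
    """Return the files whose first non-empty line is not the schema comment."""
    missing = []
    for path in paths:
        lines = path.read_text(encoding="utf-8").splitlines()
        first = next((line for line in lines if line.strip()), "")
        if not MODELINE.match(first):
            missing.append(path)
    return missing
```

A call such as `missing_modeline(sorted(Path("configs").rglob("*.yaml")))` would list the files that need the comment added.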
Caveats
- Schemas catch shape, not semantics. A valid-but-dangerous config (e.g., an empty `accounts: []` that the supertool reads as "all accounts") still passes schema validation. Combine with defensive application logic and patterns/pr-preview-of-cloudformation-changeset for the "what would this actually do?" layer.
- IDE enforcement is advisory. Developers can ignore red squiggles. Only pre-commit and CI are real enforcement.
- Emergency bypass. Hot-fix paths that need to land without CI still can. Permit such bypasses only when strictly necessary.
- Schema complexity can explode. Large schemas with conditional requirements (if/then, allOf/oneOf) become hard to read; developers stop trusting them.
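The "defensive application logic" mentioned in the first caveat can be as small as refusing to interpret an empty selection. The following is a minimal sketch under stated assumptions: `resolve_targets` and `EmptySelectionError` are hypothetical names, and the source does not describe how the supertool itself was hardened.

```python
class EmptySelectionError(ValueError):
    """Raised when a destructive operation receives no explicit targets."""


def resolve_targets(requested: list[str], known: list[str]) -> list[str]:
    """Return the explicit targets, refusing empty or unknown selections."""
    if not requested:
        # Never widen "nothing selected" into "everything selected".
        raise EmptySelectionError("refusing to act: no accounts were selected")
    unknown = sorted(set(requested) - set(known))
    if unknown:
        raise ValueError(f"unknown accounts: {unknown}")
    return list(requested)
```

With this guard, a schema-valid `accounts: []` stops the run with an error instead of expanding to every account, which covers the gap that shape validation leaves open.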
Composes with
- patterns/pr-preview-of-cloudformation-changeset — schema checks catch typos; ChangeSet preview catches semantics. The two layer cleanly.
- patterns/phased-rollout-across-release-channels — a change that passed schema and ChangeSet still rolls out through channels.
- patterns/scream-test-before-destructive-delete — a destructive change that passed the three earlier gates still runs through the scream test.
Seen in
- sources/2024-01-22-zalando-tale-of-metadpata-the-revenge-of-the-supertools: coining article. Second of five `metadpata`-postmortem remediations; partners with the ChangeSet preview at the "validation layer" of the infrastructure-change stack.