CONCEPT Cited by 1 source
jsonschema config validation¶
Definition¶
jsonschema config validation is the practice of declaring a single JSON Schema as the authoritative description of a configuration file's shape, and running that schema against the file at every point it gets edited or loaded — the developer's IDE, a local pre-commit hook, the CI/CD pipeline, and optionally at application startup. One schema, multiple enforcement points, no drift between the shape reviewers expect and the shape the program accepts.
For YAML configs specifically, the de-facto implementation uses JSON Schema (YAML is a JSON superset) plus the Red Hat YAML Language Server to bring schema-driven autocomplete and validation into editors.
Zalando's instantiation¶
From the 2024-01 metadpata postmortem:
*"We have set up jsonschema validation for all our configuration files. All these checks run both locally (thanks to pre-commit hooks) and in the CI/CD pipelines. We also did some small quality of life improvements to enable autocompletion and schema validation in our local IDEs, which mitigates the possibility of typos and errors and is*
```
# yaml-language-server: $schema=schema/config_schema.json
(your config)
```
" Source: sources/2024-01-22-zalando-tale-of-metadpata-the-revenge-of-the-supertools
Three enforcement points with one schema:
- IDE: `# yaml-language-server: $schema=<path>` at the top of the YAML file points the YAML Language Server at the schema. Autocomplete lists valid keys; unknown keys light up red.
- Pre-commit: a local hook runs the schema check against staged YAML; the commit is refused if it fails. (systems/pre-commit)
- CI/CD: the same schema check runs server-side in the pipeline; direct-to-`main` bypasses that skip the local hook are still caught.
Why it matters for metadpata-class bugs¶
The metadpata incident's proximate cause was a YAML field
that didn't match any declared property — metadpata is
not in the schema, metadata is. A schema validator with
additionalProperties: false at the relevant level would
have flagged the file at every enforcement point:
- IDE: red squiggle under `metadpata` as the developer types.
- Pre-commit: `git commit` refused with "property `metadpata` not permitted".
- CI: PR cannot merge because the pipeline fails.
Any one of the three would have prevented the incident. The
stack exists because no single enforcement point is reliably
on: IDE plugins are optional, pre-commit can be skipped with
--no-verify, CI can be disabled on a Friday-night branch
push. Three redundant enforcement points make it hard to
slip a typo through all of them accidentally.
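The unknown-key check described above can be sketched in a few lines. This is a hand-rolled stand-in, not a JSON Schema implementation: it covers only `properties` and `additionalProperties: false`, and the schema contents and the `unknown_keys` helper are hypothetical. A real setup would hand the same schema to a JSON Schema validator library.

```python
import json

# Hypothetical schema: `metadata` is the only declared top-level property.
SCHEMA = json.loads("""
{
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "metadata": {
      "type": "object",
      "additionalProperties": false,
      "properties": {"accounts": {"type": "array"}}
    }
  }
}
""")


def unknown_keys(schema, instance, path="$"):
    """Return paths of keys that `additionalProperties: false` forbids.

    Only `properties` and `additionalProperties: false` are honoured;
    every other JSON Schema keyword is ignored in this sketch.
    """
    if not isinstance(schema, dict) or not isinstance(instance, dict):
        return []
    declared = schema.get("properties", {})
    found = []
    for key, value in instance.items():
        if key in declared:
            # Declared key: descend into its subschema.
            found.extend(unknown_keys(declared[key], value, f"{path}.{key}"))
        elif schema.get("additionalProperties") is False:
            # Undeclared key at a level that forbids extras.
            found.append(f"{path}.{key}")
    return found


typo_config = {"metadpata": {"accounts": ["a-1"]}}  # misspelled key
good_config = {"metadata": {"accounts": ["a-1"]}}

print(unknown_keys(SCHEMA, typo_config))  # ['$.metadpata']
print(unknown_keys(SCHEMA, good_config))  # []
```

Removing `"additionalProperties": false` from the top level makes the first call return an empty list as well, which is the silent-typo case.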
Why one schema (not three)¶
A common anti-pattern: separate validators at each enforcement point, written in different languages or frameworks, drifting out of sync. The failure mode: an edit that passes IDE and pre-commit but fails CI (or vice versa), prompting someone to disable a check "because the checks don't agree". Using JSON Schema — a portable, language-agnostic standard with implementations in every mainstream language — keeps the checks identical.
Prerequisites¶
- A schema actually written. A validator with no schema only catches YAML syntax errors, not field-name typos.
- `additionalProperties: false` at the relevant object levels, to catch unknown keys. Without it, typos add new silent fields.
- Team convention to include the schema comment in every YAML file, or an editor setup that associates schemas to file globs.
- Schema kept in the same repo so edits to the schema and the configs stay atomic.
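The schema-comment convention can itself be checked mechanically. A minimal sketch, assuming the team requires the comment to be the first non-blank line of each file (the source only says "at the top"); the `schema_reference` helper is hypothetical and could be called from the same pre-commit or CI step that runs the schema check.

```python
import re

# Comment format quoted in the source: "# yaml-language-server: $schema=<path>".
MODELINE = re.compile(r"^#\s*yaml-language-server:\s*\$schema=(\S+)\s*$")


def schema_reference(yaml_text):
    """Return the schema path named on the first non-blank line, or None."""
    for line in yaml_text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        match = MODELINE.match(line.strip())
        # Assumed convention: only the first non-blank line counts.
        return match.group(1) if match else None
    return None


with_comment = "# yaml-language-server: $schema=schema/config_schema.json\nmetadata: {}\n"
without_comment = "metadata: {}\n"

print(schema_reference(with_comment))     # schema/config_schema.json
print(schema_reference(without_comment))  # None
```

A file that opens with a `---` document marker or another comment would be reported as missing the reference under this assumption; relax the first-line rule if the team allows that.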
Caveats¶
- Schemas can't catch semantic bugs. A `metadata` object with an empty `accounts: []` that gets interpreted as "all accounts" is valid YAML with a valid shape, so a schema that only describes shape accepts it; the supertool still needs to handle the empty-set case explicitly. Schema validation is necessary but not sufficient.
- IDE validation is advisory only; developers can ignore red squiggles. Only pre-commit and CI are enforcement.
- Schema can lag the code. If the code accepts a new field before the schema declares it, the schema will reject valid configs. Schema maintenance has to be part of the feature PR.
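For the semantic-bug caveat, the explicit empty-set handling has to live in the program, after schema validation has passed. A minimal sketch of one option, refusing an empty list rather than reading it as "all accounts"; `target_accounts` and `EmptyAccountsError` are hypothetical names, and treating the empty list as an error is a design choice assumed here, not something the source prescribes.

```python
class EmptyAccountsError(ValueError):
    """Raised when a config names no accounts at all."""


def target_accounts(metadata):
    """Return the accounts to act on, refusing an empty or missing list.

    An empty list is schema-valid, so the program decides what it means.
    This sketch fails loudly instead of defaulting to every account.
    """
    accounts = metadata.get("accounts")
    if not accounts:
        raise EmptyAccountsError(
            "metadata.accounts is empty or missing; refusing to default to all accounts"
        )
    return list(accounts)


print(target_accounts({"accounts": ["a-1", "a-2"]}))  # ['a-1', 'a-2']
```

If "all accounts" is a legitimate intent, an explicit marker in the config is easier to review than an empty list.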
Seen in¶
- sources/2024-01-22-zalando-tale-of-metadpata-the-revenge-of-the-supertools — coining article. Second of five remediations named in the postmortem; partners with patterns/pr-preview-of-cloudformation-changeset as the "validation tier" below the ChangeSet-preview tier.