PATTERN Cited by 1 source
Startup-time fail-fast on config non-compliance¶
Shape¶
When a service is configured to run under a compliance or security policy that requires its deployment environment to hold a specific external state (an OS-level flag, a kernel feature, a validated library), the service validates that precondition at startup and refuses to start if it fails. The canonical statement, quoted verbatim from Redpanda's description of its FIPS implementation: "Redpanda will log an error and exit if the underlying operating system isn't properly configured." (Source: sources/2025-05-20-redpanda-implementing-fips-compliance-in-redpanda)
    startup:
        if policy = compliance-required:
            check OS state
            check validated modules present + self-tested
            check config coherent (mode=enabled + required paths + OS flag)
            if any check fails:
                log error → exit non-zero
                (do NOT continue in degraded / non-compliant mode)
            else:
                proceed normally
The key property: no silent downgrade. If the service claims to run in compliance mode but the environment can't support it, the service is hard-down rather than silently non-compliant. This is structurally distinct from logging-then-enforcement rollouts: there is no warn-only regime for regulated workloads, so the deployment either meets the boundary or it doesn't.
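The shape above can be sketched in Python. This is a minimal illustration, not any vendor's implementation: the names `run_startup_checks` and `start_or_exit`, and the convention that a check returns `None` on success or a reason string on failure, are assumptions made for the example.

```python
import logging
import sys
from typing import Callable, Optional

log = logging.getLogger("startup")

# Assumed convention: a check returns None on success,
# or a human-readable failure reason.
Check = Callable[[], Optional[str]]


def run_startup_checks(checks: dict[str, Check]) -> list[str]:
    """Run every check and collect all failures, so one log pass shows them all."""
    failures = []
    for name, check in checks.items():
        reason = check()
        if reason is not None:
            failures.append(f"{name}: {reason}")
    return failures


def start_or_exit(compliance_required: bool, checks: dict[str, Check]) -> None:
    """Return only if it is safe to proceed; otherwise log and exit non-zero."""
    if not compliance_required:
        return
    failures = run_startup_checks(checks)
    if failures:
        for failure in failures:
            log.error("compliance precondition failed: %s", failure)
        # No degraded mode: a non-zero exit is the only outcome on failure.
        sys.exit(1)
```

Collecting every failure before exiting is a design choice of this sketch: it lets an operator fix all misconfigurations after one failed start instead of discovering them one restart at a time.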
Why hard-fail, not soft-degrade¶
For regulatory and security-boundary use cases, a running service that believes it's compliant but isn't is worse than a not-running service:
- Audit falsifiability. A cluster labeled compliant that silently falls back to non-approved primitives (for example, after a transient glitch in the OS kernel's crypto state) breaks the compliance guarantee without surfacing the failure. A hard fail pages the on-call; a soft fail produces an audit-time surprise.
- Blast-radius scoping. A startup-time failure is contained: one broker (or node, or cluster) stays down until its configuration is fixed. A runtime compliance breach in a live system may already have processed sensitive data while in breach.
- Trust asymmetry. The cost of a false-negative (non-compliant running and believed compliant) is catastrophically higher than the cost of a false-positive (compliant-capable but refusing to start). The pattern optimises for the catastrophic-cost side.
Canonical instance: Redpanda FIPS startup¶
Redpanda broker startup path when fips_mode: enabled:
- Read fips_mode from redpanda.yaml.
- Load the FIPS OpenSSL module from openssl_module_directory, run power-on self-tests on the validated cryptographic module.
- Check OS-level FIPS state (on RHEL: read from /proc/sys/crypto/fips_enabled or equivalent).
- If any check fails → log error → exit non-zero → systemd restart loop fires pages.
- Otherwise → bind ports and begin serving Kafka protocol.
The broker refuses to hold a half-valid state — the config asserts FIPS compliance; any component that can't satisfy the assertion is a startup failure, not a degradation.
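The OS-level FIPS state check in the startup path can be illustrated with a short Python sketch. It is not Redpanda's code; the function name `os_fips_failure` and its return convention are hypothetical. It relies only on the Linux kernel exposing `1` in `/proc/sys/crypto/fips_enabled` when FIPS mode is on.

```python
from pathlib import Path
from typing import Optional

FIPS_FLAG = Path("/proc/sys/crypto/fips_enabled")


def os_fips_failure(flag_path: Path = FIPS_FLAG) -> Optional[str]:
    """Return None if the kernel reports FIPS mode, else a failure reason."""
    try:
        value = flag_path.read_text().strip()
    except OSError as exc:
        # A missing or unreadable flag counts as a failure, never as "probably fine".
        return f"cannot read {flag_path}: {exc}"
    if value != "1":
        return f"{flag_path} is {value!r}, expected '1'"
    return None
```

Treating an unreadable flag as a failure keeps the check on the safe side of the trust asymmetry: the service refuses to start rather than assuming compliance it cannot verify.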
The three-state fips_mode dial lets operators pick between:
- disabled: no startup check fires.
- enabled: startup check fires; fails-fast on OS misconfig.
- permissive: partial startup check fires (module-layer only); OS layer skipped, non-production.
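One way to model the three states is a lookup from mode to the check layers it runs. The sketch below is an assumption-laden illustration: `FipsMode`, `CHECK_LAYERS`, `layers_for`, and the layer labels `"module"` and `"os"` are hypothetical names, chosen to mirror the description above rather than any real Redpanda API.

```python
from enum import Enum


class FipsMode(Enum):
    DISABLED = "disabled"
    ENABLED = "enabled"
    PERMISSIVE = "permissive"


# Which check layers each mode runs, following the three states described above.
CHECK_LAYERS = {
    FipsMode.DISABLED: frozenset(),
    FipsMode.PERMISSIVE: frozenset({"module"}),
    FipsMode.ENABLED: frozenset({"module", "os"}),
}


def layers_for(raw_mode: str) -> frozenset:
    """Map a config string to check layers; unknown values raise ValueError."""
    return CHECK_LAYERS[FipsMode(raw_mode)]
```

Rejecting an unrecognized mode string, instead of defaulting to `disabled`, applies the same fail-fast idea to the dial itself: a typo in the config cannot silently turn the checks off.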
When to apply the pattern¶
- Regulatory / compliance boundaries (FIPS, HIPAA, PCI-DSS, digital-sovereignty regimes) that require the runtime to hold a specific state.
- Security gates where a broken check produces a confidentiality / integrity violation rather than a latency regression (e.g., signing key present, TLS cert valid, tenant isolation capability enabled).
- Capacity / resource preconditions where partial resource availability is more dangerous than unavailability (e.g., some data-directory filesystems present but not all — serving with half the partitions is worse than not serving).
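The data-directory case in the last bullet can be sketched as an all-or-nothing precondition. The function names below are hypothetical, and the sketch assumes the required directories are given as a plain list of paths.

```python
from pathlib import Path


def missing_data_dirs(required: list[str]) -> list[str]:
    """Return the required directories that are absent, in the order given."""
    return [d for d in required if not Path(d).is_dir()]


def require_all_data_dirs(required: list[str]) -> None:
    """Raise SystemExit with a message unless every directory exists."""
    missing = missing_data_dirs(required)
    if missing:
        # All-or-nothing: never start on a subset of the configured directories.
        raise SystemExit(f"refusing to start, missing data directories: {missing}")
```

Raising `SystemExit` with a string makes the interpreter print the message to stderr and exit with status 1, which is the non-zero exit the pattern calls for.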
When NOT to apply¶
- Non-regulatory capacity degradation. Don't fail-fast a serving cluster when a cache warm-up is slow or a non-critical sidecar is unavailable; prefer graceful degradation.
- Soft compliance claims where a warn-only audit regime is the product. Use logging-then-enforcement instead.
- Bootstrap scenarios where the precondition is expected to be absent on the first-ever start (e.g., secrets not yet provisioned). Gate the check on an explicit environment signal that provisioning should already be complete, not on the bare existence of the precondition.
Composition¶
- With FIPS mode tri-state: the enabled state activates startup fail-fast; the permissive state bypasses it partially and the disabled state bypasses it entirely.
- With validated cryptographic modules: the module's own power-on self-tests compose into the broker's startup check, so a failed module test propagates to a failed broker startup.
- With process supervisors (systemd, runit, Kubernetes container restart policy): the non-zero exit triggers restart loops and eventually a CrashLoopBackOff or degraded-unit state, visible in monitoring.
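The supervisor interaction can be illustrated with a toy restart loop. This is only a sketch of the general behavior, not how systemd or Kubernetes are implemented; `supervise`, `max_restarts`, and `base_delay` are hypothetical names, and the doubling backoff is an assumption of the example.

```python
import subprocess
import sys
import time


def supervise(cmd: list[str], max_restarts: int = 3, base_delay: float = 0.01) -> int:
    """Restart cmd on non-zero exit with doubling backoff; return attempts made."""
    attempts = 0
    delay = base_delay
    while True:
        attempts += 1
        if subprocess.run(cmd).returncode == 0:
            return attempts
        if attempts > max_restarts:
            # Give up and surface the failure, comparable to a degraded unit.
            raise RuntimeError(f"{cmd[0]} failed {attempts} times; giving up")
        time.sleep(delay)
        delay *= 2
```

A child that fails its startup check exits non-zero on every attempt, so the loop ends in the give-up branch. That terminal state is what monitoring should alert on, since restarting cannot fix an environment misconfiguration.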
Seen in¶
- sources/2025-05-20-redpanda-implementing-fips-compliance-in-redpanda (canonical instance). Redpanda broker "will log an error and exit if the underlying operating system isn't properly configured" when fips_mode: enabled is set.
Related¶
- concepts/fips-cryptographic-boundary — the compliance primitive the pattern enforces.
- concepts/fips-mode-tri-state — the config-dial that selects whether the pattern fires.
- concepts/fips-140-validated-cryptographic-module — the substrate whose power-on self-tests compose into startup.
- concepts/logging-vs-enforcement-mode — the contrasting logging-first enforcement shape for progressive rollout.
- patterns/logging-mode-to-enforcement-mode-rollout — the rollout shape with a temporal warn-only phase, structurally distinct from this pattern's "no warn-only regime by design."
- systems/redpanda — canonical wiki instance.