
Startup-time fail-fast on config non-compliance

Shape

When a service is configured to run under a compliance or security policy that requires its deployment environment to hold a specific external state (an OS-level flag, a kernel feature, a validated library being present), the service validates that precondition at startup and refuses to start if it fails. Redpanda's FIPS implementation states the behaviour verbatim: "Redpanda will log an error and exit if the underlying operating system isn't properly configured." (Source: sources/2025-05-20-redpanda-implementing-fips-compliance-in-redpanda)

startup:
  if policy = compliance-required:
    check OS state
    check validated modules present + self-tested
    check config coherent (mode=enabled + required paths + OS flag)
    if any check fails:
      log error → exit non-zero
      (do NOT continue in degraded / non-compliant mode)
  else:
    proceed normally
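
The pseudocode above could be sketched in Python roughly as follows. This is an illustrative sketch under stated assumptions, not Redpanda's implementation: `Check`, `run_startup_checks`, and `start_or_exit` are hypothetical names, and each check is assumed to be a zero-argument callable that returns `None` on success or a failure reason.

```python
import logging
import sys
from typing import Callable, List, Optional, Sequence

log = logging.getLogger("startup")

# Hypothetical convention: a check returns None on success,
# or a human-readable reason string on failure.
Check = Callable[[], Optional[str]]


def run_startup_checks(compliance_required: bool, checks: Sequence[Check]) -> List[str]:
    """Return failure reasons; an empty list means the service may start."""
    if not compliance_required:
        return []  # policy not in force: proceed normally
    failures = []
    for check in checks:
        reason = check()
        if reason is not None:
            failures.append(reason)
    return failures


def start_or_exit(compliance_required: bool, checks: Sequence[Check]) -> None:
    """Log every failed precondition, then exit non-zero before serving."""
    failures = run_startup_checks(compliance_required, checks)
    for reason in failures:
        log.error("startup precondition failed: %s", reason)
    if failures:
        sys.exit(1)  # hard-down: never continue in a degraded mode
```

Running every check before exiting, rather than stopping at the first failure, is a design choice of this sketch: the operator sees all misconfigurations in one log pass instead of one per restart.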

The key property: no silent downgrade. If the configuration claims compliance mode but the environment can't support it, the service is hard-down rather than silently non-compliant. This is structurally distinct from logging-then-enforcement rollouts: there is no warn-only regime for regulated workloads, and the deployment either meets the boundary or it doesn't.

Why hard-fail, not soft-degrade

For regulatory and security-boundary use cases, a running service that believes it's compliant but isn't is worse than a not-running service:

Audit falsifiability. A cluster labeled compliant that silently falls back to non-approved primitives (say, after a transient change in the kernel's crypto state) breaks the compliance guarantee without surfacing the failure. A hard fail pages the on-call engineer; a soft fail produces an audit-time surprise.
Blast-radius scoping. A startup-time failure is contained to one broker (or node, or cluster) refusing to start. A runtime compliance breach in a live system may already have processed sensitive data while out of compliance.
Trust asymmetry. The cost of a false negative (a non-compliant service running and believed compliant) is catastrophically higher than the cost of a false positive (a compliance-capable service refusing to start). The pattern accepts more false positives in order to rule out the catastrophic false negative.

Canonical instance: Redpanda FIPS startup

Redpanda broker startup path when fips_mode: enabled:

Read fips_mode from redpanda.yaml.
Load the FIPS OpenSSL module from openssl_module_directory, run power-on self-tests on the validated cryptographic module.
Check OS-level FIPS state (on RHEL: read from /proc/sys/crypto/fips_enabled or equivalent).
If any check fails → log error → exit non-zero → the systemd restart loop triggers alerts that page the operator.
Otherwise → bind ports and begin serving Kafka protocol.
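
The OS-level step in the sequence above might look like the following Python sketch. It assumes a Linux host where `/proc/sys/crypto/fips_enabled` contains `1` when kernel FIPS mode is on. The function names are hypothetical, and the module self-test is represented by a stand-in callable because loading the validated OpenSSL module is outside the scope of this example. It is not Redpanda's code.

```python
from pathlib import Path
from typing import Callable, List

FIPS_FLAG_PATH = Path("/proc/sys/crypto/fips_enabled")


def os_fips_enabled(flag_path: Path = FIPS_FLAG_PATH) -> bool:
    """True only if the kernel flag file is readable and contains '1'."""
    try:
        return flag_path.read_text().strip() == "1"
    except OSError:
        # A missing or unreadable flag counts as "not enabled": fail closed.
        return False


def enabled_mode_failures(
    module_self_test: Callable[[], bool],  # stand-in for the module's self-tests
    flag_path: Path = FIPS_FLAG_PATH,
) -> List[str]:
    """Collect failures for the strict mode; call this before binding any port."""
    failures = []
    if not module_self_test():
        failures.append("cryptographic module self-test failed")
    if not os_fips_enabled(flag_path):
        failures.append(f"OS FIPS flag not set in {flag_path}")
    return failures
```

Treating an unreadable flag file as a failure keeps the check fail-closed: an environment that cannot prove its state is handled the same way as one that is misconfigured.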

The broker refuses to hold a half-valid state — the config asserts FIPS compliance; any component that can't satisfy the assertion is a startup failure, not a degradation.

The three-state fips_mode dial lets operators pick between:

disabled — no startup check fires.
enabled — startup check fires; fails-fast on OS misconfig.
permissive — partial startup check fires (module-layer only); OS layer skipped, non-production.
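
One way to model the three states listed above is an enum mapped to the check layers each state activates. This is a hedged sketch: the `FipsMode` class, the `CHECKS_BY_MODE` table, and the layer names `"module"` and `"os"` are illustrative assumptions, not identifiers from Redpanda.

```python
from enum import Enum
from typing import FrozenSet


class FipsMode(Enum):
    DISABLED = "disabled"
    ENABLED = "enabled"
    PERMISSIVE = "permissive"


# Hypothetical layer names: "module" = validated-module checks,
# "os" = OS-level FIPS state check.
CHECKS_BY_MODE = {
    FipsMode.DISABLED: frozenset(),                # no startup check fires
    FipsMode.ENABLED: frozenset({"module", "os"}),  # full check, fail fast
    FipsMode.PERMISSIVE: frozenset({"module"}),     # OS layer skipped
}


def checks_for(raw_mode: str) -> FrozenSet[str]:
    """Map a config string to its check layers.

    An unrecognised value raises ValueError instead of being
    silently treated as 'disabled'.
    """
    return CHECKS_BY_MODE[FipsMode(raw_mode)]
```

Rejecting unknown mode strings matters here for the same reason as the pattern itself: a typo such as `enabld` should stop startup rather than quietly select the weakest mode.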

When to apply the pattern

Regulatory / compliance boundaries (FIPS, HIPAA, PCI-DSS, digital-sovereignty regimes) that require the runtime to hold a specific state.
Security gates where a broken check produces a confidentiality / integrity violation rather than a latency regression (e.g., signing key present, TLS cert valid, tenant isolation capability enabled).
Capacity / resource preconditions where partial resource availability is more dangerous than unavailability (e.g., some data-directory filesystems present but not all — serving with half the partitions is worse than not serving).
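
The data-directory case in the last item could be checked along these lines. This is a minimal sketch, assuming the service knows its full list of required directories up front; `missing_data_dirs` is a hypothetical helper name.

```python
import os
from typing import Iterable, List


def missing_data_dirs(required_dirs: Iterable[str]) -> List[str]:
    """Return the required directories that are absent or not writable."""
    return [
        d for d in required_dirs
        if not (os.path.isdir(d) and os.access(d, os.W_OK))
    ]
```

A caller would refuse to start when the returned list is non-empty, rather than serving only the partitions whose directories happen to be present.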

When NOT to apply

Non-regulatory capacity degradation. Don't fail-fast a serving cluster when a cache warm-up is slow or a non-critical sidecar is unavailable; prefer graceful degradation.
Soft compliance claims where a warn-only audit regime is the product. Use logging-then-enforcement instead.
Bootstrap scenarios where the precondition is expected to be absent on the first-ever start (e.g., secrets not yet provisioned). Gate the check on an explicit environment signal that the precondition should already hold, not on the bare existence of the resource.

Composition

With FIPS mode tri-state: the enabled state activates startup fail-fast; the permissive state bypasses it partially (OS layer skipped) and the disabled state bypasses it entirely.
With validated cryptographic modules: the module's own power-on self-tests compose into the broker's startup check — a failed module test propagates to a failed broker startup.
With process supervisors (systemd, runit, the Kubernetes kubelet): the non-zero exit triggers restart loops that eventually surface as CrashLoopBackOff on Kubernetes or a failed unit under systemd, both visible in monitoring.

Seen in

sources/2025-05-20-redpanda-implementing-fips-compliance-in-redpanda — canonical instance. Redpanda broker "will log an error and exit if the underlying operating system isn't properly configured" when fips_mode: enabled is set.

concepts/fips-cryptographic-boundary — the compliance primitive the pattern enforces.
concepts/fips-mode-tri-state — the config-dial that selects whether the pattern fires.
concepts/fips-140-validated-cryptographic-module — the substrate whose power-on self-tests compose into startup.
concepts/logging-vs-enforcement-mode — the contrasting logging-first enforcement shape for progressive rollout.
patterns/logging-mode-to-enforcement-mode-rollout — the rollout shape with a temporal warn-only phase, structurally distinct from this pattern's "no warn-only regime by design."
systems/redpanda — canonical wiki instance.