Skip to content

CONCEPT Cited by 1 source

Firmware config persistence loss

Definition

Firmware config persistence loss is the failure mode in which firmware-level configuration settings — boot order, network-boot interface preferences, secure-boot keys, hardware tunables — are reset to vendor defaults following a firmware upgrade, even when the operator has explicitly customised them.

Verbatim disclosure (Cloudflare 2026-06-01 core boot-time post):

"Configuration settings are often reset following a UEFI firmware upgrade."

Why it happens

UEFI firmware updates frequently:

  1. Reformat NVRAM regions to accommodate new variable layouts in the upgraded firmware.
  2. Drop variables the new firmware doesn't recognise — forward-compatibility is rarely guaranteed.
  3. Reset to "factory defaults" for security hygiene reasons (don't carry stale config across firmware generations).
  4. Vary by OEM — some preserve more, some less, depending on vendor implementation choices.

The fleet operator usually doesn't get to choose the persistence behaviour. "Configuration settings are often reset" — the unstated subtext is "and you can't always predict which ones."

Why this is structural for fleet automation

Any fleet-automation pipeline that:

  • declares a network-boot interface order upfront,
  • sets secure-boot preferences,
  • toggles hardware features (TPM, SR-IOV, VT-d),

…must contend with the fact that the very thing it's trying to accelerate (firmware upgrade) can wipe out the configuration it just set. This creates an automation invariant problem: the configuration must be re-applied after every firmware upgrade, but the firmware upgrade itself doesn't have a "please re-run my customisation" hook.

Mitigation: state validation with auto-reapply

Cloudflare's solution (patterns/state-validation-with-auto-reapply-and-reboot):

"To address these edge cases, we implemented a state validation step. The firmware automation now validates the configuration post-change: if it detects that settings have been modified, it re-applies the config and triggers a reboot."

Three structural pieces:

  1. Post-change validation — read the firmware variables after every firmware operation and compare to the desired declarative state.
  2. Conditional re-apply — if validation fails, apply the declared config again.
  3. Trigger reboot — re-applied firmware config takes effect on the next boot, so the automation triggers one explicitly.

The trade-off: "Although the first boot may take slightly longer, this change drastically reduces the time required for all future start-ups from about 20 minutes to less than a minute per subsequent boot" (Cloudflare 2026-06-01).

Generalisation: configuration-as-code at firmware altitude

The pattern generalises configuration-as-code to firmware. The desired state lives in a release pipeline; the automation drives the live state to match it; firmware upgrades that perturb live state are absorbed by the validation+reapply loop. Cloudflare's endpoint state:

"a single BIOS firmware image serves all SKUs, configuration updates deploy at scale through our existing release pipeline, and the entire workflow operates from iPXE."

Where else this pattern shows up

Layer Persistence-loss instance
UEFI firmware upgrades network-boot order, secure-boot settings reset
Switch/router firmware upgrades line-card configuration sometimes reset
BMC (Baseboard Management Controller) upgrades IPMI users / network config sometimes reset
Disk firmware upgrades smart self-test schedules reset
Hypervisor in-place upgrades per-VM device assignments may reset
Container image upgrades volume-less state reset by definition
Last updated · 542 distilled / 1,571 read