CONCEPT Cited by 1 source
Firmware config persistence loss
Definition
Firmware config persistence loss is the failure mode in which firmware-level configuration settings — boot order, network-boot interface preferences, secure-boot keys, hardware tunables — are reset to vendor defaults following a firmware upgrade, even when the operator has explicitly customised them.
Verbatim disclosure (Cloudflare 2026-06-01 core boot-time post):
"Configuration settings are often reset following a UEFI firmware upgrade."
Why it happens
UEFI firmware updates frequently:
- Reformat NVRAM regions to accommodate new variable layouts in the upgraded firmware.
- Drop variables the new firmware doesn't recognise — forward-compatibility is rarely guaranteed.
- Reset to "factory defaults" for security hygiene reasons (don't carry stale config across firmware generations).
- Vary by OEM — some preserve more, some less, depending on vendor implementation choices.
The fleet operator usually doesn't get to choose the persistence behaviour. "Configuration settings are often reset" — the unstated subtext is "and you can't always predict which ones."
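Because the set of reset settings is hard to predict, one practical way to observe it is to snapshot the settings before and after an upgrade and diff the two. The sketch below is illustrative only: the setting names and values are hypothetical, and it assumes some tool (not shown) has already exported each snapshot as a flat name-to-value mapping.

```python
from typing import Dict, Optional, Tuple

Settings = Dict[str, str]


def diff_settings(
    before: Settings, after: Settings
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (old, new)} for every setting whose value changed.

    None on either side means the setting was absent from that snapshot,
    e.g. a variable the new firmware dropped.
    """
    changed = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Hypothetical snapshots; names and values are placeholders, not real
# firmware attribute names.
before_upgrade = {
    "BootOrder": "pxe-nic0,disk",
    "SecureBoot": "enabled",
    "SRIOV": "enabled",
}
after_upgrade = {
    "BootOrder": "disk,pxe-nic0",  # reset to a vendor default
    "SecureBoot": "enabled",       # preserved
    # "SRIOV" is missing: the variable was dropped
}

drift = diff_settings(before_upgrade, after_upgrade)
for name, (old, new) in drift.items():
    print(f"{name}: {old!r} -> {new!r}")
```

Distinguishing "changed" from "dropped" (a `None` on the new side) matters here, since the two cases map to different causes in the list above.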
Why this is structural for fleet automation
Any fleet-automation pipeline that:
- declares a network-boot interface order upfront,
- sets secure-boot preferences,
- toggles hardware features (TPM, SR-IOV, VT-d),
…must contend with the fact that the very thing it's trying to accelerate (firmware upgrade) can wipe out the configuration it just set. This creates an automation invariant problem: the configuration must be re-applied after every firmware upgrade, but the firmware upgrade itself doesn't have a "please re-run my customisation" hook.
Mitigation: state validation with auto-reapply
Cloudflare's solution (patterns/state-validation-with-auto-reapply-and-reboot):
"To address these edge cases, we implemented a state validation step. The firmware automation now validates the configuration post-change: if it detects that settings have been modified, it re-applies the config and triggers a reboot."
Three structural pieces:
- Post-change validation — read the firmware variables after every firmware operation and compare to the desired declarative state.
- Conditional re-apply — if validation fails, apply the declared config again.
- Trigger reboot — re-applied firmware config takes effect on the next boot, so the automation triggers one explicitly.
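The three pieces can be sketched as a single function. This is a minimal illustration, not Cloudflare's implementation: `read_settings`, `apply_settings`, and `reboot` are hypothetical callables standing in for whatever vendor tooling actually talks to the firmware, and `FakeHost` exists only to make the example runnable.

```python
from typing import Callable, Dict

Settings = Dict[str, str]


def validate_and_reapply(
    desired: Settings,
    read_settings: Callable[[], Settings],
    apply_settings: Callable[[Settings], None],
    reboot: Callable[[], None],
) -> bool:
    """Return True if settings had drifted and were re-applied."""
    # 1. Post-change validation: compare live state to the declared state.
    live = read_settings()
    drifted = {k: v for k, v in desired.items() if live.get(k) != v}
    if not drifted:
        return False
    # 2. Conditional re-apply: write back only what drifted.
    apply_settings(drifted)
    # 3. Trigger reboot so the re-applied config takes effect.
    reboot()
    return True


class FakeHost:
    """In-memory stand-in for a machine (assumption for the demo)."""

    def __init__(self, settings: Settings) -> None:
        self.settings = dict(settings)
        self.reboots = 0

    def read(self) -> Settings:
        return dict(self.settings)

    def apply(self, changes: Settings) -> None:
        self.settings.update(changes)

    def reboot(self) -> None:
        self.reboots += 1


desired = {"BootOrder": "pxe-nic0,disk", "SecureBoot": "enabled"}
# State as it might look right after a firmware upgrade reset the boot order.
host = FakeHost({"BootOrder": "disk,pxe-nic0", "SecureBoot": "enabled"})

first = validate_and_reapply(desired, host.read, host.apply, host.reboot)
second = validate_and_reapply(desired, host.read, host.apply, host.reboot)
print(first, second, host.reboots)
```

The second call finds nothing to fix and does not reboot, which is why the extra cost is paid only on the first boot after a reset.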
The trade-off: "Although the first boot may take slightly longer, this change drastically reduces the time required for all future start-ups from about 20 minutes to less than a minute per subsequent boot" (Cloudflare 2026-06-01).
Generalisation: configuration-as-code at firmware altitude
The pattern generalises configuration-as-code to firmware. The desired state lives in a release pipeline; the automation drives the live state to match it; firmware upgrades that perturb live state are absorbed by the validate-and-reapply loop. Cloudflare describes the resulting end state as:
"a single BIOS firmware image serves all SKUs, configuration updates deploy at scale through our existing release pipeline, and the entire workflow operates from iPXE."
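As a rough illustration of what "desired state in a release pipeline" could look like, the sketch below treats the firmware config as a versioned JSON document and computes the writes needed to converge a host on it. The document layout, the `version` field, and the setting names are all assumptions made for the example; the source does not describe Cloudflare's actual format.

```python
import json
from typing import Dict

# Hypothetical desired-state document, as it might be shipped by a release.
DESIRED_STATE_JSON = """
{
  "version": "example-1",
  "settings": {
    "BootOrder": "pxe-nic0,disk",
    "SecureBoot": "enabled"
  }
}
"""


def plan_writes(desired_doc: str, live: Dict[str, str]) -> Dict[str, str]:
    """Return only the settings that must be written to match the document."""
    desired = json.loads(desired_doc)["settings"]
    return {
        name: value
        for name, value in desired.items()
        if live.get(name) != value
    }


# Live state after a hypothetical upgrade reset the boot order.
live = {"BootOrder": "disk,pxe-nic0", "SecureBoot": "enabled"}

plan = plan_writes(DESIRED_STATE_JSON, live)
live.update(plan)  # stand-in for actually writing the settings
replan = plan_writes(DESIRED_STATE_JSON, live)
print(plan, replan)
```

Planning is idempotent: once the host matches the document, the plan is empty, so the same document can be re-evaluated after every firmware operation without side effects.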
Where else this pattern shows up
| Layer | Persistence-loss instance |
|---|---|
| UEFI firmware upgrades | network-boot order, secure-boot settings reset |
| Switch/router firmware upgrades | line-card configuration sometimes reset |
| BMC (Baseboard Management Controller) upgrades | IPMI users / network config sometimes reset |
| Disk firmware upgrades | SMART self-test schedules reset |
| Hypervisor in-place upgrades | per-VM device assignments may reset |
| Container image upgrades | volume-less state reset by definition |
Related
- systems/uefi
- concepts/network-boot-interface
- patterns/state-validation-with-auto-reapply-and-reboot
- concepts/lazy-loaded-bios-data-structure — sibling failure mode on the read path
- concepts/vendor-string-mismatch-on-fleet-config — sibling failure mode on the cross-vendor identification path