
PATTERN

Preemptive low-severity incident for potential impact

The pattern

Declare a low-severity incident (SEV4 / SEV5) before any customer impact is observed, on the basis of elevated risk from an external event — "in preparation for the worst." The declaration creates a shared coordination channel, documentation surface, and timeline in advance of potential customer harm, so if the harm materialises, incident response is already bootstrapped.

Canonical verbatim (Source: sources/2025-06-20-redpanda-behind-the-scenes-redpanda-clouds-response-to-the-gcp-outage):

"At this point, it was clear that multiple GCP services were experiencing a global outage, despite not having received support tickets from our customers or being paged by Redpanda Cloud alerts. So, in preparation for the worst, we preemptively created a low-severity incident to coordinate the response to multiple potential incidents."

The decision at 19:08 UTC — 27 minutes after being notified of the GCP outage by the GCP TAM, with no customer tickets and no internal alerts — is the load-bearing instance.

Why declare preemptively

  1. Coordination surface from t=0. The incident doc, Slack channel, and commander role are ready before the first real signal arrives — no scramble.
  2. Multi-customer / multi-signal coordination. Where a single potential incident would be handled case-by-case, an N-potential-incident scenario (one per customer or region) needs a shared context to avoid duplicate investigation.
  3. Observability preserved as timeline. Post-incident timeline reconstruction is easier when incident data (chat log, actions taken, decisions made) is captured in real time rather than pieced together afterwards.
  4. Psychological primer. On-call staff shift from routine-ops mode to incident-response mode earlier, which speeds responses if customer impact does materialise.
  5. SEV4 is cheap. Low-severity incidents don't page executives or trigger external communication; the cost of opening one is near-zero.
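Taken together, these benefits amount to bootstrapping the coordination surface up front. A minimal sketch in Python, where the `Incident` record, `declare_preemptive_incident` function, and the channel/doc naming scheme are all illustrative assumptions, not any real tooling:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Incident:
    severity: str
    commander: str
    declared_at: datetime
    channel: str = ""
    doc: str = ""
    timeline: list = field(default_factory=list)

def declare_preemptive_incident(commander: str, reason: str) -> Incident:
    """Open a SEV4 before any customer impact is observed."""
    now = datetime.now(timezone.utc)
    inc = Incident(severity="SEV4", commander=commander, declared_at=now)
    # The coordination surface exists from t=0, before the first real signal:
    # a shared channel, a documentation surface, and a timeline.
    inc.channel = f"#inc-{now:%Y%m%d-%H%M}"
    inc.doc = f"incident-doc-{now:%Y%m%d-%H%M}"
    inc.timeline.append((now, f"Preemptive SEV4 declared: {reason}"))
    return inc

inc = declare_preemptive_incident("on-call-primary", "GCP global outage announced")
```

If impact never materialises, the same object is simply closed at SEV4; nothing was paged and no customer comms were triggered.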

When to declare preemptively

The trigger is elevated probability of customer impact, not confirmed impact. Examples:

  • Cloud-provider global outage announcement. Your dependency tier is affected; downstream customer impact is plausible but not yet observed.
  • Third-party vendor outage of a critical-path dependency (payment gateway, identity provider, DNS).
  • Known-bad deployment in progress. A rollback is underway after smoke-test failure; wider impact is possible.
  • Observable anomaly without confirmed user harm. Error rates are up on internal metrics, but no customer tickets or alerts yet.
  • Regional infrastructure event (power outage, natural disaster, network partition) that might degrade service.
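The decision rule behind these examples can be stated compactly. A hedged sketch, where the event names and the function are assumptions for illustration, not a real API:

```python
# Risk events that justify a preemptive declaration (illustrative taxonomy).
ELEVATED_RISK_EVENTS = {
    "cloud_provider_global_outage",
    "critical_vendor_outage",         # payment gateway, identity provider, DNS
    "known_bad_deployment",
    "internal_anomaly_no_user_harm",  # error rates up, no tickets or alerts yet
    "regional_infrastructure_event",
}

def should_declare_preemptive_sev4(event: str, impact_confirmed: bool) -> bool:
    # Confirmed impact skips this pattern and goes straight to SEV3+;
    # the preemptive SEV4 covers elevated risk without observed harm.
    return event in ELEVATED_RISK_EVENTS and not impact_confirmed

should_declare_preemptive_sev4("cloud_provider_global_outage", impact_confirmed=False)  # → True
```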

Severity-level discipline

The pattern is coupled to a severity ladder where:

  • SEV4 / SEV5 = low-priority, no executive escalation, no customer comms, no paging beyond on-call engineer. Cheap to open, cheap to keep open, cheap to close.
  • SEV3 = confirmed customer impact, investigation underway.
  • SEV2 = confirmed widespread impact, customer comms started.
  • SEV1 = full outage, all-hands response.

A preemptive SEV4 can escalate to SEV3 / SEV2 if the risk materialises. Conversely, it can close at SEV4 with no action needed if the risk dissipates — as in the 2025-06-12 Redpanda instance where the incident closed at SEV4 with no customer impact.
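The ladder and its one-way escalation rule can be modelled as a small sketch (the enum and `next_severity` are illustrative; real taxonomies vary by organisation):

```python
from enum import IntEnum

class Sev(IntEnum):
    SEV1 = 1  # full outage, all-hands response
    SEV2 = 2  # confirmed widespread impact, customer comms started
    SEV3 = 3  # confirmed customer impact, investigation underway
    SEV4 = 4  # low priority: no exec escalation, no customer comms
    SEV5 = 5  # lowest priority

def next_severity(current: Sev, impact_confirmed: bool, widespread: bool) -> Sev:
    """Escalate a preemptive SEV4 only if the risk materialises."""
    if not impact_confirmed:
        return current  # risk has not materialised; stay at (or close as) SEV4
    return Sev.SEV2 if widespread else Sev.SEV3
```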

Variants

  • Watch mode — declare a preemptive SEV4 but take no action; observe, and coordinate only if escalation is needed.
  • Prepositioned mitigation — declare preemptive SEV4 and pre-stage mitigations (e.g., warm up secondary regions, prepare DNS failover) to reduce latency if escalation comes.
  • Customer-facing preemptive status — some orgs post status-page "Investigating" entries on public dashboards during preemptive SEV4, trading transparency for potential false-alarm cost.
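The three variants differ only in what happens at declaration time, which makes them easy to express as modes. A sketch under assumed names (the `declare_preemptive` function, mode strings, and `prestage` callables are all hypothetical):

```python
def declare_preemptive(mode: str, prestage=()) -> dict:
    inc = {"severity": "SEV4", "actions": [], "status_page": None}
    if mode == "watch":
        pass  # observe only; coordinate if escalation is needed
    elif mode == "prepositioned":
        # Pre-stage mitigations now to cut latency if escalation comes.
        inc["actions"] = [step() for step in prestage]
    elif mode == "customer_facing":
        # Trades transparency for potential false-alarm cost.
        inc["status_page"] = "Investigating"
    return inc

inc = declare_preemptive("prepositioned", prestage=[
    lambda: "warmed standby capacity in secondary region",
    lambda: "prepared DNS failover runbook",
])
```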

Anti-patterns

  • Declaring preemptive SEV3 or higher. Higher severities have escalation costs (pages, executive involvement) that are inappropriate for unconfirmed risk; false-alarm tolerance drops rapidly at SEV3+.
  • Failing to close preemptive SEVs. If the risk dissipates and the SEV stays open, alert fatigue sets in and the pattern loses operational discipline.
  • No incident command structure at SEV4. If SEV4 doesn't name a commander, the coordination value of the pattern is lost.
  • No post-incident review for closed preemptive SEVs. Even if impact never materialised, the data from the near-miss (which was the risk, which controls worked, which didn't) is load-bearing for future calibration.
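Three of these anti-patterns are mechanically checkable. An illustrative hygiene check (the record fields and the four-hour staleness threshold are assumptions to be tuned per org):

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(hours=4)  # assumed threshold

def preemptive_sev_issues(inc: dict, now: datetime) -> list[str]:
    """Flag anti-patterns on a preemptive SEV record."""
    issues = []
    if not inc.get("commander"):
        issues.append("no incident commander named")
    if (not inc.get("closed") and not inc.get("impact_confirmed")
            and now - inc["declared_at"] > STALE_AFTER):
        issues.append("stale preemptive SEV: close it or escalate")
    if inc.get("closed") and not inc.get("review_done"):
        issues.append("closed without a post-incident review")
    return issues

now = datetime(2025, 6, 12, 23, 30, tzinfo=timezone.utc)
stale = {"declared_at": datetime(2025, 6, 12, 19, 8, tzinfo=timezone.utc)}
issues = preemptive_sev_issues(stale, now)
```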

Redpanda timeline context

The 19:08 UTC preemptive SEV4 was the first element of a sequenced response:

| Time (UTC) | Event |
| --- | --- |
| 18:41 | GCP TAM notification |
| 18:42 | Impact assessment began |
| 18:43 | Observed degraded monitoring (third-party vendor partial outage) |
| 19:08 | Preemptive SEV4 declared |
| 19:23 | Cloud-marketplace vendor reported issues |
| 19:41 | Google identified root cause |
| 20:26 | Delayed alert notifications arrived |
| 20:56 | Proactive customer outreach began |
| 21:38 | Incident considered mitigated (severity unchanged at SEV4) |

The preemptive declaration bought 78 minutes of preparation time before the first observable impact (the 20:26 delayed alerts). During that window the team was organised rather than scrambling.
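The width of the preparation window follows directly from the timeline:

```python
from datetime import datetime

declared = datetime.fromisoformat("2025-06-12T19:08")      # preemptive SEV4 declared
first_signal = datetime.fromisoformat("2025-06-12T20:26")  # delayed alerts arrive
window = first_signal - declared
print(window)  # 1:18:00, i.e. 78 minutes
```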

Caveats

  • Pattern requires a culture that doesn't penalise false alarms. If SEV4 closures without incident are seen as "crying wolf," teams will stop declaring preemptively and lose the value.
  • Severity taxonomy must exist. SEV4 must be well-defined; some orgs conflate all SEVs into one escalation path.
  • Not a substitute for good monitoring. Preemptive SEVs work best when paired with observability that will confirm real impact — otherwise the team is flying blind.
  • Declaring a preemptive SEV can itself cost attention. If each declaration pages or distracts an engineer, overuse is wasteful.
  • Customer communication policy must be explicit. Preemptive SEVs that leak into customer comms create reputational cost for risk that never materialises.
