PATTERN

E2E test as synthetic probe¶

What this is¶

E2E test as synthetic probe is the pattern of running a small set of browser-driving end-to-end test scenarios on a scheduled cron against live production, alerting the on-call team when a scenario fails. The pattern reuses the e2e test framework (Playwright, Cypress, Selenium) as a periodic external black-box monitor rather than a per-commit regression gate. The test scenarios are scoped to critical customer journeys and treated as a new symptom source for the alerting stack, complementing trace-derived symptom-based alerts.

Why¶

Traditional defenses leave a browser-altitude blind spot:

CI/CD e2e tests run pre-deploy against newly-built code. They don't see regressions driven by external factors — CMS content drift, API-gateway contract drift, third-party outages, CDN cache issues — that only surface in live production.
Service-level monitors (SLO breaches, 5xx rates, trace- derived CBO error rates) don't see front-end interactivity failures when HTTP still returns 200 but hydration crashes or JS errors prevent interaction.
Raw HTTP synthetic probes (Prometheus Blackbox Exporter-style) probe connectivity and content but don't drive a real user journey.

The e2e test probe fills that blind spot: a real browser, driving a real user flow, against real production — will the user actually be able to check out?

Shape (Zalando instantiation)¶

Source: .

Framework: Playwright (chosen over Cypress for auto-wait / auto-retry / tracing / unified browser API / TypeScript).
Scope: three named scenarios at publication, each a CBO:
1. Home → gender page → product click
2. Catalog page → apply filter → product click
3. Product page → select size → add to cart → start checkout
Cadence: 30-minute cron job.
Rollout: email-only shadow mode for several weeks; iterate on selectors + local expect.toPass retries + :visible pseudo-classes until zero false positives; then promote to paging.
Post-promotion reality: "only paged us once, and that was during an incident where the page was actually not working." — 0 % false-positive rate.
Declared growth path: more CBOs, extension to mobile apps.

Preconditions¶

CBO catalog exists. The probe scope should align with explicitly-named critical customer journeys, not the full test matrix. At Zalando these come from the existing Operation-Based SLOs work.
Alerting infrastructure with multi-severity channels. The probe needs an email-only tier for shadow mode and a paging tier for post-validation.
Cron scheduler (Kubernetes CronJob or equivalent).
Debugging artifact capture (Playwright HTML reports, traces, videos). Without these, shadow-mode iteration is blind.

Tensions and failure modes¶

Reliability must exceed the cadence arithmetic. 95 % at 30-min cadence = ~2.4 false positives/day — too noisy for paging. Probe-grade requires ~99.9 %. See concepts/flaky-test for the full arithmetic.
Scope creep re-introduces flakiness. Adding scenarios without independent shadow-mode validation breaks the reliability budget. Discipline: patterns/scenario-minimalism-for-probe-reliability.
Selector coupling to UI refactors. Any non- data-testid selector is a ticking time bomb; broad UI redesigns will flake a probe suite that leans on CSS structure or text content.
Third-party dependency bleed. A probe that touches a third-party service inherits that service's SLO; paging the team for a third-party outage is a bug. Probe scope should be the first-party CBO surface only.
Probe as performance test. Probes are pass/fail, not latency measurements. Using probe run time as a performance signal without an explicit p95 latency threshold is unsupported.

patterns/shadow-mode-alert-before-paging — the mandatory onboarding gate for any new probe.
patterns/scenario-minimalism-for-probe-reliability — the scoping discipline that makes reliability achievable.
patterns/scheduled-cron-triggered-load-test — sibling pattern at the load-test altitude (Zalando 2021-03-01); same cron-triggered-declarative-test shape applied to performance regression rather than correctness.
Client-side black-box probe — concepts/client-side-black-box-probe is the lower- altitude sibling (HTTP / TCP / ICMP); e2e probes extend the concept to the browser / interactivity altitude.

Seen in¶

— canonical wiki instance. Zalando Payments / frontend team deploys three Playwright probe scenarios on a 30- minute cron against zalando.com, promotes to paging after multi-week shadow-mode validation, achieves 0 % false- positive rate. Motivated by a 2024 product-detail-page React-hydration regression that existing monitoring missed. Declared growth path: "more [CBOs] [...] extending this idea to our mobile apps."

concepts/end-to-end-test-probe — primary concept.
concepts/client-side-black-box-probe — sibling lower-altitude primitive.
concepts/critical-business-operation — the scoping unit.
concepts/symptom-based-alerting — the broader alerting strategy the probe feeds.
concepts/flaky-test — the pain class to manage.
concepts/shadow-mode-alert-validation — the onboarding gate.
concepts/test-reliability-through-simplification — the reliability discipline.
patterns/shadow-mode-alert-before-paging — the validation pattern.
patterns/scenario-minimalism-for-probe-reliability — the scoping pattern.
patterns/scheduled-cron-triggered-load-test — sibling at load-test altitude.
systems/playwright · systems/cypress — the candidate framework substrates.