PATTERN Cited by 1 source

Tri-mode opt-in test execution¶

Context¶

A team is adding a new test class (accessibility, visual regression, slow integration tests, flaky-but-useful checks) to an existing CI pipeline. The team wants:

Developers can run it on-demand on their branch when they need feedback.
A scheduled regression run catches drift outside the PR-per-PR cycle.
The check can be enabled on significant PRs, where merging without it carries real risk.

What they don't want:

The cost (time, noise, flakiness) paid by every PR at launch.
Two separate pipelines to maintain.
Branching logic scattered across N CI configs.

The pattern¶

Gate the new test class behind a single default-off environment flag (A11Y_ENABLE, VISUAL_TEST_ENABLE, etc.). Compose three execution modes from the same flag:

Mode	Activation	Trigger
On-demand local	Developer sets flag locally	Manual `A11Y_ENABLE=true pnpm test`
Scheduled nightly	Flag set in scheduled pipeline	Cron-triggered in CI
Opt-in CI gate	Flag set per-PR in CI config	Set by commit-message tag, label, or PR-template opt-in

// Single toggle at the test-framework level; any value other than 'true' leaves it off:
const a11yEnabled = process.env.A11Y_ENABLE === 'true';

// In the fixture helper:
async function runAxeAndSaveViolations(): Promise<void> {
  if (!a11yEnabled) return;  // no-op when disabled
  // ... actually run audit
}

In Buildkite (Slack's CI) the scheduled mode is a daily pipeline that sets A11Y_ENABLE=true and pipes the output into a Slack alert channel.
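The opt-in CI gate in the table can be sketched as a small pure function that decides whether a PR's test step should export the flag. This is a minimal sketch, not Slack's implementation: the `[a11y]` commit tag, the `run-a11y` label, and the names `PrContext`, `shouldOptIn` and `envForPr` are all hypothetical. Only `A11Y_ENABLE` comes from the source.

```typescript
// Hypothetical sketch: every name except A11Y_ENABLE is an assumption.
type Env = Record<string, string | undefined>;

interface PrContext {
  commitMessage: string;
  labels: string[];
}

// Assumed opt-in markers; substitute whatever convention the team agrees on.
const OPT_IN_TAG = '[a11y]';
const OPT_IN_LABEL = 'run-a11y';

// Default-off: only the exact string 'true' enables the checks.
function isFlagEnabled(env: Env, flag: string = 'A11Y_ENABLE'): boolean {
  return env[flag] === 'true';
}

// Opt-in CI gate: a PR opts in through a commit-message tag or a label.
function shouldOptIn(pr: PrContext): boolean {
  return (
    pr.commitMessage.indexOf(OPT_IN_TAG) !== -1 ||
    pr.labels.indexOf(OPT_IN_LABEL) !== -1
  );
}

// Build the environment for the test command. A flag that is already set
// (for example by the scheduled pipeline) is passed through unchanged.
function envForPr(baseEnv: Env, pr: PrContext): Env {
  return shouldOptIn(pr) ? { ...baseEnv, A11Y_ENABLE: 'true' } : { ...baseEnv };
}
```

All three modes end up writing the same variable, so the test code only ever consults `isFlagEnabled` and stays unaware of which mode activated it.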

Why default-off with three modes¶

Default-off respects baseline PR time. No PR pays the cost unless opted in.
Local on-demand lets developers iterate without CI round-trips. Flakiness feedback is fastest locally.
Scheduled regression catches drift from landed PRs — the set of accessibility bugs that enter the codebase between full sweeps.
Opt-in CI gate is the hedge for high-risk PRs — where the team knows the blast radius justifies the test cost.
Single flag keeps config simple — no N-environment matrix, no pipeline variant explosion.

Graduation path¶

A new test class typically graduates through this ladder:

Local-only / on-demand — team builds and validates the checks, removes false positives.
Nightly scheduled — add daily regression run; build operational confidence in the signal.
Opt-in CI — high-risk PRs (release candidates, refactors of load-bearing code) enable the flag.
Default-on non-blocking — every PR runs the checks, violations surface but don't block merges.
Default-on blocking (scoped) — a small focused subset (e.g. Slack's planned "small blocking test suite ... dedicated to the flows of core features ... with a focus on keyboard navigation") blocks merges.

The flag lets the team progress through this ladder without structural changes — the same code, just different activation policy.
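One way to picture "same code, different activation policy" is a lookup from the current ladder stage to the flag value each trigger receives. The sketch below is illustrative only: the stage names paraphrase the ladder above, and `flagValueFor`, `isBlocking` and the assumption that local runs stay opt-in at every stage are not taken from the source.

```typescript
// Hypothetical sketch of the graduation ladder as an activation policy.
type Stage =
  | 'local-only'
  | 'nightly'
  | 'opt-in-ci'
  | 'default-on-non-blocking'
  | 'default-on-blocking-scoped';

type Trigger = 'local' | 'scheduled' | 'pull-request';

// Ladder order, lowest rung first.
const LADDER: Stage[] = [
  'local-only',
  'nightly',
  'opt-in-ci',
  'default-on-non-blocking',
  'default-on-blocking-scoped',
];

// True when `stage` is at or above `rung` on the ladder.
function atLeast(stage: Stage, rung: Stage): boolean {
  return LADDER.indexOf(stage) >= LADDER.indexOf(rung);
}

// Should the flag be 'true' for this trigger at this stage?
// `optedIn` means a person explicitly asked for the checks.
function flagValueFor(stage: Stage, trigger: Trigger, optedIn: boolean): boolean {
  switch (trigger) {
    case 'local':
      return optedIn; // assumption: local runs remain opt-in at every stage
    case 'scheduled':
      return atLeast(stage, 'nightly');
    case 'pull-request':
      if (atLeast(stage, 'default-on-non-blocking')) return true;
      return stage === 'opt-in-ci' && optedIn;
  }
}

// Only the final rung blocks merges, and only for the scoped core-flow subset.
function isBlocking(stage: Stage, inCoreSubset: boolean): boolean {
  return stage === 'default-on-blocking-scoped' && inCoreSubset;
}
```

Under this model, moving up a rung means changing the single `stage` value while the fixture code that reads the flag stays untouched.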

Composes with other rollout levers¶

patterns/severity-gated-violation-reporting — reduces the output volume; tri-mode reduces the invocation surface. Orthogonal.
patterns/exclusion-list-for-known-issues-and-out-of-scope-rules — reduces the audit input; tri-mode gates whether the audit runs at all. Orthogonal.

Failure modes¶

Flag drift across configs. If the flag is set in some pipeline files but not others, coverage is unpredictable. Mitigation: grep the CI configs for the flag name during reviews.
Forgetting to enable. On-demand mode is self-service, so developers can simply forget to set the flag. Mitigation: editor plugin or local-dev alias that reminds them.
Scheduled run drift. If the nightly run breaks and the team doesn't notice (because it's non-blocking), the scheduled signal decays silently. Mitigation: alert on nightly pipeline failure separately from violation reporting.
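Two of these mitigations are mechanical enough to sketch. The first helper mimics the "grep for the flag name" review step over config contents held in memory. The second keeps "the nightly pipeline broke" distinct from "the audit ran and found violations", so each can alert separately. All names here (`findFlagDrift`, `classifyNightly`, `NightlyRun`) are hypothetical, and how config files are loaded or alerts are sent is left out.

```typescript
// Hypothetical sketch: helper names and data shapes are assumptions.

interface DriftReport {
  withFlag: string[];    // config paths that mention the flag
  withoutFlag: string[]; // config paths that never mention it
}

// Split CI config files (path -> file content) by whether they mention the flag.
function findFlagDrift(configs: Record<string, string>, flag: string): DriftReport {
  const report: DriftReport = { withFlag: [], withoutFlag: [] };
  for (const path of Object.keys(configs).sort()) {
    const mentionsFlag = configs[path].indexOf(flag) !== -1;
    (mentionsFlag ? report.withFlag : report.withoutFlag).push(path);
  }
  return report;
}

type NightlyOutcome = 'clean' | 'violations' | 'pipeline-broken';

interface NightlyRun {
  auditCompleted: boolean; // false when the job crashed before finishing the audit
  violationCount: number;
}

// A run that never completed is its own outcome, never a "clean" one.
function classifyNightly(run: NightlyRun): NightlyOutcome {
  if (!run.auditCompleted) return 'pipeline-broken';
  return run.violationCount > 0 ? 'violations' : 'clean';
}
```

A reviewer would scan `withoutFlag` for pipelines that were expected to set the flag, and a `'pipeline-broken'` outcome would go to an alert separate from the one that reports violations.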

Generalisation¶

Applies to any new test class being rolled out to an existing CI pipeline:

Integration test suite that's slow but high-signal.
Visual regression snapshots that are flaky until baselined.
Load tests that are expensive to run per-PR.
Security scans that take minutes per run.
Mutation tests that are too slow to run per-PR.

Canonical rule: one flag, three modes, default-off, graduation path from opt-in to default-on.

Seen in¶

sources/2025-01-07-slack-automated-accessibility-testing-at-slack — Slack's A11Y_ENABLE flag gates Axe checks. "By default, we set the flag to false, preventing unnecessary runs." Three explicit modes: on-demand local, scheduled Buildkite nightly (pipes into Slack alert channel), optional CI gate for significant changes.

systems/buildkite — the CI substrate hosting Slack's scheduled mode.
patterns/severity-gated-violation-reporting — composes.
patterns/exclusion-list-for-known-issues-and-out-of-scope-rules — composes.
patterns/a11y-checks-via-playwright-fixture-extension — the code-level integration surface the flag gates.