
PATTERN Cited by 1 source

Tri-mode opt-in test execution

Context

A team is adding a new test class (accessibility, visual regression, slow integration tests, flaky-but-useful checks) to an existing CI pipeline. The team wants:

  • Developers can run it on demand on their branch when they need feedback.
  • A scheduled regression run catches drift outside the PR-per-PR cycle.
  • The check is optionally enabled on significant PRs where the risk of merging without it matters.

What they don't want:

  • The cost (time, noise, flakiness) paid by every PR at launch.
  • Two separate pipelines to maintain.
  • Branching logic scattered across N CI configs.

The pattern

Gate the new test class behind a single default-off environment flag (A11Y_ENABLE, VISUAL_TEST_ENABLE, etc.). Compose three execution modes from the same flag:

Mode              | Activation                     | Trigger
------------------|--------------------------------|-------------------------------------------------------
On-demand local   | Developer sets flag locally    | Manual: A11Y_ENABLE=true pnpm test
Scheduled nightly | Flag set in scheduled pipeline | Cron-triggered in CI
Opt-in CI gate    | Flag set per-PR in CI config   | Set by commit-message tag, label, or PR-template opt-in

// Single toggle at the test-framework level. Only the exact string 'true'
// enables the checks, so an unset flag leaves them off.
const a11yEnabled = process.env.A11Y_ENABLE === 'true';

// In the fixture helper:
async function runAxeAndSaveViolations() {
  if (!a11yEnabled) return;  // no-op when disabled
  // ... actually run audit
}

In Buildkite (Slack's CI) the scheduled mode is a daily pipeline that sets A11Y_ENABLE=true and pipes the output into a Slack alert channel.
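
The opt-in CI gate needs some step that turns a commit-message tag or PR label into the flag. The following is a minimal sketch of that decision, not Slack's implementation: the `[a11y]` tag, the `a11y` label, and the `resolveA11yFlag` helper are all hypothetical names chosen for illustration. A CI step could export `A11Y_ENABLE=true` whenever the function returns true.

```javascript
// Hypothetical opt-in markers. The source only says "commit-message tag, label,
// or PR-template opt-in" and does not name them.
const OPT_IN_TAG = '[a11y]';
const OPT_IN_LABEL = 'a11y';

// Decide whether a build should run with A11Y_ENABLE=true.
// A flag already set to 'true' (local or scheduled run) short-circuits;
// otherwise the build opts in only through a tag or label.
function resolveA11yFlag({ env = {}, commitMessage = '', labels = [] } = {}) {
  if (env.A11Y_ENABLE === 'true') return true;
  if (commitMessage.toLowerCase().includes(OPT_IN_TAG)) return true;
  return labels.includes(OPT_IN_LABEL);
}
```

With no marker and no flag the function returns false, which preserves the default-off behaviour for ordinary PRs.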

Why default-off with three modes

  1. Default-off respects baseline PR time. No PR pays the cost unless opted in.
  2. Local on-demand lets developers iterate without CI round-trips. Flakiness feedback is fastest locally.
  3. Scheduled regression catches drift from landed PRs — the set of accessibility bugs that enter the codebase between full sweeps.
  4. Opt-in CI gate is the hedge for high-risk PRs — where the team knows the blast radius justifies the test cost.
  5. Single flag keeps config simple — no N-environment matrix, no pipeline variant explosion.

Graduation path

A new test class typically graduates through this ladder:

  1. Local-only / on-demand — team builds and validates the checks, removes false positives.
  2. Nightly scheduled — add daily regression run; build operational confidence in the signal.
  3. Opt-in CI — high-risk PRs (release candidates, refactors of load-bearing code) enable the flag.
  4. Default-on non-blocking — every PR runs the checks, violations surface but don't block merges.
  5. Default-on blocking (scoped) — a small focused subset (e.g. Slack's planned "small blocking test suite ... dedicated to the flows of core features ... with a focus on keyboard navigation") blocks merges.

The flag lets the team progress through this ladder without structural changes — the same code, just different activation policy.
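
One way to picture "same code, different activation policy" is a small lookup keyed by ladder stage. This is a sketch under assumptions: the stage names and the `activationFor` helper are illustrative labels for the five rungs, not identifiers from the source, and the scoping of the blocking subset is left out.

```javascript
// Illustrative stage names for the five rungs of the ladder.
const STAGES = {
  'local-only':          { defaultOn: false, blocking: false },
  'nightly':             { defaultOn: false, blocking: false },
  'opt-in-ci':           { defaultOn: false, blocking: false },
  'default-on-advisory': { defaultOn: true,  blocking: false },
  'default-on-blocking': { defaultOn: true,  blocking: true  },
};

// Work out whether the checks run and whether violations should fail the build.
function activationFor(stage, env = {}) {
  const policy = STAGES[stage];
  if (!policy) throw new Error(`Unknown stage: ${stage}`);
  const optedIn = env.A11Y_ENABLE === 'true';
  const run = policy.defaultOn || optedIn;
  return { run, blocking: run && policy.blocking };
}
```

The first three stages share the same policy and differ only in who sets the flag (a developer, the scheduled pipeline, or a PR opt-in). Moving up the ladder changes one table entry rather than the test code.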

Composes with other rollout levers

Failure modes

  • Flag drift across configs. If the flag is set in some pipeline files but not others, coverage is unpredictable. Mitigation: during review, grep the CI configuration for the flag name and confirm that each pipeline meant to set it actually does.
  • Forgotten-to-enable. On-demand mode is self-service, but developers forget. Mitigation: editor plugin or local-dev alias that reminds them.
  • Scheduled run drift. If the nightly run breaks and the team doesn't notice (because it's non-blocking), the scheduled signal decays silently. Mitigation: alert on nightly pipeline failure separate from violation surfacing.
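
Two of these mitigations are simple enough to sketch. The helpers below are assumptions for illustration, not part of the source: `auditFlagUsage` lists which pipeline configs mention the flag, and `classifyNightlyRun` separates a broken nightly run from one that finished and found violations. The `completed` and `violationCount` fields are hypothetical values that a reporting step would have to supply.

```javascript
// Split pipeline configs by whether they mention the flag at all.
// `configs` maps a file name to its text; loading the files is left to the caller.
function auditFlagUsage(configs, flagName = 'A11Y_ENABLE') {
  const mentions = [];
  const silent = [];
  for (const [file, text] of Object.entries(configs)) {
    (text.includes(flagName) ? mentions : silent).push(file);
  }
  return { mentions: mentions.sort(), silent: silent.sort() };
}

// Separate "the nightly run itself broke" from "the run finished and found violations".
function classifyNightlyRun({ completed, violationCount = 0 }) {
  if (!completed) return 'pipeline-failure'; // alert on this independently of violations
  return violationCount > 0 ? 'violations' : 'clean';
}
```

A reviewer can compare the `mentions` list against the pipelines that are supposed to set the flag, and the nightly job can route `'pipeline-failure'` to a different alert than `'violations'`.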

Generalisation

Applies to any new test class being rolled out to an existing CI pipeline:

  • Integration test suite that's slow but high-signal.
  • Visual regression snapshots that are flaky until baselined.
  • Load tests that are expensive to run per-PR.
  • Security scans that take minutes per run.
  • Mutation tests that are too slow to run on every PR.

Canonical rule: one flag, three modes, default-off, graduation path from opt-in to default-on.
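
To show how the rule carries over to other test classes, here is a minimal sketch of a reusable gate. `makeGate` is a hypothetical helper, not something the source describes; it assumes a Node.js environment and the same convention that only the string `'true'` turns a flag on.

```javascript
// Build a default-off gate for any test class from a single environment flag.
function makeGate(flagName, env = process.env) {
  const enabled = env[flagName] === 'true';
  // Wrap a check so that it becomes a no-op (returning undefined) when the flag is off.
  // When the flag is on, the wrapper returns whatever the check returns, promises included.
  return (check) => (...args) => (enabled ? check(...args) : undefined);
}
```

For example, `const visualGate = makeGate('VISUAL_TEST_ENABLE')` followed by `const compareSnapshots = visualGate(async (page) => { /* ... */ })` would give visual regression checks the same three activation modes with no extra branching in the tests.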

Seen in

  • sources/2025-01-07-slack-automated-accessibility-testing-at-slack — Slack's A11Y_ENABLE flag gates Axe checks. "By default, we set the flag to false, preventing unnecessary runs." Three explicit modes: on-demand local, scheduled Buildkite nightly (pipes into Slack alert channel), optional CI gate for significant changes.