CONCEPT

Experimentation Evolution Model (Fabijan et al.)

Definition

The Experimentation Evolution Model is a four-stage maturity framework for online controlled experimentation in software organisations, introduced in Fabijan et al., ICSE 2017 (The Evolution of Continuous Experimentation in Software Product Development). The stages are Crawl → Walk → Run → Fly (Zalando's 2021 retrospective uses the first three).

Each stage describes:
- Scale of experimentation (number of tests, teams involved).
- Technical capabilities of the platform.
- Statistical rigor of analyses.
- Organisational maturity: who owns experiment design, how test results feed into decisions, and whether an experimentation culture exists at all.

Why the model matters

Before this framework, engineering orgs had no shared language for describing where they were on the experimentation-maturity curve versus peers. The model lets teams locate themselves, anticipate the next set of problems (scalability → trustworthiness → advanced methods), and sequence investments in the platform accordingly.

The stages

Crawl: the first experiments are ad hoc, run team by team, and often manual. There is no central platform. Quality cannot be verified, and there is no visibility into whether teams even run A/B tests before making decisions. At Zalando this phase ran through 2015.
Walk: a centralised platform emerges and the number of A/B tests grows rapidly. Challenges shift to scalability (concurrent-test load on the analysis system) and trustworthiness (data quality, design audits, the right tool for each use case). At Zalando this phase ran 2016–2020 and required rebuilding the analysis system on Apache Spark, handling sample ratio mismatch, introducing an A/B-test design audit process, and adding guidelines for quasi-experimental methods.
Run: an experimentation culture is established across the organisation, and the infrastructure is scalable and trustworthy. The team can now invest in advanced methods: variance reduction, Bayesian analysis, multi-armed bandits, automated data quality indicators, guidance on the overall evaluation criterion (OEC), more sophisticated data visualisation, and stable-unit-assumption improvements (cross-device identity resolution). Zalando entered this phase around 2020.
Fly: (in the original paper) experimentation is autonomous across the organisation; thousands of tests run in parallel with full platform support and near-zero analyst intervention. Zalando does not claim to have reached Fly in the 2021 post.
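
The Walk stage lists sample ratio mismatch handling among its trustworthiness measures. As an illustration only, the following is a minimal sketch of such a check for a two-variant test, assuming a chi-square goodness-of-fit test with one degree of freedom. The function names (`srm_p_value`, `has_srm`) and the `alpha=0.001` threshold are hypothetical choices for this example, not taken from Zalando's platform or from the paper.

```python
import math


def srm_p_value(control_count: int, treatment_count: int,
                expected_control_share: float = 0.5) -> float:
    """P-value of a chi-square goodness-of-fit test on the traffic split.

    Assumption: exactly two variants, so the test has one degree of freedom.
    """
    if not 0.0 < expected_control_share < 1.0:
        raise ValueError("expected_control_share must be strictly between 0 and 1")
    total = control_count + treatment_count
    if total <= 0:
        raise ValueError("at least one unit must be assigned")

    # Counts we would expect if assignment followed the configured split.
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control

    chi2 = ((control_count - expected_control) ** 2 / expected_control
            + (treatment_count - expected_treatment) ** 2 / expected_treatment)

    # For one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def has_srm(control_count: int, treatment_count: int,
            expected_control_share: float = 0.5,
            alpha: float = 0.001) -> bool:
    """Flag a likely sample ratio mismatch.

    alpha is an illustrative threshold (an assumption of this sketch).
    """
    return srm_p_value(control_count, treatment_count,
                       expected_control_share) < alpha
```

With an intended 50/50 split, `has_srm(10_000, 10_100)` returns `False`, while `has_srm(50_000, 51_500)` returns `True`: the same kind of imbalance becomes statistically implausible once the sample is large enough. A test flagged this way would typically be investigated before its results are trusted.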

Zalando's application

Zalando's 2021 evolution retrospective (Part 1) is structured explicitly around crawl / walk / run. Each phase names the dominant challenge and how Octopus, Zalando's experimentation platform, responded (Source: ).

Seen in

— full crawl/walk/run walkthrough at Zalando