PATTERN
Production code as submodule for simulation¶
Problem¶
A simulation / back-testing framework needs to exercise the actual production code path — not a re-implementation — so that results generalise to production behaviour and bugs caught in simulation are real bugs in the production code.
Re-implementing the production logic inside the simulator creates a second source of truth that drifts from production over time and undermines confidence that simulation wins translate into production wins. Monkey-patching at runtime is fragile and hides what's actually being tested.
Solution¶
Include the production code repository (or repositories) in the simulation project as a Git Submodule, pointing at whichever branch is under test. The simulator invokes the production code via normal imports; the submodule pin is how you switch between "production code" and "proposed Algorithm X on a branch".
To test Algorithm X:
- Create a feature branch in the production repo.
- Point the simulator's submodule at that branch.
- Run the simulation — the simulator imports and calls the alternative code via the submodule.
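The branch-pointing step above can be scripted. A minimal sketch in Python, assuming the standard git CLI is available; the submodule path `production/budgeting` is an illustrative placeholder, not a path from Yelp's repositories:

```python
import subprocess

def pin_submodule_cmds(submodule_path: str, branch: str) -> list[list[str]]:
    """Build the git commands that point a submodule at a branch tip.

    This is a generic git recipe, not Yelp's tooling: fetch the branch,
    check out its tip inside the submodule, then stage the new pointer
    in the superproject.
    """
    return [
        # Fetch the latest commits for the branch from the submodule's remote.
        ["git", "-C", submodule_path, "fetch", "origin", branch],
        # Move the submodule working tree to the branch tip.
        ["git", "-C", submodule_path, "checkout", f"origin/{branch}"],
        # Record the new submodule pointer in the simulator repo.
        ["git", "add", submodule_path],
    ]

def pin_submodule(submodule_path: str, branch: str) -> None:
    """Run the commands; committing the staged pointer is left to the caller."""
    for cmd in pin_submodule_cmds(submodule_path, branch):
        subprocess.run(cmd, check=True)
```

Committing the staged pointer afterwards is what makes the run reproducible: the candidate branch is now pinned by commit, not by name.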
The submodule pin is part of the simulation configuration, so simulation runs are reproducible (the exact commit of production code is pinned) and comparable across candidates (multiple submodule pointers map to multiple candidates).
Canonical instance — Yelp Back-Testing Engine¶
Yelp's Back-Testing Engine (2026-02-02) includes the Budgeting and Billing production repositories as Git Submodules. Verbatim:
"To support accurate back-testing, our Engine uses the same code as production by including key repositories (like Budgeting and Billing) as Git Submodules. This lets us simulate current logic or proposed changes by pointing to specific Git branches. For example, to test a new budgeting algorithm, we add it on a separate branch, configure the Back-Testing Engine to use that branch, and run simulations."
The Engine's simulation loop calls into the Budgeting submodule at the beginning of each simulated day (to compute daily budget + split) and into the Billing submodule at the end of each simulated day (to compute billing from simulated outcomes). The candidate's parameters are passed through to both submodules exactly as they are in production.
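That daily loop might look like the following sketch. `compute_daily_budget` and `compute_billing` are invented stand-ins for whatever entry points the Budgeting and Billing submodules actually expose; the point is the shape of the loop, with candidate parameters passed straight through:

```python
def compute_daily_budget(params: dict, day: int) -> float:
    # Hypothetical stand-in for the Budgeting submodule's entry point.
    return params["monthly_budget"] / 30

def compute_billing(params: dict, spend: float) -> float:
    # Hypothetical stand-in for the Billing submodule's entry point.
    return min(spend, params["monthly_budget"])

def simulate(params: dict, days: int, spend_model) -> list[dict]:
    """One simulated campaign: budgeting at the start of each simulated
    day, billing at the end, candidate params untouched in between."""
    ledger = []
    for day in range(days):
        budget = compute_daily_budget(params, day)  # start of simulated day
        spend = spend_model(day, budget)            # simulated outcomes
        billed = compute_billing(params, spend)     # end of simulated day
        ledger.append({"day": day, "budget": budget, "billed": billed})
    return ledger

# Usage: a trivial spend model that spends exactly the daily budget.
ledger = simulate({"monthly_budget": 300.0}, days=3,
                  spend_model=lambda day, budget: budget)
```

In the real Engine the two stand-ins would be imports from the pinned submodules, so swapping the submodule pointer swaps the logic without touching the loop.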
Why this is more than "shared library"¶
The pattern differs from a normal library dependency in three ways that matter for simulation:
- Branch-pointer granularity — a library pinned by version can't easily test "what if we change this function?" Submodules can point at any commit, including feature branches that aren't released.
- Multiple simultaneous pointers — different simulation runs can point at different submodule commits in parallel. A library dependency can't be two versions at once.
- Tight commit-level coupling — the simulator's commit SHA plus the submodule SHA together form a reproducible simulation artifact, which matters for auditing "which version of the code ran this simulation".
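Recording that commit pair is a few lines. A sketch assuming the git CLI; the artifact keys and repository paths are illustrative:

```python
import subprocess

def git_sha(repo_path: str) -> str:
    """Resolve HEAD of a checkout to its full commit SHA."""
    out = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()

def simulation_artifact(simulator_repo: str, submodules: dict[str, str]) -> dict:
    """Pair the simulator SHA with each submodule SHA so a run can be
    reproduced exactly -- the audit trail this pattern provides."""
    return {
        "simulator": git_sha(simulator_repo),
        "submodules": {name: git_sha(path) for name, path in submodules.items()},
    }
```

Stamping this dict onto every simulation result makes "which code ran this" a lookup rather than an investigation.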
Operational implications¶
- The production code must be importable outside production. This forces a level of dependency hygiene (no hard-wired config paths, no prod-only database connections at import time). Yelp's post implies Budgeting and Billing are well-factored enough to import cleanly.
- The submodule has to run with simulation-sourced inputs. If the production code expects messages from Kafka, the simulation must supply Kafka-like inputs; this is a non-trivial contract.
- Schema evolution matters — if the submodule branch changes the function signature, the simulator has to be updated to match. Yelp's post mentions configuring the Engine to use the new branch; callers have to stay in sync.
- Side effects have to be neutered — production code that writes to real databases or sends real emails can't do that in simulation. Typically handled by dependency injection of a stub I/O layer.
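The last point is commonly handled with constructor or parameter injection. A generic sketch; the `BillingWriter` protocol and all names here are invented for illustration, not Yelp's API:

```python
from typing import Protocol

class BillingWriter(Protocol):
    def write(self, advertiser_id: str, amount: float) -> None: ...

class ProdBillingWriter:
    """In production this would write to the real billing database."""
    def write(self, advertiser_id: str, amount: float) -> None:
        raise RuntimeError("real side effect -- never reached in simulation")

class InMemoryBillingWriter:
    """Simulation stub: records writes instead of performing them."""
    def __init__(self) -> None:
        self.rows: list[tuple[str, float]] = []
    def write(self, advertiser_id: str, amount: float) -> None:
        self.rows.append((advertiser_id, amount))

def bill_advertiser(advertiser_id: str, amount: float,
                    writer: BillingWriter) -> None:
    # The production logic stays identical; only the injected writer changes.
    writer.write(advertiser_id, amount)

# Usage: the simulator injects the stub and can inspect what "would" be billed.
stub = InMemoryBillingWriter()
bill_advertiser("a1", 12.5, stub)
```

Because the production function only sees the protocol, the same submodule code runs unmodified under both writers, which is exactly what the pattern requires.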
Reported benefits (from Yelp)¶
- Faster productionization — "blurs the line between prototyping and production, streamlining our workflows." No translation from prototype to prod; the prototype is the prod branch.
- Improved collaboration — "Scientists and engineers can now work side-by-side with production code, turning experiments into reusable, production-ready artifacts, rather than disconnected notebooks."
- Early bug detection — "Running simulations across a broad set of real data helps us catch code bugs or edge cases that would be tricky to find with unit tests alone."
When it's appropriate¶
- Simulation / back-testing / benchmark harnesses where fidelity to production matters (not generic experimentation).
- Multi-repo setups where the "code under test" is a well-bounded module.
- Teams that already use branches as the unit of proposed change.
When to avoid:
- Production code has tangled side effects or tight infra coupling that makes it hard to import standalone.
- Simulations need to test many modifications simultaneously (dozens of branches) — branch management becomes the bottleneck.
- The "production code" is a monolith too large to include as a submodule.
Seen in¶
- sources/2026-02-02-yelp-back-testing-engine-ad-budget-allocation — Yelp Back-Testing Engine; canonical wiki instance.