PATTERN Cited by 1 source

Build-time tech-debt detection¶

Definition¶

Run static-analysis rules in every CI build of every repo in the fleet, treating rule violations not (only) as build-failure gates but as measurable, queryable, prioritizable signals about technical debt across the codebase. The dashboard of rule violations becomes the canonical map of what's broken, where, and how badly.

The wiki's first canonical instance is Netflix's Nebula ArchRules deployment — 358 rules × 5,000 repos × ~1M issues — "allow[ing] us to quickly gain insight into our large fleet of microservices, and identify the areas carrying the most critical technical debt."

When to use¶

You have enough scale that ad-hoc tech-debt tracking (Jira tickets, retrospective lists) misses too much.
The tech debt has machine-checkable signatures — deprecated-API usage, security-CVE callsites, prohibited-library imports, naming-convention violations.
You have CI infrastructure that can run static analysis on every build.
You have dashboard infrastructure to aggregate violations across repos.

The pattern¶

Rules emit measurable signals¶

Each rule produces structured violation data:

Rule identifier
Severity (Low / Medium / High)
Repo + class + method + line
Plain-English description of the violation
Pointer to the relevant code
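
A minimal sketch of how one such record might be represented, in Python for brevity. The `Violation` type, its field names, the `pointer()` format, and the sample values are all hypothetical illustrations of the fields listed above; they are not the schema Nebula ArchRules actually emits.

```python
import json
from dataclasses import asdict, dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Ordered so that comparisons such as `sev >= Severity.HIGH` work.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Violation:
    # Hypothetical field names mirroring the list above.
    rule_id: str
    severity: Severity
    repo: str
    class_name: str
    method: str
    line: int
    description: str

    def pointer(self) -> str:
        # Assumed pointer format: repo:class#method:line
        return f"{self.repo}:{self.class_name}#{self.method}:{self.line}"

    def to_json(self) -> str:
        # Serialize severity by name so the record stays readable downstream.
        record = asdict(self)
        record["severity"] = self.severity.name
        record["pointer"] = self.pointer()
        return json.dumps(record, sort_keys=True)


# Made-up sample data, for illustration only.
example = Violation(
    rule_id="no-deprecated-client",
    severity=Severity.HIGH,
    repo="payments-service",
    class_name="com.example.PaymentClient",
    method="charge",
    line=42,
    description="Calls a deprecated client API; migrate to the replacement.",
)
print(example.to_json())
```

Keeping severity as an ordered enum makes both threshold checks and dashboard sorting a plain comparison.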

"Note that failure details feature a detailed plain English description, along with a pointer to the exact line of code in violation." — sources/2026-05-08-netflix-scaling-archunit-with-nebula-archrules

Build outcomes are tiered¶

High-priority rules → build fails (in Netflix's case, configurable per-repo via failure-threshold).
Medium / Low rules → reported but don't fail builds.
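
The tiering can be sketched as a single threshold check. This is an illustration in Python, not the Nebula plugin's implementation: the function name `build_should_fail`, the "at or above the threshold" semantics, and the use of `None` to disable failing are all assumptions.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def build_should_fail(
    severities: Iterable[Severity],
    failure_threshold: Optional[Severity] = Severity.HIGH,
) -> bool:
    """Return True if any violation is at or above the threshold.

    A threshold of None (an assumed convention) means report-only:
    violations are still recorded, but the build never fails on them.
    """
    if failure_threshold is None:
        return False
    return any(sev >= failure_threshold for sev in severities)


# Default threshold: only High-priority violations fail the build.
print(build_should_fail([Severity.LOW, Severity.MEDIUM]))
print(build_should_fail([Severity.LOW, Severity.HIGH]))
# A stricter repo could lower its threshold to Medium.
print(build_should_fail([Severity.MEDIUM], failure_threshold=Severity.MEDIUM))
```

Every violation is reported regardless of the result; the threshold only decides whether the build also breaks.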

"Other customizations include disabling running rules on certain source sets and configuring the failure threshold (i.e., high priority failures will cause the build to fail)." — sources/2026-05-08-netflix-scaling-archunit-with-nebula-archrules

Dashboard aggregates the signal¶

Per-CI-build violation data flows to a central dashboard. The dashboard answers:

Per-rule: how many repos violate this rule? How many total violations? What's the trend?
Per-repo: what rules does this repo violate?
Per-team: aggregating per-repo signals up the ownership hierarchy.
Per-rule-priority: how many High-priority issues fleet-wide?
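
As a rough sketch, the non-trend questions reduce to a handful of group-by counts over the violation stream. The tuples, rule names, repo names, and the `repo_owner` mapping below are invented for illustration; a real deployment would read them from CI output and an ownership registry. Trend queries would additionally need timestamped snapshots, which this sketch omits.

```python
from collections import Counter, defaultdict

# Hypothetical violation stream: (rule_id, severity, repo).
violations = [
    ("no-deprecated-client", "HIGH", "payments-service"),
    ("no-deprecated-client", "HIGH", "payments-service"),
    ("no-deprecated-client", "HIGH", "billing-service"),
    ("naming-convention", "LOW", "billing-service"),
    ("naming-convention", "LOW", "search-service"),
]

# Hypothetical ownership mapping used for the per-team rollup.
repo_owner = {
    "payments-service": "team-money",
    "billing-service": "team-money",
    "search-service": "team-search",
}

violations_per_rule = Counter()      # per-rule: total violations
repos_per_rule = defaultdict(set)    # per-rule: which repos violate it
rules_per_repo = defaultdict(set)    # per-repo: which rules it violates
violations_per_team = Counter()      # per-team: rolled up via ownership
violations_per_severity = Counter()  # per-priority: fleet-wide totals

for rule_id, severity, repo in violations:
    violations_per_rule[rule_id] += 1
    repos_per_rule[rule_id].add(repo)
    rules_per_repo[repo].add(rule_id)
    violations_per_team[repo_owner[repo]] += 1
    violations_per_severity[severity] += 1

print(dict(violations_per_rule))
print({rule: len(repos) for rule, repos in repos_per_rule.items()})
print(dict(violations_per_team))
print(dict(violations_per_severity))
```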

Operators read the dashboard, prioritize cleanup¶

"Being able to run these rules on this scale allows us to quickly gain insight into our large fleet of microservices, and identify the areas carrying the most critical technical debt. This makes it easier to focus and prioritize our efforts." — sources/2026-05-08-netflix-scaling-archunit-with-nebula-archrules

The dashboard is the prioritization input: rules that combine the most violations, the most affected repos, and the highest severity get attention first.
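
The source does not say how the three factors are weighed against each other. One possible ordering, shown here purely as an assumption, sorts by severity first, then by number of affected repos, then by violation count. `RuleSummary`, `prioritize`, and the sample numbers are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class RuleSummary(NamedTuple):
    rule_id: str
    severity: int        # assumed encoding: 3 = High, 2 = Medium, 1 = Low
    affected_repos: int
    violations: int


def prioritize(summaries: Iterable[RuleSummary]) -> List[RuleSummary]:
    # Assumed ordering: severity, then breadth (repos), then volume.
    return sorted(
        summaries,
        key=lambda s: (s.severity, s.affected_repos, s.violations),
        reverse=True,
    )


# Made-up per-rule summaries, for illustration only.
summaries = [
    RuleSummary("naming-convention", 1, 900, 40_000),
    RuleSummary("vulnerable-callsite", 3, 12, 30),
    RuleSummary("deprecated-api", 3, 150, 600),
    RuleSummary("prohibited-import", 2, 300, 1_200),
]

ranked = [s.rule_id for s in prioritize(summaries)]
print(ranked)
```

Under this ordering a High-severity rule with few violations still outranks a Low-severity rule with tens of thousands; a team that values volume more would pick a different sort key.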

Operational shape¶

Per the Netflix instance:

Metric	Value
Total rules	358
Repos enforcing	5,000+
Total issues	~1,000,000
High-priority issues	~1,000 (~0.1%)
Avg issues per repo	~200
Build-failing	~0.1% of issues (high-priority)
Reportable	~99.9% of issues

The 0.1%-build-fail / 99.9%-report split is the load-bearing ratio: build-failure is reserved for the most urgent issues, so that engineers don't get fatigued. The bulk of issues are measured, not enforced — visible on the dashboard but not blocking work.
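
The ratios in the table follow directly from the three headline figures, as this short check shows. The inputs are the approximate values reported in the source; the variable names are illustrative.

```python
# Approximate figures from the table above.
total_issues = 1_000_000
high_priority_issues = 1_000
repos = 5_000

high_share = high_priority_issues / total_issues   # build-failing share
reportable_share = 1 - high_share                  # reported, non-blocking
avg_issues_per_repo = total_issues / repos
avg_high_per_repo = high_priority_issues / repos

print(f"build-failing share: {high_share:.1%}")
print(f"reportable share: {reportable_share:.1%}")
print(f"avg issues per repo: {avg_issues_per_repo:.0f}")
print(f"avg high-priority issues per repo: {avg_high_per_repo:.1f}")
```

The last figure is the one engineers feel: at roughly 0.2 high-priority issues per repo on average, most repos would not carry a build-failing violation at any given time, although the source does not report the actual distribution.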

The forcing function for adoption¶

Build-time tech-debt detection only works if engineers can't ignore the dashboard. Netflix's framing implies the dashboard is consulted by:

Library authors — checking who depends on their deprecated APIs (patterns/static-analysis-as-cross-repo-impact-discovery).
Platform / Infra teams — measuring fleet-wide debt trends.
Per-team owners — checking which rules their repos violate.

Without at least one of these audiences routinely consulting it, the dashboard becomes write-only: data flows in, but nobody acts on it.

Distinct from CI-as-gate¶

Aspect	CI-as-gate	Build-time tech-debt detection
Goal	Block bad changes	Measure and prioritize tech debt
On violation	Build fails	Dashboard updates
Severity	Binary (pass/fail)	Tiered (priority-by-priority)
Visibility	Per-PR	Fleet-wide
Time horizon	Per-commit	Trend over weeks/months

The two coexist — high-priority rules use the CI-as-gate model; the bulk of rules use the dashboard model.

Distinct from runtime APM¶

Aspect	Runtime APM	Build-time tech-debt detection
What's measured	Latency, errors, throughput	Code-level signals
Cost	Always-on instrumentation	Build-time only
Visibility into unexercised paths	None	Full
Connection to code	Indirect (via traces)	Direct (rule + line)

APM tells you what's slow / failing in production; build-time tech-debt detection tells you what's structurally wrong in the codebase. Both feed prioritization but at different layers.

Adjacent patterns¶

patterns/centralized-fleet-wide-rule-catalog — the rule-distribution pattern for which this pattern is the use case.
patterns/bundled-rules-auto-scoped-to-library-consumers — the substrate that makes scoping rules to specific libraries practical.
patterns/static-analysis-as-cross-repo-impact-discovery — the API-surface-discovery use case this pattern enables.
patterns/api-stability-annotations — the lifecycle-marking discipline build-time tech-debt detection enforces.

Hard problems¶

Severity calibration. Tagging a rule High vs Medium is a value judgment; over-tagging High erodes signal, under-tagging hides urgent debt.
False-positive tax. A noisy rule produces too many violations to be actionable. Every rule needs precision-tuning; some rules can never be precise enough.
Dashboard fatigue. 1M issues is too many to look at; operators rely on aggregations, but aggregations can hide individual urgent issues.
Rule authoring overhead. Each new rule is a research + design + testing investment. Catalogs grow slowly.
Cross-language gaps. Bytecode-based rules cover JVM languages uniformly; for Python/Go/Rust services, a different stack is needed.

Seen in¶

sources/2026-05-08-netflix-scaling-archunit-with-nebula-archrules — first canonical wiki naming. Netflix's 358-rule × 5,000-repo × ~1M-issue deployment is the canonical instance, framed verbatim as "identify the areas carrying the most critical technical debt" via dashboard-driven prioritization.