CONCEPT Cited by 1 source
Dependency count by language ecosystem¶
Definition¶
Dependency count by language ecosystem is the empirical observation that the number of transitive dependencies per application varies by 1–2 orders of magnitude across language ecosystems, with the distribution typically being exponential across application-popularity percentiles within a single ecosystem.
The concept matters because it drives concrete operational costs: build times, container-image sizes, CVE exposure surface area, SBOM processing cost, and patch-cycle burden all scale with dependency count.
Canonical empirical observation (Zalando 2023-04-12)¶
Zalando publishes the following fleet-wide distribution across its polyglot application portfolio (Source: sources/2023-04-12-zalando-how-software-bill-of-materials-change-the-dependency-game):
| Language | Relative median dep count (Python = 1×) |
|---|---|
| Python | 1× (lowest) |
| Go | 1.4 – 2× |
| Java / Kotlin / Scala | 2 – 3× Go (≈ 3 – 6× Python) |
| JavaScript / TypeScript | 5 – 10× Java (≈ 15 – 60× Python) |
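The Python-relative ranges in parentheses follow from multiplying the chained ratios end to end. A minimal sketch of that arithmetic, using only the figures in the table; the helper name `chain` is illustrative and not from the source:

```python
def chain(base, factor):
    """Multiply two (low, high) ratio ranges end to end."""
    return (base[0] * factor[0], base[1] * factor[1])


python = (1.0, 1.0)
go = chain(python, (1.4, 2.0))   # Go relative to Python
jvm = chain(go, (2.0, 3.0))      # Java / Kotlin / Scala relative to Python
js = chain(jvm, (5.0, 10.0))     # JavaScript / TypeScript relative to Python

for name, (low, high) in [("Go", go), ("JVM", jvm), ("JS/TS", js)]:
    print(f"{name}: {low:.1f}x to {high:.1f}x Python")
```

The exact lower bounds come out at 2.8× and 14×, which the table rounds to 3× and 15×.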
Notes on the Zalando-specific measurement:
- The SBOM scanner detects java-archives — it does not separate Java / Kotlin / Scala, so all three roll up.
- Python is the lightest, likely because the Python standard library is rich and the ecosystem leans on a smaller core of big-hitter packages (`requests`, `numpy`, `pandas`, etc.) rather than deeply nested trees of small-purpose packages.
- Go is close to Python: Go's module system (`go.mod`) and its conventions discourage tiny packages, and Go's standard library also covers a lot of ground.
- Java is in the middle: a rich ecosystem, but the per-dependency footprint tends to be larger (a moderate number of dependencies, each of them fairly heavy).
- JavaScript / TypeScript is the outlier: the npm ecosystem has famously deep trees ("left-pad", 15 deps for every `require` of an opinionated logger).
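A relative-median table of this kind could be derived from SBOM rows roughly as follows. This is a sketch under stated assumptions, not Zalando's pipeline: the row shape `(app, package_type, package_name)`, the `ECOSYSTEM` mapping, and every type label other than `java-archive` are hypothetical, and the sample rows are synthetic.

```python
from collections import defaultdict
from statistics import median

# Hypothetical mapping from scanner package type to ecosystem label.
# "java-archive" covers Java, Kotlin and Scala alike, so all three roll up.
ECOSYSTEM = {
    "python": "Python",
    "go-module": "Go",
    "java-archive": "Java / Kotlin / Scala",
    "npm": "JavaScript / TypeScript",
}


def relative_medians(rows, baseline="Python"):
    """Median distinct-dependency count per ecosystem, divided by the baseline's."""
    deps = defaultdict(set)
    for app, pkg_type, name in rows:
        eco = ECOSYSTEM.get(pkg_type)
        if eco is not None:              # skip package types we do not map
            deps[(eco, app)].add(name)   # a set de-duplicates repeated rows
    counts = defaultdict(list)
    for (eco, _app), names in deps.items():
        counts[eco].append(len(names))
    medians = {eco: median(c) for eco, c in counts.items()}
    base = medians[baseline]
    return {eco: m / base for eco, m in medians.items()}


# Synthetic rows for illustration only.
rows = [
    ("svc-a", "python", "requests"),
    ("svc-a", "python", "numpy"),
    ("svc-b", "java-archive", "guava"),
    ("svc-b", "java-archive", "jackson-databind"),
    ("svc-b", "java-archive", "kotlin-stdlib"),
    ("svc-b", "java-archive", "scala-library"),
]
ratios = relative_medians(rows)
print(ratios)
```

An application that ships packages from several ecosystems is counted once per ecosystem here; whether the original measurement does the same is not stated in the source.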
Named outliers in the Zalando fleet¶
- Python: jupyter, at 2.5× the dependency count of the next-biggest Python application. This is expected: Jupyter pulls in `ipython`, `ipykernel`, `notebook`, `tornado`, `jinja2`, `zmq`, plus a wide tail of display / mime / kernel extensions.
- Java: tableau (the embedded Tableau Java SDK, not Tableau the product), at 3.14× the next-biggest Java application. The author doesn't elaborate; the guess is that Tableau's embedded SDK bundles a lot of rendering / data-source connector libraries.
The growth across application-popularity percentiles is described as exponential, not linear — there's a long fat tail of applications with dramatically more dependencies than the median.
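Both observations, the outlier ratio and the exponential growth, can be checked against a list of per-application dependency counts. A minimal sketch, assuming all counts are positive and there are at least two distinct values; the function names and the sample counts are illustrative, not Zalando data. If growth is exponential, the logarithm of the count should be close to linear in the percentile rank.

```python
import math


def top_ratio(counts):
    """Ratio of the largest dependency count to the next-largest."""
    ordered = sorted(counts, reverse=True)
    return ordered[0] / ordered[1]


def log_linear_fit(counts):
    """Least-squares fit of ln(count) against percentile rank (0 to 100).

    Returns (slope, r_squared). An r_squared near 1 is consistent with
    exponential growth across percentiles.
    """
    ordered = sorted(counts)
    n = len(ordered)
    xs = [100 * i / (n - 1) for i in range(n)]
    ys = [math.log(c) for c in ordered]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Synthetic counts that double at each step, so the fit is exact.
counts = [10, 20, 40, 80, 160]
slope, r_squared = log_linear_fit(counts)
print(top_ratio(counts), slope, r_squared)
```

On real fleet data the fit would not be exact; a linear fit on the raw counts can be compared against the log fit to see which describes the tail better.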
Why the ratios matter¶
Higher dependency count means:
- Larger attack surface — each dependency is a potential CVE entry. Log4Shell-class events scale with how many apps transitively link the affected library.
- More frequent patch cycles — more deps = more version-update PRs per unit time. Counter-pattern: patterns/dependency-update-discipline.
- Bigger images — directly translates to container size, which affects cold-start, pull time, registry egress cost. Zalando's AWS SDK full-vs-modules example (patterns/sbom-driven-dependency-bloat-audit) is a concrete instance: full AWS SDK is 200 MB+ in Java, modular imports drop that to tens of MB.
- Higher SBOM ingestion cost — the fleet-wide SBOM corpus (patterns/sbom-as-queryable-data-lake-asset) grows linearly with deps × apps. JavaScript apps dominate the corpus by row count.
- More concepts/transitive-dependency-reachability edges to audit — finding the one edge that pulls in a heavy subgraph gets harder as the graph grows.
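Two of these costs are easy to illustrate once each application's resolved dependency set is available. The sketch below assumes a hypothetical `sboms` mapping from application name to a set of package names; the application names and contents are invented for illustration.

```python
def affected_apps(sboms, package):
    """Apps whose resolved (direct plus transitive) dependency set contains `package`."""
    return sorted(app for app, deps in sboms.items() if package in deps)


def corpus_rows(sboms):
    """Row count of a corpus that stores one row per (app, dependency) pair."""
    return sum(len(deps) for deps in sboms.values())


# Hypothetical fleet of three applications.
sboms = {
    "checkout": {"log4j-core", "jackson-databind", "guava"},
    "search": {"guava"},
    "reports": {"log4j-core"},
}

hit = affected_apps(sboms, "log4j-core")
rows = corpus_rows(sboms)
print(hit, rows)
```

The first function is the Log4Shell-style question ("which apps link this library?"); the second shows why the corpus size tracks the sum of per-app dependency counts, so high-count ecosystems dominate it.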
Language-choice trade-off¶
The ratio is one data point in the per-language trade-off calculus. It doesn't argue "use Python for everything" — Python has its own trade-offs (runtime performance, type discipline, cold-start). But for a dependency-bloat-sensitive workload (e.g. Lambda cold-start, mobile app size, AI/ML serving container), language ecosystem is one of the highest-impact knobs available.
Zalando doesn't editorialise the ratio as a language-choice recommendation; the data is presented descriptively as "how hungry each language ecosystem is for dependencies."
Seen in¶
- sources/2023-04-12-zalando-how-software-bill-of-materials-change-the-dependency-game — canonical wiki instance. Zalando's fleet-wide measurement across Python / Go / Java / JavaScript applications, with jupyter and tableau as named outliers.
Related¶
- concepts/sbom-software-bill-of-materials — the substrate that makes cross-language measurement possible.
- concepts/transitive-dependency-reachability — the per-binary altitude where dep count manifests as binary size.
- patterns/sbom-driven-dependency-bloat-audit — the operational pattern that uses this distribution to find outlier apps worth remediating.