How we reduced the size of our Agent Go binaries by up to 77%
Datadog Engineering retrospective (2026-02-18) on how the
Datadog Agent team cut Go-binary sizes
by up to 77 % across a 6-month program (Dec 2024 → Jul 2025)
spanning versions 7.60.0 → 7.68.0 — without removing a single
feature. The compressed Linux amd64 .deb package went from
265 MiB → 149 MiB (−44 %) and the uncompressed package from
1.22 GiB → 688 MiB (−44 %); individual binaries shrank by 56-77 %.
Three ingredients:
(1) systematic dependency auditing to find accidentally-pulled
transitive deps and prune them at their import sites; (2)
re-enabling the Go linker's method dead-code elimination by
patching every use of reflect.MethodByName across the codebase
+ dependencies; (3) tracing a mysterious amd64-only regression
to containerd's import of the stdlib plugin package, which
silently puts the linker into dynamic-link mode and keeps every
exported and unexported method reachable. The fixes were contributed
upstream to kubernetes/kubernetes, uber-go/dig, google/go-cmp, and
containerd/containerd, and now benefit every large Go binary in
the ecosystem (Kubernetes itself reports 16-37 % reductions).
Summary
The Agent is a family of binaries (Core Agent, Trace Agent,
Process Agent, Security Agent, System Probe) built from a single
codebase with Go build tags +
dependency injection to select features per target (Linux / macOS /
Windows × Docker / Kubernetes / Heroku / IoT / cloud distros).
Over 5 years 2019→2024 the Linux amd64 .deb grew from
428 MiB → 1,248 MiB uncompressed (+192 %), compressed
126 MiB → 265 MiB — reflecting years of new features + hundreds of
dependencies (cloud SDKs, container runtimes, security scanners).
The growth hit real constraints in serverless, IoT, containerized
workloads and worsened perception + network + memory costs.
Datadog set out to "bend the curve" — not remove features, just stop shipping what isn't used. The method was measurement-driven:
- Map what's included. The command GOOS=linux GOARCH=amd64 go list -f '{{ join .Deps "\n" }}' -tags t1,t2 ./pkg/main lists every package the compiler selects for a given OS × arch × build-tag combo. The output is what ends up in the binary, not why; for that, use goda, which takes the same tags and environment and builds a dependency graph, supporting a reach(...) function that shows all paths from the main package to any target package.
- Measure what each package costs. go list says which packages are in the binary but not how much each one takes. go-size-analyzer reads the built binary and reports per-dependency byte cost, as text or in an interactive web UI. "Simply importing a package has side effects: init functions run and global variables are initialized, which can be enough to force the linker to keep many unnecessary symbols." It is the right tool for deciding which dependencies are actually worth removing.
- Find pathological accidental inclusions. Concrete example: the trace-agent binary was supposed to be Kubernetes-free, but go list showed 526 packages from k8s.io/* included and go-size-analyzer attributed ≥30 MiB to them. goda reach traced the entire k8s import graph back to a single function in a single package of the Agent codebase that trace-agent imported for an unrelated reason; the function itself had no k8s dependency, but the package did. Fix: move the function into its own package, update callers. Result: 570 packages removed from the Linux trace-agent + ≥36 MiB size reduction, "more than half of the binary". Instance of patterns/single-function-forced-package-split.
- Re-enable method dead-code elimination. Go's linker can normally drop methods no code actually calls. But any use of reflect.MethodByName(name) with a non-constant name forces the linker to keep every exported method of every reachable type, because at build time it can't know which methods the reflection path will resolve (concepts/reflect-methodbyname-linker-pessimism). The canonical real-world offenders are the stdlib text/template and html/template packages: they execute template actions like {{.Error}} by reflecting on arbitrary values with dynamic method names. The linker's -dumpdep flag prints why each symbol is reachable; the whydeadcode tool parses that output and names the first culprit call-chain (only the first entry is guaranteed to be a true positive, so iterate). Datadog initially assumed that patching every problematic use of reflect, both in its own codebase and in external dependencies, would be too difficult, but "gave it a try anyway". Around a dozen upstream patches later (kubernetes/kubernetes#132177, uber-go/dig#425, google/go-cmp#373), and a fork of text/template + html/template into pkg/template/ with method-calls statically disabled, Datadog enabled the optimization across every binary → 16-25 % size reduction per binary, ~100 MiB total. Kubernetes picked the same optimization up and reports 16-37 %.
- Trace amd64-only oddities. When Datadog first hacked through their code (commenting out every optimization-disabler to see what was possible), they saw 94 MiB savings on Linux arm64 but almost no change on amd64. whydeadcode said an unexported method of an ordinary type was reachable on amd64 but not on arm64, which shouldn't depend on architecture. Digging into the linker revealed the plugin build mode: importing the stdlib plugin package (source) puts the linker into dynamically linked mode, which disables method dead-code elimination and keeps every unexported method reachable, because a dynamically loaded Go plugin could call anything. goda traced the plugin import to containerd/plugin/plugin_go18.go, a feature Datadog doesn't use. Datadog opened containerd/containerd#11203 to gate it behind a build tag and applied the tag in the Agent via #32538 and #32885 → 245 MiB reduction on main Linux amd64 artifacts, ~20 % of total size, benefiting ~75 % of users. Instance of concepts/go-plugin-dynamic-linking-implication + canonical datum for patterns/upstream-the-fix.
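The first audit step in the list above (checking whether a family of packages such as k8s.io/* has crept into a binary) can be sketched in a few lines of Go. This is only an illustration: countByPrefix and the sample input are hypothetical and are not part of the Agent codebase or of any tool named here. In practice the input would be the real output of the go list command.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countByPrefix counts the import paths in deps (one path per line, the
// shape produced by `go list -f '{{ join .Deps "\n" }}'`) that start with
// the given prefix.
func countByPrefix(deps, prefix string) int {
	n := 0
	sc := bufio.NewScanner(strings.NewReader(deps))
	for sc.Scan() {
		if strings.HasPrefix(strings.TrimSpace(sc.Text()), prefix) {
			n++
		}
	}
	return n
}

func main() {
	// Hypothetical, hand-written sample standing in for real `go list` output.
	sample := "fmt\nk8s.io/api/core/v1\nk8s.io/apimachinery/pkg/runtime\nnet/http\n"
	fmt.Println("k8s.io packages:", countByPrefix(sample, "k8s.io/"))
}
```

A non-zero count for a binary that is meant to be free of that dependency is the cue to move on to a reachability query and a size breakdown.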
Overall result, Linux amd64:
- Core Agent: 236 MiB → 103 MiB (−56 %)
- Process Agent: 128 MiB → 34 MiB (−74 %)
- Trace Agent: 90 MiB → 23 MiB (−74 %)
- Security Agent: 152 MiB → 35 MiB (−77 %)
- System Probe: 180 MiB → 54 MiB (−70 %)
Key takeaways
- Dependency audits need goda + go-size-analyzer, not just go list. go list tells you which packages are in the binary; goda tells you why (full import path + reach() to a target); go-size-analyzer tells you how much each one costs in bytes. The three together turn "we accidentally ship Kubernetes" into "there's one function in one package that pulls k8s, and moving it saves 36 MiB". (Source: sources/2026-02-18-datadog-how-we-reduced-agent-go-binaries-up-to-77-percent)
- A single function can pull half your binary. The Agent trace-agent's 570-package / 36-MiB k8s accidental inclusion collapsed to one function in one package. The fix was moving the function into its own package so the rest of the original package (and all its imports) no longer gets dragged in. "This reduction is an extreme example, but was not a unique one. We found many similar cases." Transitive-dependency reachability is often one edge wide. (concepts/transitive-dependency-reachability)
- reflect.MethodByName silently disables the linker's most powerful pruner. Method dead-code elimination (dropping methods no code calls) is only safe when the linker can statically prove every method lookup's target. Any use of reflect.MethodByName with a non-constant name (the normal way to use it) defeats that proof → every exported method of every reachable type stays. Real-world impact is dominated by text/template + html/template, both stdlib. Datadog's remedy (patch around a dozen deps + fork the stdlib templates into pkg/template/ with method calls disabled) yielded 16-25 % per binary / ~100 MiB total on Linux amd64. (concepts/reflect-methodbyname-linker-pessimism)
- Import the stdlib plugin package → 245 MiB cost. Simply importing plugin (even without using it) puts the Go linker in dynamically-linked mode, disabling method DCE and keeping all unexported methods. containerd/plugin/plugin_go18.go imported it for a feature Datadog didn't use; goda found it, an upstream PR gated it behind a build tag, Datadog applied the tag. 245 MiB / ~20 % / ~75 % of users benefited. Instance of concepts/go-plugin-dynamic-linking-implication. Tiny imports can have outsized costs: "simply importing a package has side effects".
- The fix belongs upstream. Every substantive fix Datadog shipped (kubernetes/kubernetes, uber-go/dig, google/go-cmp, containerd/containerd) was a PR to the upstream, not a local fork or a vendor patch. Kubernetes picked up the same method-DCE optimization once Datadog cleared the trail and reports 16-37 % reductions of its own. Canonical patterns/upstream-the-fix benefit: the ecosystem bill goes down, and Datadog doesn't carry a maintenance fork.
- Hack-first measurement. The arm64 vs amd64 divergence was surfaced by a "hacked through our codebase and dependencies, commenting out every piece of code that disabled the optimization" run. The binaries didn't work in that state, but they linked, which was enough to measure the upper bound and localise the delta to a specific import. Instance of patterns/measurement-driven-micro-optimization for binary-size: break the binary on purpose to bound the possible win, then do the real work to reclaim it.
- Build the debugging kit first. The three tools that made this feasible (go-size-analyzer, goda, whydeadcode) are all OSS, pre-existing, and rely on public Go compiler / linker outputs (go list, -dumpdep). The work was applying them systematically, not building new infrastructure. Anyone with a large Go binary has the same kit available.
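The reflect.MethodByName takeaway is easier to see side by side. The sketch below contrasts a lookup by a run-time method name with an equivalent switch over ordinary typed calls, which is the general shape a source-level patch can take. Report, callDynamic and callStatic are hypothetical names invented for this illustration and are not taken from Datadog's or any upstream patch.

```go
package main

import (
	"fmt"
	"reflect"
)

// Report is a hypothetical type with two exported methods.
type Report struct{ Name string }

func (r Report) Title() string   { return "Report: " + r.Name }
func (r Report) Summary() string { return "Summary of " + r.Name }

// callDynamic resolves the method at run time. Because name is not a
// constant, the linker cannot tell which method will be looked up.
func callDynamic(v any, name string) (string, bool) {
	m := reflect.ValueOf(v).MethodByName(name)
	if !m.IsValid() || m.Type().NumIn() != 0 {
		return "", false
	}
	out := m.Call(nil)
	if len(out) != 1 || out[0].Kind() != reflect.String {
		return "", false
	}
	return out[0].String(), true
}

// callStatic offers the same lookup through ordinary typed calls, so every
// method that can be invoked is visible at link time.
func callStatic(r Report, name string) (string, bool) {
	switch name {
	case "Title":
		return r.Title(), true
	case "Summary":
		return r.Summary(), true
	}
	return "", false
}

func main() {
	r := Report{Name: "q3"}
	fmt.Println(callDynamic(r, "Title"))
	fmt.Println(callStatic(r, "Title"))
}
```

Both functions are kept in one file only for comparison; since callDynamic is still present, this particular program would not itself benefit from method dead-code elimination. The trade-off of the static form is that the set of callable methods must be enumerated by hand.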
Systems / tools surfaced
- systems/datadog-agent: the umbrella product: Core, Trace, Process, Security Agents + System Probe. Built from a single codebase via build tags + DI; hundreds of dependencies.
- systems/go-compiler: Go toolchain's compilation front-end; works at package granularity, selects files by build constraints, transitively adds every import it encounters.
- systems/go-linker: joins compiled package artifacts, performs symbol reachability analysis and dead-code elimination. Controls method DCE + respects plugin build mode.
- systems/goda: dependency-graph tool (github.com/loov/goda); takes GOOS/GOARCH/build-tags like go list, graphs imports, supports reach(all, target) for target-reachability queries.
- systems/go-size-analyzer: binary-size inspector (github.com/Zxilly/go-size-analyzer); text + interactive web UI with per-dependency byte costs.
- systems/whydeadcode: tool (github.com/aarzilli/whydeadcode) that consumes go build -ldflags=-dumpdep output and names the call-chain disabling method DCE. Iterate until clean.
- systems/containerd: container runtime; imported plugin for user-loadable plugins (a feature Datadog didn't use). Root cause of a 245-MiB regression until upstream gated it behind a build tag.
- systems/kubernetes: surfaced twice. (1) 526 k8s.io/* packages accidentally pulled into trace-agent; (2) later adopter of the method-DCE optimization, reporting 16-37 % reductions of its own after the Datadog-authored kubernetes/kubernetes PR landed.
- systems/text-template / systems/html-template: stdlib templating packages; canonical real-world users of reflect.MethodByName. Datadog forked both into pkg/template/ to statically disable the method-call code path.
Concepts introduced / extended
- concepts/binary-size-bloat (new): growth of a compiled artifact over time as features + dependencies accumulate, without a commensurate removal discipline. Canonical wiki instance: Datadog Agent .deb 428 MiB → 1,248 MiB over 2019-2024.
- concepts/dead-code-elimination (new): linker's ability to drop symbols no reachable code uses. Method DCE specifically: dropping methods no code calls via a typed call site. Disabled by non-constant reflect.MethodByName or plugin-mode linking.
- concepts/go-build-tags (new): Go's file-level compile guard (//go:build tag); the compiler skips tag-excluded files so their imports never pull into the binary.
- concepts/transitive-dependency-reachability (new): the graph-theoretic reason a package's cost depends on how it's imported: any import of a package transitively includes every package it imports (in files not excluded by build tags), plus runtime. A single function's transitive import closure can be huge.
- concepts/reflect-methodbyname-linker-pessimism (new): specific mechanism by which reflect.MethodByName(dynamic_name) forces the linker to keep every exported method of every reachable type. Canonical real-world offenders: text/template + html/template.
- concepts/go-plugin-dynamic-linking-implication (new): importing the stdlib plugin package at all puts the linker into dynamically-linked mode, disabling method DCE AND keeping every unexported method reachable. 245 MiB cost for the Agent.
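Transitive-dependency reachability is at heart a graph search. The sketch below is a toy version of the kind of question a reach query answers: given a map from each package to its direct imports, find one shortest import chain from a main package to a target. It is not goda's implementation; shortestImportPath and the package names in the sample graph are hypothetical.

```go
package main

import "fmt"

// shortestImportPath returns one shortest chain of imports from root to
// target in graph (package -> direct imports), or nil if target is not
// reachable from root. It is a plain breadth-first search.
func shortestImportPath(graph map[string][]string, root, target string) []string {
	parent := map[string]string{root: ""} // also serves as the visited set
	queue := []string{root}
	for len(queue) > 0 {
		pkg := queue[0]
		queue = queue[1:]
		if pkg == target {
			// Walk the parent links back to root to rebuild the chain.
			var path []string
			for p := pkg; p != ""; p = parent[p] {
				path = append([]string{p}, path...)
			}
			return path
		}
		for _, dep := range graph[pkg] {
			if _, seen := parent[dep]; !seen {
				parent[dep] = pkg
				queue = append(queue, dep)
			}
		}
	}
	return nil
}

func main() {
	// Hypothetical import graph: one utility package drags in a k8s client.
	graph := map[string][]string{
		"cmd/trace-agent": {"pkg/trace", "pkg/util"},
		"pkg/trace":       {"fmt"},
		"pkg/util":        {"k8s.io/client-go"},
	}
	fmt.Println(shortestImportPath(graph, "cmd/trace-agent", "k8s.io/client-go"))
}
```

The returned chain points at the single edge (here pkg/util) whose removal would cut the whole subtree out of the binary, which is the "one edge wide" situation described above.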
Patterns introduced / extended
- patterns/build-tag-dependency-isolation (new): mark a file with an unused build tag so its imports (+ their transitive deps) never end up in binaries without that tag. One of two canonical ways to prevent an unwanted dependency in Go (the other: package-split).
- patterns/single-function-forced-package-split (new): when a single function's imports drag a whole dep stack into binaries that don't need the rest of its package, move the function into its own package. Canonical wiki instance: the Agent trace-agent's one-function-pulls-k8s case (−570 packages, −36 MiB).
- patterns/upstream-the-fix (extend): Datadog's four upstream PRs (kubernetes/kubernetes, uber-go/dig, google/go-cmp, containerd/containerd) as a canonical binary-size / ecosystem instance; Kubernetes inherits the method-DCE win for free.
- patterns/measurement-driven-micro-optimization (extend): the binary-size variant is profile with go-size-analyzer → explain with goda reach → hack-first bounding (comment out DCE-disablers to measure upper bound) → real fix.
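Build-tag isolation can be illustrated with a single gated file. The sketch below uses an opt-out constraint so that it compiles by default: the file, and with it the import of the stdlib plugin package, is dropped when the tag is set. The tag name example_no_plugins and the function loadExtension are hypothetical and chosen only for this example; they are not the tag or code used by containerd or the Agent.

```go
//go:build !example_no_plugins

// This file is compiled only when the hypothetical build tag
// example_no_plugins is NOT set. Building with
//
//	go build -tags example_no_plugins
//
// skips the file, so its import of "plugin" never reaches the linker.
package main

import (
	"fmt"
	"plugin"
)

// loadExtension opens a Go plugin from path. It is the only code in this
// sketch that needs the plugin package.
func loadExtension(path string) error {
	_, err := plugin.Open(path)
	return err
}

func main() {
	// The path is deliberately bogus; the point is the gated import, not
	// the plugin itself.
	if err := loadExtension("/nonexistent/example.so"); err != nil {
		fmt.Println("plugin not loaded:", err)
	}
}
```

In a real package the gated file would hold only the plugin-dependent code, and a sibling file carrying the opposite constraint (//go:build example_no_plugins) would supply a stub loadExtension so that the package still builds when the tag is set.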
Operational numbers
- 5-year growth: 428 → 1,248 MiB uncompressed (+192 %), 126 → 265 MiB compressed (+110 %).
- 6-month reduction program (v7.60.0 → v7.68.0, Dec 2024 → Jul 2025):
  - Core Agent 236 → 103 MiB (−56 %)
  - Process Agent 128 → 34 MiB (−74 %)
  - Trace Agent 90 → 23 MiB (−74 %)
  - Security Agent 152 → 35 MiB (−77 %)
  - System Probe 180 → 54 MiB (−70 %)
- Compressed .deb 265 → 149 MiB (−44 %), uncompressed 1.22 GiB → 688 MiB (−44 %).
- Single-function k8s fix: −570 packages, −36 MiB on trace-agent.
- Method-DCE enable: 16-25 % per binary / ~100 MiB total; Kubernetes reports 16-37 % on its own.
- containerd plugin unimport: −245 MiB (~20 %) on main Linux amd64 artifacts, benefiting ~75 % of users.
Caveats
- All numbers are Linux amd64 unless otherwise noted. The amd64-vs-arm64 divergence was the whole story behind the plugin-package investigation, so per-arch results vary.
- Forking stdlib text/template / html/template is a maintenance burden. Datadog accepted it because open Go issue #72895 to statically disable method calls is still unresolved. Not every team should fork stdlib; most can wait for the upstream fix or scope the method-DCE enable to binaries that don't use templates.
- Method-DCE is safe when the full reflect.MethodByName call graph is audited; accidentally dropping a method that's actually called at runtime → panic. Datadog's initial idea ("instrument binaries to emit a method-use list, edit compiler artifacts to force removal of others") would have had this exact failure mode; they discarded it in favour of source-level patches that the compiler/linker can reason about statically.
- Not all dependency bloat is accidental. Some packages genuinely need their imports. The audit's job is to separate accidental inclusions (collapse to one function) from legitimate ones (don't touch); goda reach is what makes the distinction.
- Applies to Go binaries only. C/C++ / Rust / JVM toolchains have their own dead-code elimination stories (LTO, Proguard, panic=abort, etc.).
Source
- Original: https://www.datadoghq.com/blog/engineering/agent-go-binaries/
- Raw markdown: raw/datadog/2026-02-18-how-we-reduced-the-size-of-our-agent-go-binaries-by-up-to-77-5bfd5999.md
Related
- companies/datadog: Datadog engineering org (Tier 3).
- systems/datadog-agent: the artifact family that shrank.
- systems/go-compiler / systems/go-linker: the machinery whose behaviour this post documents.
- systems/goda / systems/go-size-analyzer / systems/whydeadcode: the OSS debugging kit that made the program feasible.
- systems/containerd: accidental root cause of the 245-MiB plugin-import regression on amd64.
- systems/kubernetes: both a victim (526 accidental packages) and beneficiary (16-37 % size cut after adopting method-DCE).
- concepts/binary-size-bloat / concepts/dead-code-elimination / concepts/go-build-tags / concepts/transitive-dependency-reachability / concepts/reflect-methodbyname-linker-pessimism / concepts/go-plugin-dynamic-linking-implication: new concept pages distilled from this post.
- patterns/build-tag-dependency-isolation / patterns/single-function-forced-package-split / patterns/upstream-the-fix / patterns/measurement-driven-micro-optimization: applicable / newly-created patterns.