PATTERN Cited by 1 source

State-eviction cron

Intent

Keep the hot set of kernel / data-plane state bounded by running a periodic cron job that ruthlessly evicts stale state, where "stale" is defined by an idle-time or last-use heuristic. The pattern relies on cheap re-provisioning to make eviction non-destructive.

When to use

  • The primitive has a state-capacity wall and the cost of holding unused state is real (slow ops, slow reloads, panics).
  • Re-provisioning evicted state is cheap (ideally: driven automatically by first-packet event).
  • You don't need to track "is this peer idle" at connection-close time — simpler to just evict periodically and let the user/client re-establish.

Shape

  cron --> list kernel peers
       --> for each peer:
             if now - last_active > threshold:
                 evict via Netlink / API
       --> done.

No coordination with the data plane beyond the kernel config API. No graceful drain. No notification to anyone. Evicted peers that are still wanted come back via the JIT path.
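
As a minimal sketch of the loop above, the following Python uses the `wg` command-line tool in place of Netlink (a substitution for brevity; the source says Fly.io's daemon uses Netlink). It relies on `wg show <iface> latest-handshakes` for last-use times and `wg set <iface> peer <key> remove` for eviction. The function names, the `evict_never_seen` flag, and the choice to skip peers that have never completed a handshake are assumptions for illustration, not Fly.io's implementation.

```python
import subprocess
import time


def parse_latest_handshakes(text):
    """Parse `wg show <iface> latest-handshakes` output into {pubkey: unix_ts}."""
    peers = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        pubkey, ts = parts
        peers[pubkey] = int(ts)
    return peers


def select_stale(peers, now, threshold_s, evict_never_seen=False):
    """Return the public keys whose last handshake is older than threshold_s."""
    stale = []
    for pubkey, last in peers.items():
        if last == 0:
            # wg reports 0 for a peer that has never completed a handshake.
            # Assumption: leave these alone unless the caller opts in.
            if evict_never_seen:
                stale.append(pubkey)
            continue
        if now - last > threshold_s:
            stale.append(pubkey)
    return stale


def evict_stale(iface, threshold_s):
    """One cron pass: list peers, remove the stale ones, return what was removed."""
    out = subprocess.run(
        ["wg", "show", iface, "latest-handshakes"],
        check=True, capture_output=True, text=True,
    ).stdout
    stale = select_stale(parse_latest_handshakes(out), time.time(), threshold_s)
    for pubkey in stale:
        # No drain and no notification: a still-wanted peer returns via the JIT path.
        subprocess.run(["wg", "set", iface, "peer", pubkey, "remove"], check=True)
    return stale
```

Keeping the selection logic in a pure function makes the staleness rule testable without touching a real interface; only `evict_stale` needs root and a live WireGuard device.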

Canonical instance — Fly.io JIT WireGuard

"This fits nicely into the little daemon that already runs on our gateways to manage WireGuard, and allows us to ruthlessly and recklessly remove stale peers with a cron job." (Source: sources/2024-03-12-flyio-jit-wireguard-peers)

Ruthlessly and recklessly are the tonal markers. Under push-provisioning, eviction was destructive — "Nothing cleans up old peers. After all, you're likely going to come back tomorrow and deploy a new version of your app, or fly ssh console into it to debug something. Why remove a peer just to re-add it the next day?" Under JIT, the answer becomes obvious: evict whenever; if still wanted, the client's next handshake re-provisions the peer via the sniff → identify → pull → install path.

Fly.io doesn't disclose its specific threshold, but the descriptor "ruthless" and the outcome "rounds to none" on the stale-peer chart suggest the eviction window is short relative to typical CI-job lifetime.

Why it's cheap under JIT

The move that makes eviction non-destructive is the pull-on-demand flip. Under push, an evicted peer stays evicted until the control plane re-pushes it — no automatic recovery. Under pull, an evicted peer is re-materialised on its next handshake without any control-plane action.

In other words, eviction goes from "policy decision with operational risk" to "garbage collection on a cheap cron."
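
The pull-on-demand behaviour can be illustrated with a toy in-memory model. Everything here (`ControlPlane`, `Gateway`, `on_handshake`, the config dictionary) is a hypothetical stand-in, and the sniff and identify steps are collapsed into the `pubkey` argument; the sketch only shows that an evicted peer comes back on its next handshake without the control plane pushing anything.

```python
class ControlPlane:
    """Stand-in for the authoritative peer store (hypothetical)."""

    def __init__(self, known_peers):
        self.known_peers = dict(known_peers)  # pubkey -> config
        self.lookups = 0

    def fetch(self, pubkey):
        self.lookups += 1
        return self.known_peers.get(pubkey)


class Gateway:
    def __init__(self, control_plane):
        self.control_plane = control_plane
        self.installed = {}  # pubkey -> config: the bounded hot set

    def evict(self, pubkey):
        # No drain, no notification: just drop the state.
        self.installed.pop(pubkey, None)

    def on_handshake(self, pubkey):
        """Return True if the peer is usable after this handshake attempt."""
        if pubkey in self.installed:
            return True
        config = self.control_plane.fetch(pubkey)  # pull on demand
        if config is None:
            return False  # unknown peer: nothing to install
        self.installed[pubkey] = config
        return True


cp = ControlPlane({"peer-a": {"allowed_ip": "10.0.0.2"}})
gw = Gateway(cp)
gw.on_handshake("peer-a")   # first contact: pulled and installed
gw.evict("peer-a")          # the cron removes it
gw.on_handshake("peer-a")   # next handshake re-installs it
```

The cost of a wrong eviction in this model is one extra `fetch` on the next handshake, which is why the cron can afford to be aggressive.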

Parameters

  • Eviction threshold — how idle is stale? Fly.io doesn't publish theirs. Rule of thumb: ≤ typical reconnect interval for the workload that uses the state. Too short ⇒ thrash (evict, re-install constantly); too long ⇒ state wall.
  • Cadence — how often the cron runs. Typically ≤ threshold.
  • Scope — evict one-at-a-time or batch. Depends on API cost of the eviction primitive.
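
One way to make the three parameters explicit is a small policy object. This is a sketch under assumptions: the class name, field names, and the decision to reject a cadence longer than the threshold are illustrative choices, not something the source prescribes. The worst-case figure follows from the loop itself: a peer that crosses the threshold just after a run waits up to one more cadence interval before it is removed (ignoring how long the run itself takes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvictionPolicy:
    threshold_s: int   # idle time after which state counts as stale
    cadence_s: int     # interval between cron runs
    batch_size: int    # peers evicted per API call (1 = one at a time)

    def __post_init__(self):
        if min(self.threshold_s, self.cadence_s, self.batch_size) <= 0:
            raise ValueError("all parameters must be positive")
        if self.cadence_s > self.threshold_s:
            # Assumption: treat the "cadence <= threshold" guideline as a hard rule.
            raise ValueError("cadence should not exceed the threshold")

    def worst_case_idle_s(self):
        """Longest a peer can sit idle before a run removes it."""
        return self.threshold_s + self.cadence_s

    def batches(self, stale):
        """Split the stale list into chunks of at most batch_size."""
        return [stale[i:i + self.batch_size]
                for i in range(0, len(stale), self.batch_size)]


policy = EvictionPolicy(threshold_s=600, cadence_s=60, batch_size=2)
```

With these example values a peer can linger for at most 660 seconds of idleness, and a stale list of three peers is evicted in two calls.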

When it doesn't apply

  • When re-provisioning isn't cheap. If re-installing state requires human action, a paid lookup, or an expensive compute step, eviction reverts to being destructive.
  • When eviction itself is expensive. Some primitives serialise eviction with the hot path (flushing caches, pausing threads). A cron that adds latency on the data-plane path isn't free.
  • When "stale" is ambiguous. Idle-time is easy; semantic staleness ("this config version is outdated") isn't cron-friendly.

Seen in
