From latency to instant: Modernizing GitHub Issues navigation performance (GitHub Engineering, 2026-05-14)
Summary
This is GitHub Engineering's 2026-05-14 retrospective, written by Alexander on the
GitHub Issues team, on a multi-quarter performance rewrite of the issues#show
route (the page rendered when a user opens an individual Issue). The
goal was not marginal backend wins but changing how Issue pages load
end-to-end by shifting work to the client and optimising perceived
latency. The rewrite introduces a client-side caching layer backed
by IndexedDB, a preheating strategy that
proactively walks high-intent issue references to pre-fill the cache
without becoming a freshness-enforcement loop, an in-memory cache
tier in front of IndexedDB for synchronous reads, and a
service worker that intercepts hard
navigations and signals the server (via a request header) when local
data exists so the server can return a thin HTML shell instead of
re-rendering. Measurement is anchored to HPC (Highest Priority
Content) — GitHub's internal metric closely aligned with
Web Vitals LCP — bucketed into Instant
(<200 ms) / Fast (<1000 ms) / Slow (≥1000 ms), with the
team-philosophy shift from minimising the p99 tail to maximising
the share of navigations in the fast/instant buckets
(distribution
quality). Quantified outcomes: cache rollout took React
soft-navigation instant share from 4 % → 22 % at a ~33 %
cache-hit ratio with a 4.7 % measured server/cache divergence
treated as an explicit operating envelope; preheating took React
soft-navigation instant share to ~70 %, overall instant for
issues#show to ~30 %, and cache-hit ratio to ~96 %;
HPC P10 dropped from ~600 ms → 70 ms, P25 from ~800 ms → 120 ms,
P50 from ~1200 ms → 700 ms, P75 from 1800 ms → 1400 ms, P90 from
2400 ms → 2100 ms. Baseline navigation mix at the start of the
work was 57.6 % hard / ~5 % turbo / 37.5 % React (soft) with
HPC 2.05 / 1.76 / 1.04 s respectively — so the dominant path
was also the slowest, forcing the rewrite to address all three
classes rather than only soft navigations. The post also names the
Rails-to-React boundary as the structural reason hard
navigations dominate: cross-frontend transitions force a full
cold boot. Genre is engineering-blog architectural retrospective
with concrete numbers, named primitives, and clean tradeoff
discussion.
Key takeaways
- HPC (Highest Priority Content) is GitHub's internal user-perceived rendering metric, anchored to a single browser-selected element à la LCP. "We use HPC (Highest Priority Content), an internal metric closely aligned with Web Vitals LCP, to measure when the primary content (the content users care about) on the page is first rendered. Like LCP, this is anchored to a single HTML element selected by the browser, which on issue pages is most often the issue title or the issue body." Bucket thresholds are operationally important: Instant: HPC < 200 ms / Fast: HPC < 1000 ms / Slow: HPC ≥ 1000 ms. "The <200 ms bucket maps to interactions that feel immediate in real workflows, while the <1000 ms bucket captures experiences that are still acceptable but no longer invisible to users." See concepts/highest-priority-content-hpc. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Measurement philosophy shifted from minimising p99 to maximising fast/instant bucket share: distribution quality over tail control. "Historically, we dedicated significant effort to tracking the p90 and p99 of the HPC and minimizing the worst tail of the distribution. While this work remains important, it does not inherently ensure that the product feels fast for the majority of users. It is possible to enhance the p99 of the HPC while still leaving the median experience feeling sluggish. For this initiative, we shifted focus toward distribution quality: how many navigations land in our fast and instant buckets across the whole population? The goal is not just fewer terrible outliers. It's to make speed the default path for the majority of sessions." Canonical wiki framing of the distribution-quality vs tail-control axis as a measurement-philosophy choice that is independent of metric definition. Compare to the tail-latency-at-scale posture which the GitHub team was moving away from (not abandoning: they explicitly say tail work "remains important"). (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Three-way navigation taxonomy + measured baseline drove the architecture. "We identified three primary navigation types: Hard navigation: a full browser load (cold start or refresh) where we pay the full cost of network, server rendering, asset loading, JavaScript boot and React hydration. Turbo navigation: a Rails Turbo transition that updates targeted page regions without a full reload. It avoids some hard-navigation overhead but still depends heavily on server-rendered responses. Soft navigation (React): a client-side transition inside the existing React runtime, where we can often avoid full page bootstrap costs." Measured baseline distribution: 57.6 % hard / 37.5 % React (soft), leaving roughly 5 % for turbo, with HPC 2.05 s hard / 1.76 s turbo / 1.04 s React. "That distribution made one thing obvious: the dominant path was also the slowest. Any strategy focused only on React soft navigations could improve part of the experience, but it could not move overall perceived performance enough on its own." The Rails ↔ React frontend boundary is named as the structural reason hard navigations dominate: "During that transition, many user journeys cross the Rails/React boundary. When that happens — for example, navigating from a Rails page into Issues — the browser often has to do a full hard navigation and cold boot." See concepts/navigation-mix. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Step 1: IndexedDB-backed stale-while-revalidate was the highest-leverage starting point because the React runtime is already alive on soft navigations and the dominant cost is data fetch. "In this path, the runtime is already alive, so the dominant cost is usually data fetch latency, not application boot. If we could remove network from repeated visits, we could move a large slice of traffic into the instant bucket." Why IndexedDB over alternatives: "Durable browser storage that survives tab closes and browser restarts, unlike memory-only stores; indexed object-store model, which gives efficient key-based lookups for issue query payloads; larger practical quota than localStorage, making it appropriate for real working sets." The stale-while-revalidate semantics: "Read path: on soft navigation, attempt to hydrate from local cache first and render immediately. Revalidation path: issue a background network request for freshness and reconcile the in-memory store if data changed. Failure behavior: when network is degraded, users still get a usable page from cache, with freshness reconciled once connectivity recovers, introducing a new graceful-degradation model." Result after broad rollout: "approximately 22 % of React navigations became instant — up from 4 % pre-launch — representing about 15 % of total request volume. Observed cache-hit ratio landed around one-third (~33 %), which was consistent with the earlier revisit analysis." Pre-workstream analysis estimated a ~30 % cache-hit ratio as the viability threshold before committing to the architecture; i.e. the team set a numeric cache-hit floor they had to clear for the design to make sense, rather than building first and measuring later. See patterns/stale-while-revalidate-from-indexeddb. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Cache/server divergence rate is treated as an explicit operating envelope. "The main tradeoff is controlled staleness. We measured server/cache divergence at about 4.7 % and treated that as an explicit operating envelope: acceptable for the perceived speed gains on soft navigations, with safeguards to limit user-visible inconsistency." The architectural framing GitHub uses is pointed: "This is not 'cache or correctness.' It is latency-first rendering with asynchronous consistency checks on the same navigation." The 4.7 % divergence is the stale-cache rate at read time before revalidation reconciles it; whether 4.7 % is acceptable is a product call, not a technical absolute. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Step 2: Preheating is cache-population logic, not freshness-enforcement logic, and is fundamentally distinct from traditional prefetch. "The naive answer was obvious: prefetch every likely next issue as early as possible. We explored that direction and quickly ran into the real constraint, which was not implementation complexity but capacity. On high-fanout surfaces such as issue lists, dashboards, and projects, eager prefetching amplifies request volume, creates N+1-style access patterns and pushes unnecessary compute onto the system for pages a user may never open. So we changed the objective. Instead of trying to make prefetched data always fresh, we optimized for a cheaper and more scalable condition: make sure some usable data is already local by the time the user clicks. That is preheating. Preheating proactively walks high-intent issue references and prepares cache entries ahead of navigation, but it only hits the network when the issue is not already present in the client cache. If usable data already exists, preheating stops. This makes it fundamentally different from traditional preloading. It is cache-population logic, not freshness-enforcement logic." Operational discipline: "Operationally, preheating is triggered from high-intent surfaces such as issue lists, dashboards, projects, and dependency views. Requests run on low-priority workers, are strictly rate-limited and are guarded by circuit breakers, so the mechanism backs off under pressure. User-initiated work always takes precedence over speculative fetches, allowing us to avoid the noisy-neighbor problem and keep the system stable while still improving cache-hit ratios for real user navigations." See concepts/preheating. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- In-memory cache tier in front of IndexedDB closed the remaining async gap on the soft-navigation critical path. "To support that model efficiently, we introduced an in-memory cache version in front of IndexedDB. IndexedDB gives persistence across tabs and sessions, but it is still asynchronous and therefore not free on the critical path. The in-memory layer sits between the active in-memory store and persistent storage, allowing hot issue payloads to be served synchronously without paying even the IndexedDB read cost. In practice, this removes another async boundary from soft navigation and materially increases the probability of rendering directly from memory." Generalises to: pair fast small cache with slow large storage at the browser altitude. The in-memory tier is the fast/small leg, the IndexedDB tier is the slow/large persistent leg. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Preheating + in-memory tier results: ~70 % of React navigations instant, ~30 % overall instant, ~96 % cache-hit ratio. "The result was a large shift in distribution. After rolling out preheating broadly, instant navigations for issues#show increased to roughly 30 % overall. For React navigations specifically, up to ~70 % became instant. Cache-hit ratio rose to roughly 96 %." From baseline 4 % instant → 22 % instant (cache only) → ~70 % instant (cache + preheating) on the React soft-navigation slice, a 17.5× improvement on the instant-bucket share of soft navigations. The 96 % cache-hit ratio is high enough that eviction policy and warming time become the dominant design considerations rather than the cache hit/miss distinction. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Step 3: Service worker extends local-first to hard navigations via a custom request header signaling cache state. "For issues#show, our service worker extends the same local-first model we built for React navigations. When the browser starts a navigation request for an issue page, the service worker intercepts it and checks whether the issue data is already available in local cache. If it is, the worker annotates the outgoing request with a specific header that tells the server it can skip a substantial amount of work. When the service worker detects a cache hit, it signals to the server via a request header. From there, the navigation splits into two paths: Cache hit path: return a thin HTML shell (layout + minimal markup + JS), and let React render from the locally cached issue payload. Cache miss path: return the normal response (server loads data and SSRs the page). This is a strict optimization: if the cache is cold, stale, or the service worker isn't available, behavior falls back to the standard server-rendered path." Crucial property: the service-worker path is purely additive, and its failure modes (no service worker, cold cache, stale cache) all degrade gracefully back to the SSR path. See patterns/service-worker-cache-hint-header. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Service worker had outsized impact on Turbo navigations because Turbo paths are server-response-time-bound. "This had an especially strong effect on Turbo navigations, because Turbo paths are still heavily constrained by server response time. Once the service worker can signal that issue data is already present, the server spends much less time computing the application fragment, and Turbo benefits almost immediately from that reduction in backend work." The Turbo-specific gain is the architectural lesson: a cache-hint header that lets the server skip work helps the server-bound navigation classes the most, even if the client-side cache was originally designed for client-side rendering. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Hard-navigation cache-hit gains shift the bottleneck from SSR time to JS download/execution; addressed via route-level code splitting + component-level lazy loading + intent prefetch. "Hard-navigation gains are real, but they are less immediately visible than Turbo gains: on cache-hit hard navigations, so we trade SSR time for client-side rendering. The critical path now becomes JavaScript download and execution. To reduce that cost, we split code by route using React.lazy and dynamic route preloading, so only the code required for the current route is fetched up front. We apply the same principle at the component level, loading only what's necessary for the initial view and deferring non-critical modules. For example, we only fetch the issue editor bundle when a user enters edit mode, and use intent-based prefetching (like hover) to hide that latency without bloating the initial bundle." See patterns/code-split-by-route-and-intent-prefetch. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Cumulative HPC distribution shift (entire issues#show traffic): the bottom of the distribution moves dramatically more than the top. Reported percentile deltas: P10 ~600 ms → 70 ms (≈ 8.6× compression); P25 ~800 ms → 120 ms (≈ 6.7×); P50 ~1200 ms → 700 ms (median crosses the 1 s threshold from slow → fast bucket); P75 1800 ms → 1400 ms (≈ -22 %); P90 2400 ms → 2100 ms (≈ -12.5 %). The author calls out the asymmetry explicitly: "The pattern that stands out is the outsized improvement in the lower percentiles. P10 and P25 compressed dramatically because cached and preheated navigations now dominate that part of the distribution. The median improved meaningfully but is still shaped by cold-start traffic. And the upper tail, while better, reflects the hard-navigation paths where JavaScript boot and client rendering are now the bottleneck — exactly the area we are targeting next." This is the distribution-quality philosophy made operationally legible: focus optimisation effort where the largest mass of users sit rather than at the worst-performing tail. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
- Stated next steps: backend rewrites for low-latency delivery + edge UI delivery layer. "The next phase is about moving bigger rocks. We are planning targeted rewrites of parts of our backend stack optimized explicitly for low-latency delivery and are investing in a modern UI delivery layer closer to the edge to reduce round trips and improve response time further." The post gives no detail on what "closer to the edge" means architecturally; the signal is that the cold-start tail (P75/P90) is the next bottleneck now that the client-cache + service-worker work has pulled the bottom of the distribution close to instant. (Source: sources/2026-05-14-github-from-latency-to-instant-modernizing-github-issues-navigation-performance)
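The in-memory tier in front of IndexedDB described in the takeaways can be illustrated with a small sketch. The post publishes no code, so everything here is hypothetical: the `TieredCache` and `PersistentStore` names, the Map-backed stand-in for IndexedDB, and the oldest-inserted eviction rule are assumptions chosen to keep the example runnable outside a browser.

```typescript
// Hypothetical async interface standing in for an IndexedDB object store.
interface PersistentStore<V> {
  get(key: string): Promise<V | undefined>;
  set(key: string, value: V): Promise<void>;
}

// Map-backed fake so the sketch runs without a browser.
function fakePersistentStore<V>(): PersistentStore<V> {
  const rows = new Map<string, V>();
  return {
    get: async (key) => rows.get(key),
    set: async (key, value) => {
      rows.set(key, value);
    },
  };
}

class TieredCache<V> {
  private readonly hot = new Map<string, V>();
  private readonly cold: PersistentStore<V>;
  private readonly maxHot: number;

  constructor(cold: PersistentStore<V>, maxHot: number) {
    this.cold = cold;
    this.maxHot = maxHot;
  }

  // Synchronous read for the render path: memory only, no async boundary.
  peek(key: string): V | undefined {
    return this.hot.get(key);
  }

  // Asynchronous read: memory first, then the persistent tier, promoting on hit.
  async get(key: string): Promise<V | undefined> {
    const hotValue = this.hot.get(key);
    if (hotValue !== undefined) return hotValue;
    const coldValue = await this.cold.get(key);
    if (coldValue !== undefined) this.promote(key, coldValue);
    return coldValue;
  }

  // Write to both tiers so the entry survives a tab close.
  async set(key: string, value: V): Promise<void> {
    this.promote(key, value);
    await this.cold.set(key, value);
  }

  private promote(key: string, value: V): void {
    this.hot.delete(key);
    this.hot.set(key, value);
    if (this.hot.size > this.maxHot) {
      // Assumed policy: evict the oldest-inserted entry (Map keeps insertion order).
      const oldest = this.hot.keys().next().value;
      if (oldest !== undefined) this.hot.delete(oldest);
    }
  }
}
```

A soft navigation would call `peek` first and fall back to `get` only when the payload is not hot, which is the "remove another async boundary" property the post describes.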
Operational numbers (single-source)
All figures from the GitHub Engineering post unless otherwise noted.
HPC bucket thresholds (operational definition of "fast")
| Bucket | HPC range | Description |
|---|---|---|
| Instant | < 200 ms | feels immediate |
| Fast | < 1000 ms | acceptable, no longer invisible |
| Slow | ≥ 1000 ms | user-noticeable |
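To show how these thresholds turn raw HPC samples into the bucket shares the team tracks, here is a minimal sketch. The function names and sample values are hypothetical, and it assumes the Fast bucket means 200 ms ≤ HPC < 1000 ms, whereas the post lists it simply as < 1000 ms.

```typescript
type Bucket = "instant" | "fast" | "slow";

// Thresholds as stated in the post; treating "fast" as excluding
// "instant" is an assumption made for this sketch.
function bucketFor(hpcMs: number): Bucket {
  if (hpcMs < 200) return "instant";
  if (hpcMs < 1000) return "fast";
  return "slow";
}

// Share of navigations per bucket: the "distribution quality" view.
function bucketShares(samplesMs: number[]): Record<Bucket, number> {
  const counts: Record<Bucket, number> = { instant: 0, fast: 0, slow: 0 };
  for (const ms of samplesMs) counts[bucketFor(ms)] += 1;
  const total = Math.max(samplesMs.length, 1);
  return {
    instant: counts.instant / total,
    fast: counts.fast / total,
    slow: counts.slow / total,
  };
}

// Hypothetical samples in milliseconds, not data from the post.
const exampleShares = bucketShares([70, 120, 700, 1400, 2100]);
console.log(exampleShares);
```

Tracking these three shares over time, rather than a single percentile, is one way to express the distribution-quality goal as a dashboard metric.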
Pre-rewrite navigation mix and HPC
| Class | Share | HPC (median, approx.) |
|---|---|---|
| Hard | 57.6 % | 2.05 s |
| Turbo | (~5 %) | 1.76 s |
| Soft (React) | 37.5 % | 1.04 s |
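A short worked calculation makes the "dominant path was also the slowest" argument concrete. This is a rough sketch: it derives the turbo share as the remainder (the post does not state it directly) and blends the per-class HPC figures by traffic share, which is only a proxy and not a true percentile of the combined distribution. All identifiers are hypothetical.

```typescript
interface NavClass {
  label: string;
  share: number; // fraction of navigations
  hpcSeconds: number; // per-class HPC from the table
}

const hardShare = 0.576;
const softShare = 0.375;
// Turbo share is not stated directly; derived here as the remainder.
const turboShare = 1 - hardShare - softShare;

const baselineMix: NavClass[] = [
  { label: "hard", share: hardShare, hpcSeconds: 2.05 },
  { label: "turbo", share: turboShare, hpcSeconds: 1.76 },
  { label: "soft", share: softShare, hpcSeconds: 1.04 },
];

// Traffic-weighted blend of the per-class figures.
function blendedHpc(classes: NavClass[]): number {
  return classes.reduce((sum, c) => sum + c.share * c.hpcSeconds, 0);
}

const blendedBefore = blendedHpc(baselineMix);

// Thought experiment: soft navigations cost nothing, all else unchanged.
const blendedSoftFree = blendedHpc(
  baselineMix.map((c) => (c.label === "soft" ? { ...c, hpcSeconds: 0 } : c)),
);

console.log(blendedBefore.toFixed(2), blendedSoftFree.toFixed(2)); // "1.66" "1.27"
```

Even with soft navigations made free, the blend stays above the 1 s Slow threshold, which matches the post's conclusion that optimising soft navigations alone could not move overall perceived performance enough.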
Cache rollout (Step 1: IndexedDB SWR)
| Metric | Pre-launch | Post-rollout |
|---|---|---|
| React-soft-nav instant share | 4 % | 22 % |
| Share of total request volume affected | — | ~15 % |
| Cache-hit ratio (React soft-nav) | (n/a) | ~33 % |
| Cache-hit-ratio viability threshold | ~30 % | (cleared) |
| Server/cache divergence rate | (n/a) | 4.7 % |
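The stale-while-revalidate read path behind these numbers can be sketched as follows. This is an illustration under stated assumptions, not GitHub's implementation: a `Map` stands in for IndexedDB, the `updatedAt` comparison is a hypothetical divergence check, and `openIssue`, `IssuePayload`, and the outcome labels are invented for the example.

```typescript
interface IssuePayload {
  id: string;
  title: string;
  updatedAt: number; // hypothetical version marker used to detect divergence
}

type Render = (payload: IssuePayload, source: "cache" | "network") => void;
type Outcome = "miss" | "hit-fresh" | "hit-stale" | "hit-offline";

async function openIssue(
  id: string,
  cache: Map<string, IssuePayload>, // stand-in for the IndexedDB-backed cache
  fetchFresh: (id: string) => Promise<IssuePayload>,
  render: Render,
): Promise<Outcome> {
  // Read path: render immediately from the local copy if one exists.
  const cached = cache.get(id);
  if (cached) render(cached, "cache");

  // Revalidation path: always ask the network for the current copy.
  let fresh: IssuePayload;
  try {
    fresh = await fetchFresh(id);
  } catch (err) {
    // Failure behavior: a cached page stays usable when the network is degraded.
    if (cached) return "hit-offline";
    throw err;
  }

  cache.set(id, fresh);
  if (!cached) {
    render(fresh, "network");
    return "miss";
  }
  if (fresh.updatedAt !== cached.updatedAt) {
    // Server and cache diverged: reconcile the view with the fresh payload.
    render(fresh, "network");
    return "hit-stale";
  }
  return "hit-fresh";
}
```

Under this framing, a divergence rate like the reported 4.7 % would correspond to the share of cache hits that end as `hit-stale`; that mapping is an interpretation, since the post does not define how divergence was computed.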
Preheating + in-memory tier rollout (Step 2)
| Metric | Value |
|---|---|
| React-soft-nav instant share (post) | ~70 % |
| Overall issues#show instant share | ~30 % |
| Cache-hit ratio | ~96 % |
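The difference between preheating and eager prefetch comes down to which issues trigger a network request. The sketch below is a simplified, hypothetical planner: `planPreheat`, its options, and the fixed request budget are assumptions standing in for the rate limiting and circuit breakers the post mentions without detailing.

```typescript
interface PreheatOptions {
  maxRequests: number; // assumed per-pass budget, standing in for rate limiting
  breakerOpen: boolean; // assumed circuit-breaker state: true means back off
}

// Decide which issue ids on a high-intent surface should be fetched.
function planPreheat(
  visibleIssueIds: string[],
  isCached: (id: string) => boolean,
  options: PreheatOptions,
): string[] {
  // Under pressure, speculative work yields entirely.
  if (options.breakerOpen) return [];

  const toFetch: string[] = [];
  const seen = new Set<string>();
  for (const id of visibleIssueIds) {
    if (toFetch.length >= options.maxRequests) break;
    if (seen.has(id)) continue;
    seen.add(id);
    // Cache population, not freshness enforcement: any usable local
    // copy means no request, even if that copy might be stale.
    if (isCached(id)) continue;
    toFetch.push(id);
  }
  return toFetch;
}

// Hypothetical usage: one issue already cached, budget of two requests.
const alreadyCached = new Set(["a"]);
const preheatPlan = planPreheat(
  ["a", "b", "b", "c", "d"],
  (id) => alreadyCached.has(id),
  { maxRequests: 2, breakerOpen: false },
);
console.log(preheatPlan); // ["b", "c"]
```

Because cached entries never generate requests, request volume falls as the cache warms, which is the scaling property that eager prefetch lacks.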
Cumulative HPC percentile shifts (full issues#show traffic)
| Percentile | Pre | Post | Delta | Bucket cross? |
|---|---|---|---|---|
| P10 | ~600 ms | 70 ms | -88 % | → Instant |
| P25 | ~800 ms | 120 ms | -85 % | → Instant |
| P50 | ~1200 ms | 700 ms | -42 % | Slow → Fast |
| P75 | 1800 ms | 1400 ms | -22 % | (still Slow) |
| P90 | 2400 ms | 2100 ms | -12.5 % | (still Slow) |
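The Delta and bucket columns follow directly from the reported pre and post values. A small sketch of that arithmetic, using the table's (partly approximate) figures and hypothetical helper names:

```typescript
interface PercentileRow {
  percentile: string;
  preMs: number;
  postMs: number;
}

// Values copied from the table; the "~" pre values are approximations.
const reportedRows: PercentileRow[] = [
  { percentile: "P10", preMs: 600, postMs: 70 },
  { percentile: "P25", preMs: 800, postMs: 120 },
  { percentile: "P50", preMs: 1200, postMs: 700 },
  { percentile: "P75", preMs: 1800, postMs: 1400 },
  { percentile: "P90", preMs: 2400, postMs: 2100 },
];

// Relative change in percent; negative means faster.
function percentChange(preMs: number, postMs: number): number {
  return ((postMs - preMs) / preMs) * 100;
}

// Bucket thresholds from the post (fast assumed to exclude instant).
function bucketOf(ms: number): "instant" | "fast" | "slow" {
  if (ms < 200) return "instant";
  if (ms < 1000) return "fast";
  return "slow";
}

for (const row of reportedRows) {
  const delta = percentChange(row.preMs, row.postMs).toFixed(1);
  const ratio = (row.preMs / row.postMs).toFixed(1);
  console.log(
    `${row.percentile}: ${delta} % (${ratio}x), ` +
      `${bucketOf(row.preMs)} -> ${bucketOf(row.postMs)}`,
  );
}
```

The ratios shrink steadily from P10 to P90, which is the asymmetry the author highlights: cached and preheated navigations dominate the low percentiles, while cold-start traffic still shapes the tail.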
Architecture diagrams (named, not reproduced)
The post includes seven labelled images:
- Pre-work navigation distribution — bar chart 57.6 % hard / 37.5 % React.
- Pre-work HPC by navigation type — bar chart 2.05 / 1.76 / 1.04 s.
- Soft-nav data-flow diagram — client-cache layer between React store and IndexedDB.
- Post-cache HPC distribution — instant / fast / slow bucket shift after IndexedDB SWR.
- Preheating flow — "Look at issues index → For each issue in the list trigger a preheat request → Is data in the cache present? If yes, add to IndexedDB. If no, fetch data, then add to IndexedDB."
- In-memory cache architecture — three-tier hierarchy: in-memory store → in-memory cache → IndexedDB.
- Service worker interception flow — browser navigation request → service worker checks local cache → if hit, annotate request header → server returns thin HTML shell → React renders from local cache.
The first labelled image is the only one that gives concrete share numbers for navigation classes; the second is the only one that gives per-class HPC medians.
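The service worker interception flow in the last diagram reduces to two small decisions: whether to annotate the request, and how the server responds. The sketch below models both as pure functions so it runs outside a browser. The header name `x-issue-cache-hit` is hypothetical (the post does not name the real header), as are the function and type names.

```typescript
type HeaderMap = Record<string, string>;
type ResponseKind = "thin-shell" | "full-ssr";

// Hypothetical header name; the post does not disclose the real one.
const CACHE_HINT_HEADER = "x-issue-cache-hit";

// Service-worker side: annotate the outgoing navigation request
// only when issue data is already available locally.
function annotateNavigation(
  headers: HeaderMap,
  hasLocalIssueData: boolean,
): HeaderMap {
  return hasLocalIssueData
    ? { ...headers, [CACHE_HINT_HEADER]: "1" }
    : { ...headers };
}

// Server side: skip data loading and SSR only when the hint is present.
// Any request without the hint takes the standard server-rendered path.
function chooseResponse(headers: HeaderMap): ResponseKind {
  return headers[CACHE_HINT_HEADER] === "1" ? "thin-shell" : "full-ssr";
}

const navigationHeaders: HeaderMap = { accept: "text/html" };
console.log(chooseResponse(annotateNavigation(navigationHeaders, true))); // "thin-shell"
console.log(chooseResponse(annotateNavigation(navigationHeaders, false))); // "full-ssr"
console.log(chooseResponse(navigationHeaders)); // "full-ssr" (no service worker)
```

Because the default branch is always the full server render, a missing service worker or a cold cache needs no special handling, which is what makes the optimisation strictly additive.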
Caveats
- No fleet-scale absolute numbers. Share-of-traffic and
percentile deltas are reported, but not absolute QPS, request
volume per region, or cumulative bytes saved. "Every week
millions of people around the world rely on Issues" is the
only scale anchor; the Issues search
page reports ~2 kQPS for search specifically, which is a
different surface from issues#show.
- The 4.7 % divergence figure is read-time-only. Whether user-visible inconsistency is acceptable is a product judgement; "safeguards to limit user-visible inconsistency" are mentioned but not detailed.
- No service-worker registration/unregistration discipline detail. "if the service worker isn't available, behavior falls back to the standard server-rendered path" — but the post doesn't say how SWs are versioned, how stale SWs are evicted, or what happens during the SW activation race.
- Code-splitting + intent prefetch detail is shallow. "split code by route using React.lazy and dynamic route preloading" and "intent-based prefetching (like hover)" — no bundle-size numbers, no hover-debounce timing, no measurement of how much intent-prefetch traffic ends up unused.
- The "modern UI delivery layer closer to the edge" forward-looking statement is unspecified architecturally. No mention of which CDN, whether SSR is moving to the edge, or how that interacts with the IndexedDB / service-worker substrate.
- No code samples or pseudocode published, unlike the GHES CCR rewrite post which published the auto-follow + bootstrap pseudocode.
- Single-author retrospective post, not a postmortem with named-incident anchoring. The framing is "here's how we rewrote the perf stack and what shifted" rather than "here's an outage and how we recovered."
- Skip eligibility check: post is on the engineering sub-blog (architecture-optimization category), not the changelog or PR sub-blog; explicitly architecture-substantive with named primitives, measured numbers, and documented trade-offs. Tier-2 source, in scope per AGENTS.md *"engineering - security posts are ingested eagerly."*
Source
- Original: https://github.blog/engineering/architecture-optimization/from-latency-to-instant-modernizing-github-issues-navigation-performance/
- Raw markdown:
raw/github/2026-05-14-from-latency-to-instant-modernizing-github-issues-navigation-51fdee1a.md
Related
- companies/github — parent company / source.
- systems/github-issues-show — the issues#show route surface this rewrite targets (system page).
- systems/github-issues — sibling system page (Issues search).
- systems/github-pull-requests — sibling 2026 React-rewrite performance retrospective at the PR-page surface; same philosophy (perceived-latency-first, distribution-quality measurement) at a different sub-product.
- systems/indexeddb — browser-persistent storage primitive used as the SWR cache substrate.
- systems/service-worker — browser network-intercepting worker primitive used to extend local-first to hard navs.
- systems/hotwire-turbo — the Rails-Turbo navigation class that benefits most from the service-worker cache-hint header.
- systems/react — the soft-navigation runtime that holds the in-memory store + IndexedDB read path.
- concepts/highest-priority-content-hpc — GitHub's internal LCP-aligned metric.
- concepts/distribution-quality-vs-p99-tail — the measurement-philosophy shift from minimising p99 to maximising fast/instant bucket share.
- concepts/navigation-mix — the hard / turbo / soft taxonomy GitHub uses as the perf-baselining frame.
- concepts/preheating — cache-population logic (vs freshness-enforcement) at high-fanout source surfaces.
- concepts/stale-while-revalidate-cache — the SWR primitive applied at the browser-cache altitude.
- concepts/cache-hit-rate — the load-bearing metric for the cache layer (33 % → 96 % over the rewrite).
- concepts/local-first-architecture — the framing GitHub borrowed for its client-side rendering model.
- concepts/user-perceived-latency — the user-facing latency shape HPC operationalises.
- concepts/core-web-vitals — Web Vitals (LCP) is the external standard HPC is "closely aligned with."
- concepts/react-hydration — the post-SSR boot cost the hard-navigation cache-hit path trades for client-side rendering.
- patterns/stale-while-revalidate-from-indexeddb — the client-side SWR pattern with browser-persistent storage.
- patterns/service-worker-cache-hint-header — the cache-state-signalled-via-request-header pattern.
- patterns/code-split-by-route-and-intent-prefetch — the bundle-size mitigation paired with the cache architecture.
- patterns/pair-fast-small-cache-with-slow-large-storage — the in-memory-cache-in-front-of-IndexedDB tier choice generalises to this pattern at browser altitude.
- patterns/async-refresh-cache-loader — the structural shape of "render from cache, refresh in the background" applied at the React store / IndexedDB altitude.