Vercel: Making agent-friendly pages with content negotiation
Summary
Vercel's 2026-04-21 engineering post documenting their
production implementation of HTTP markdown
content negotiation across vercel.com/blog and
vercel.com/changelog, plus the parallel introduction of
markdown sitemaps for agent-driven content discovery. The
post is a how-to walkthrough with one canonical
payload-reduction datum and concrete Next.js
implementation snippets.
Mechanism: a next.config.ts
rewrites rule inspects the Accept header on every incoming
request; if it contains text/markdown, the request is routed
via beforeFiles rewrite to a dedicated /md/:path* route
handler. The route handler reads the slug, fetches the post
from the CMS (rich text), converts it to markdown on the
fly (code-block syntax markers preserved, heading hierarchy
preserved, links preserved), and returns the markdown body with
Content-Type: text/markdown. Browsers that send Accept:
text/html, */* skip the rewrite and hit the normal page route.
Canonical payload datum: HTML version ~500 KB → markdown version ~3 KB = 99.37 % reduction in over-the-wire bytes for the same blog post URL. Vercel frames this in context-window terms: "smaller payloads mean they can consume more content per request and spend their budget on actual information instead of markup."
The post also introduces markdown sitemaps as a second
primitive: instead of a flat XML sitemap.xml, serve a
hierarchical *.md sitemap at paths like
/blog/sitemap.md and
/docs/sitemap.md that
gives the agent a structured table of contents with
human-readable titles, ordered by date (for blog) or nested by
parent-child relationship (for docs). A recursive
renderTocItems function preserves section hierarchy with
indented markdown list nesting.
A third mechanism complements the header: a
<link rel="alternate" type="text/markdown" title="LLM-friendly version" href="/llms.txt" /> tag in HTML <head>
for agents that don't send the Accept header. Three layered
discovery mechanisms in total: Accept header → markdown
sitemap enumeration → link rel="alternate" fallback.
Key takeaways
- 99.37 % payload reduction on a real blog post is the one canonical quantitative datum. The HTML version of a single Vercel blog post is ~500 KB; the markdown version of the same URL is ~3 KB. The post frames this as 5× more content per agent context window, but the byte ratio is far larger: if the numbers hold, roughly 160 posts' worth of markdown fits in the bytes of one HTML page. This makes Vercel the second vendor to publish an independent measurement of markdown-content-negotiation payload savings, after Cloudflare's 2026-04-17 "up to 80 % token reduction" claim. The two numbers measure different things (Vercel: server-side over-the-wire bytes on one blog post; Cloudflare: client-side LLM-consumption tokens aggregated across docs) and don't conflict; both support the thesis that HTML is extremely wasteful for LLM consumption. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Content negotiation beats separate .md URLs because "no site-specific knowledge" is required. The post's architectural argument: "This works better than hosting separate .md URLs because content negotiation requires no site-specific knowledge. Any agent that sends the right header gets markdown automatically, from any site that supports it." An agent written for Cloudflare's /index.md URL scheme won't necessarily know to try /md/ for Vercel; an agent that sends Accept: text/markdown gets the right content from both without any per-site configuration. Accept-header discovery composes across sites in a way URL-pattern conventions don't. This is a real architectural argument, not just a convenience framing. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Two-part Next.js implementation: rewrite rule + dedicated route handler. The next.config.ts rewrite rule uses beforeFiles (runs before static files) and matches the Accept header via has with value: "(.*)text/markdown(.*)", rewriting /blog/:path* to /blog/md/:path* (and the same for /changelog). The route handler at the destination (conceptually app/blog/md/[...slug]/route.ts) reads the slug, calls getMarkdownContent(slug) to convert CMS rich text to markdown on request, and returns the body with Content-Type: text/markdown. No static .md files on disk. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Markdown sitemap is a new primitive, distinct from markdown content negotiation. Content negotiation answers "get me this one URL as markdown"; a markdown sitemap answers "what URLs exist on this site, hierarchically, with human-readable titles", giving agents a structured, navigable index. Vercel ships two: a flat by-date list at /blog/sitemap.md and a hierarchical nested list at /docs/sitemap.md with parent-child relationships preserved via indented markdown. The recursive renderTocItems(items, indent) function is the canonical implementation: each item is appended as ${indent}- [${item.title}](/${item.path}) plus a newline, and children recurse with two extra spaces of indent. This contrasts with XML sitemaps, which are flat URL lists with no titles, no hierarchy, and no semantic grouping. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Three-layer discovery stack: Accept header → markdown sitemap → link rel="alternate" tag. Vercel's full agent-friendly-page architecture supports three independent discovery paths: (1) the primary Accept: text/markdown header, which content-negotiates any URL; (2) the markdown sitemap, which agents can fetch to enumerate content; (3) the <link rel="alternate" type="text/markdown" href="/llms.txt"> tag in HTML <head>, which lets an agent that fetched HTML discover the markdown alternative. Each covers a different agent capability gap: (1) requires the agent to know the header; (2) requires the agent to fetch the sitemap first; (3) requires the agent to parse the HTML <head> before giving up on HTML. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Sync across HTML and markdown via Next.js 16 remote cache + shared slugs. Operational caveat: the HTML version and markdown version of the same URL are generated from the same CMS source at request time and cached via the Next.js 16 use cache remote cache, keyed by shared slug, so when CMS content updates, both variants invalidate together. This sidesteps the rendering-parity failure mode flagged on concepts/markdown-content-negotiation ("if the markdown is a stripped-down pre-render, content can diverge from the HTML version over time unless the server renders both from one source of truth"). Vercel's implementation renders both from one source of truth. The post doesn't mention a Vary: Accept cache-variant header; because the rewrite routes to a different URL (/md/*) under the hood, cache keys are presumably distinct, though the post does not confirm this. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Rich-text-to-markdown conversion preserves structure: code syntax highlighting, heading hierarchy, and links. The post highlights three specific structural preservations in the on-the-fly conversion: code blocks retain their language marker (for syntax highlighting in the agent's renderer); headings retain their hierarchy (h1 / h2 / h3); links remain functional. "The agent receives the same information as the HTML version, just in a format optimized for token efficiency." This is a subtle but load-bearing claim: a naïve HTML-to-markdown strip would lose code-language hints and collapse heading levels; the preservation here is deliberate. For CMS rich text as source, the conversion is a one-pass AST walk from CMS block types to markdown equivalents, not HTML-regex stripping. Sites whose content is already authored in markdown skip this conversion entirely: "if your content is already authored in markdown, you can serve it directly without a conversion step." (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Markdown sitemap route handler is statically generated, not rendered per request. The route handler at /blog/sitemap.md calls getAllBlogPosts(), sorts by date descending, and renders a markdown list. Because Vercel sets export const dynamic = 'force-static';, the handler's output is built statically rather than regenerated per request, with revalidation tied to the CMS-update pipeline (not disclosed in detail). For docs, the recursive createTableOfContents('content') builds the tree at build time from the content directory layout. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Content negotiation is standard RFC 9110 applied to text/markdown. "Content negotiation ... is a standard HTTP mechanism where the client specifies its preferred format via the Accept header, and the server returns the matching representation." The post does not invent a new mechanism; it applies a roughly 30-year-old HTTP mechanism (now specified in RFC 9110) to a newer MIME type. text/markdown is a registered IANA MIME type (RFC 7763, published 2016). The agent-adoption dimension is new; the protocol machinery is not. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- "Many agents already send Accept: text/markdown" is the post's implicit empirical claim, though no agent list is provided. Cross-reference: Checkly's February 2026 State of AI Agent Content Negotiation research tested 7 agents and found 3 (Claude Code, OpenCode, Cursor) send the header by default. Vercel's post doesn't cite that data, but the "many agents" framing is defensible on the Checkly evidence. The link rel="alternate" fallback pattern is the hedge against the fact that most agents don't send the header yet. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
- Vercel's implementation covers /blog and /changelog, not the whole site. Scope of the rewrite rules in the posted next.config.ts snippet: beforeFiles: [markdownRewrite('/blog'), markdownRewrite('/changelog')], two prefixes only. Other vercel.com paths (e.g., /pricing, /templates, marketing pages) are not covered by this scheme. vercel.com/docs is served by a separate subsystem that the post references via its markdown sitemap but does not describe the rewrite for. Limited per-section rollout is an architectural choice, not an oversight. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
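The "no site-specific knowledge" argument above can be made concrete from the agent's side. The following is a minimal sketch, assuming an agent that prefers markdown but still accepts HTML; the helper names (buildAcceptHeader, isMarkdownResponse, fetchPreferMarkdown) and the q-values are illustrative and do not come from the post:

```typescript
// Hypothetical agent-side helpers; names and q-values are illustrative assumptions.

// Prefer markdown, but tell the server that HTML is an acceptable fallback.
function buildAcceptHeader(): string {
  return 'text/markdown, text/html;q=0.9, */*;q=0.1';
}

// True when a response's Content-Type says the server actually returned markdown.
function isMarkdownResponse(contentType: string | null): boolean {
  if (contentType === null) return false;
  // Drop parameters such as "; charset=utf-8" before comparing.
  const [mediaType = ''] = contentType.split(';');
  return mediaType.trim().toLowerCase() === 'text/markdown';
}

// The same call shape works for any site, with no per-site URL pattern.
async function fetchPreferMarkdown(
  url: string
): Promise<{ body: string; markdown: boolean }> {
  const response = await fetch(url, {
    headers: { Accept: buildAcceptHeader() },
  });
  return {
    body: await response.text(),
    markdown: isMarkdownResponse(response.headers.get('content-type')),
  };
}
```

A site that ignores the header simply returns HTML, and the markdown flag tells the agent which representation it received.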
Architectural mechanism (Next.js implementation)
Two components, both in standard Next.js primitives:
1. next.config.ts rewrite rule (paraphrased from the
posted snippet, which is slightly mangled by the raw
extraction):
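```ts
import type { NextConfig } from 'next';

// Build one rewrite rule per content prefix (e.g. '/blog').
function markdownRewrite(prefix: string) {
  return {
    source: `${prefix}/:path*`,
    // Only apply when the Accept header mentions text/markdown.
    // `as const` keeps the literal type 'header' instead of widening it to string.
    has: [
      { type: 'header' as const, key: 'accept', value: '(.*)text/markdown(.*)' },
    ],
    destination: `${prefix}/md/:path*`,
  };
}

const nextConfig: NextConfig = {
  async rewrites() {
    return {
      // beforeFiles rewrites are evaluated before static files and page routes.
      beforeFiles: [markdownRewrite('/blog'), markdownRewrite('/changelog')],
    };
  },
};

export default nextConfig;
```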
Every incoming request under /blog or /changelog is checked against
this rule. If its Accept header contains text/markdown (regex match),
the URL is rewritten to the /md/ subtree; otherwise, the rule does
nothing and Next.js routes the request normally.
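To see which Accept values the has condition lets through, the match can be reproduced outside Next.js. This is a minimal sketch, assuming the pattern has to match the entire header value (which would explain the (.*) wrappers in the snippet); matchesMarkdownRewrite is a hypothetical name, not a Next.js API:

```typescript
// Same pattern as the rewrite rule, anchored on the assumption that the
// whole header value must match.
const MARKDOWN_ACCEPT = new RegExp('^(.*)text/markdown(.*)$');

// Returns true when a request with this Accept header would be rewritten.
function matchesMarkdownRewrite(acceptHeader: string | null): boolean {
  return acceptHeader !== null && MARKDOWN_ACCEPT.test(acceptHeader);
}

// An agent that asks for markdown first is rewritten.
const agentRequest = matchesMarkdownRewrite('text/markdown, text/html;q=0.9');
// A typical browser Accept header is not.
const browserRequest = matchesMarkdownRewrite(
  'text/html,application/xhtml+xml,*/*;q=0.8'
);
```

Because this is a substring-style match, a header such as text/markdown;q=0 (which in HTTP terms marks markdown as not acceptable) would still trigger the rewrite. The post does not discuss q-values.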
2. Dedicated route handler at app/blog/md/[...slug]/route.ts:
```ts
import { notFound } from 'next/navigation';
import { getMarkdownContent } from '@/lib/content';

export async function GET(
  request: Request,
  { params }: { params: Promise<{ slug?: string[] }> }
) {
  // Join the catch-all segments back into a slug; fall back to 'index' at the root.
  const { slug } = await params;
  // await is required if getMarkdownContent returns a Promise and harmless if it does not.
  const content = await getMarkdownContent(slug?.join('/') ?? 'index');
  if (!content) notFound();

  return new Response(content, {
    headers: {
      'Content-Type': 'text/markdown',
    },
  });
}
```
getMarkdownContent() does the CMS rich-text → markdown
conversion on the fly. No static .md files on disk; no
per-post build step; no duplicate content source.
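getMarkdownContent() itself is not reproduced here. As a rough illustration of a one-pass walk from CMS block types to markdown, the sketch below uses a made-up block schema; the Block and Inline types and the blocksToMarkdown name are assumptions, since the actual CMS format is not described:

```typescript
// Hypothetical rich-text schema; the real CMS block types are not described in the post.
type Inline =
  | { kind: 'text'; value: string }
  | { kind: 'link'; text: string; href: string };

type Block =
  | { kind: 'heading'; level: 1 | 2 | 3; children: Inline[] }
  | { kind: 'paragraph'; children: Inline[] }
  | { kind: 'code'; language: string; source: string };

// Three backticks, built indirectly so this example can itself sit inside a fence.
const FENCE = '`'.repeat(3);

// Links keep their target; plain text passes through unchanged.
function renderInline(nodes: Inline[]): string {
  return nodes
    .map((node) =>
      node.kind === 'link' ? `[${node.text}](${node.href})` : node.value
    )
    .join('');
}

// One pass over the blocks: each block type maps to its markdown equivalent.
function blocksToMarkdown(blocks: Block[]): string {
  return blocks
    .map((block): string => {
      switch (block.kind) {
        case 'heading':
          // Heading level is preserved as the number of leading '#'.
          return `${'#'.repeat(block.level)} ${renderInline(block.children)}`;
        case 'paragraph':
          return renderInline(block.children);
        case 'code':
          // The language marker is kept on the opening fence.
          return [FENCE + block.language, block.source, FENCE].join('\n');
      }
    })
    .join('\n\n');
}
```

Working from typed blocks rather than rendered HTML is what makes it straightforward to keep the three properties the post calls out: code language markers, heading levels, and link targets.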
Markdown sitemap implementation
Blog (flat, date-sorted):
```ts
import { getAllBlogPosts } from '@/app/content';

// Build this route's output statically instead of on every request.
export const dynamic = 'force-static';

export async function GET() {
  const posts = await getAllBlogPosts();
  const lines = posts
    // Newest posts first.
    .sort((a, b) => new Date(b.date).getTime() - new Date(a.date).getTime())
    .map((post) => `- [${post.title}](/blog/${post.slug}.md)`);
  const sitemap = `# Blog sitemap\n\n${lines.join('\n')}`;

  return new Response(sitemap, {
    headers: { 'Content-Type': 'text/markdown' },
  });
}
```
Docs (recursive, hierarchy-preserving):
```ts
import { createTableOfContents, type TocItem } from '@lib/content';

// Render each item as a markdown bullet; children are indented by two more spaces.
function renderTocItems(items: TocItem[], indent = ''): string {
  let sitemap = '';
  for (const item of items) {
    sitemap += `${indent}- [${item.title}](/${item.path})\n`;
    if (item.children) {
      sitemap += renderTocItems(item.children, `${indent}  `);
    }
  }
  return sitemap;
}

export async function GET() {
  const tableOfContents = createTableOfContents('content');
  const sitemap = `# Documentation sitemap\n\n${renderTocItems(tableOfContents)}`;
  return new Response(sitemap, { headers: { 'Content-Type': 'text/markdown' } });
}
```
Two-space indent per recursion level preserves the parent-child relationship in the markdown output; agents that render markdown see a nested bullet list that mirrors the docs tree.
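As a usage illustration of the recursive renderer, here is a self-contained run over a small sample tree. The TocItem shape (title, path, optional children) is an assumption inferred from how the renderer uses it, and the sample paths are invented:

```typescript
// Assumed shape; the real TocItem from the content library may carry more fields.
type TocItem = { title: string; path: string; children?: TocItem[] };

// Same logic as the docs sitemap renderer: two extra spaces of indent per level.
function renderTocItems(items: TocItem[], indent = ''): string {
  let sitemap = '';
  for (const item of items) {
    sitemap += `${indent}- [${item.title}](/${item.path})\n`;
    if (item.children) {
      sitemap += renderTocItems(item.children, `${indent}  `);
    }
  }
  return sitemap;
}

// Invented sample tree: one section with a child page, plus a top-level page.
const sampleTree: TocItem[] = [
  {
    title: 'Getting started',
    path: 'docs/getting-started',
    children: [{ title: 'Install', path: 'docs/getting-started/install' }],
  },
  { title: 'CLI', path: 'docs/cli' },
];

const renderedToc = renderTocItems(sampleTree);
// renderedToc contains:
// - [Getting started](/docs/getting-started)
//   - [Install](/docs/getting-started/install)
// - [CLI](/docs/cli)
```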
Three-layer agent discovery stack
| Layer | Mechanism | Prerequisites on agent |
|---|---|---|
| 1 | Accept: text/markdown request header | Must know to send the header |
| 2 | Markdown sitemap at /<section>/sitemap.md | Must fetch sitemap and parse markdown |
| 3 | <link rel="alternate" type="text/markdown" href="/llms.txt"> in HTML <head> | Must parse HTML <head> before processing body |
Each layer covers a different agent-implementation gap. The
link rel="alternate" tag is the safety net for HTML-fetching
agents that ignore both the header and the sitemap.
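Layer 3 can be sketched from the agent's side as a small scan over fetched HTML. This is an illustrative helper only: findMarkdownAlternate is a hypothetical name, and the regex-based attribute parsing assumes double-quoted attributes, which a real agent would likely replace with a proper HTML parser:

```typescript
// Hypothetical agent-side helper: returns the href of the first
// <link rel="alternate" type="text/markdown"> tag, or null if none is present.
function findMarkdownAlternate(html: string): string | null {
  const linkTag = /<link\b[^>]*>/gi;
  let tagMatch: RegExpExecArray | null;

  while ((tagMatch = linkTag.exec(html)) !== null) {
    // Collect the tag's double-quoted attributes into a lowercase-keyed map.
    const attrs: Record<string, string> = {};
    const attribute = /([a-zA-Z-]+)\s*=\s*"([^"]*)"/g;
    let attrMatch: RegExpExecArray | null;
    while ((attrMatch = attribute.exec(tagMatch[0])) !== null) {
      const name = (attrMatch[1] || '').toLowerCase();
      attrs[name] = attrMatch[2] || '';
    }

    const rel = (attrs['rel'] || '').toLowerCase();
    const type = (attrs['type'] || '').toLowerCase();
    if (rel === 'alternate' && type === 'text/markdown') {
      return attrs['href'] ?? null;
    }
  }
  return null;
}
```

An agent that received HTML could call this on the response body and, if it gets a path back, fetch that path instead of parsing the full page.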
Operational numbers
- ~500 KB — HTML size of one representative Vercel blog post.
- ~3 KB — markdown size of the same post.
- 99.37 % — payload reduction.
- ~160× — implied bytes-per-bytes ratio. HTML is two orders of magnitude larger than markdown for the same content.
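The relationship between these figures is simple arithmetic, sketched below. The function names are illustrative, and the inputs are the post's rounded sizes, so the results only approximately reproduce the quoted 99.37 %:

```typescript
// Percentage of bytes saved when serving markdown instead of HTML.
function reductionPercent(htmlSize: number, markdownSize: number): number {
  return (1 - markdownSize / htmlSize) * 100;
}

// How many times larger the HTML payload is than the markdown payload.
function sizeRatio(htmlSize: number, markdownSize: number): number {
  return htmlSize / markdownSize;
}

// Size ratio implied by a stated reduction percentage.
function ratioFromReduction(percent: number): number {
  return 1 / (1 - percent / 100);
}

// Rounded sizes from the post, in KB (the unit cancels out).
const fromRoundedSizes = reductionPercent(500, 3); // about 99.4
const ratioFromSizes = sizeRatio(500, 3); // about 167
const ratioFromQuoted = ratioFromReduction(99.37); // about 159
```

With the rounded sizes the reduction comes out near 99.4 % and the ratio near 167×. The quoted 99.37 % corresponds to a ratio of about 159×, consistent with the ~160× figure, and implies a markdown size slightly above 3 KB.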
Caveats
- One quantitative datum. The 500 KB → 3 KB measurement is stated for "the HTML version of this page" (where "this page" is the blog post itself). No distribution across the full blog corpus; no p50/p95 payload-size numbers. The 99.37 % figure may not generalise: blog posts with many images, many embedded snippets, or sparse text would have different ratios. The 500 KB baseline is also unusually large for what is essentially text content; most of the HTML size is likely inline bundled JavaScript, hydration state, and CSS, not literal blog content.
- No agent-adoption data on the Vercel blog specifically. The post doesn't say "X % of traffic now sends Accept: text/markdown" or "Y agents are benefitting". No post-deployment adoption measurement.
- No cache-key detail. Vary: Accept is not mentioned; Vercel appears to sidestep the problem by rewriting to a different URL under the hood (/md/*) so that cache keys are naturally distinct, but the post doesn't explicitly confirm this design decision.
- Narrow coverage. Only /blog and /changelog are covered by the rewrite snippet shown. /docs has markdown sitemaps but the rewrite mechanism for individual docs pages is not described. /pricing, /templates, and other marketing pages are not covered at all.
- Rich-text-to-markdown conversion correctness not discussed. Tables with merged cells, footnotes, custom CMS block types, code blocks with line-number anchors: none of the hard edge cases are discussed. "The conversion preserves structure" is a one-line claim, not a case analysis.
- No robots.txt / llms.txt integration detail. The link rel="alternate" tag points at /llms.txt in the example but the post doesn't show the /llms.txt content or its relationship to the markdown sitemap.
- Product-voice framing. The post closes by pointing readers to "see the knowledge base guide for an implementation reference", a soft marketing CTA. The body is genuinely architectural, but the intent is also to position Vercel as the right platform for agent-ready content.
Source
- Original: https://vercel.com/blog/making-agent-friendly-pages-with-content-negotiation
- Raw markdown: raw/vercel/2026-04-21-making-agent-friendly-pages-with-content-negotiation-951519f2.md
Related
- companies/vercel
- systems/nextjs: the framework where the rewrite rule and route handler live.
- systems/vercel-edge-functions: the Vercel platform layer where this would run in practice for vercel.com.
- concepts/markdown-content-negotiation: the primary primitive this post implements. Vercel is the second vendor, after Cloudflare, with an independent instance.
- concepts/markdown-sitemap: new canonical concept from this post.
- concepts/sitemap: the XML-sitemap parent concept that markdown sitemaps contrast with.
- concepts/llms-txt: the complementary curation primitive; link rel="alternate" in this post points at /llms.txt.
- concepts/agent-readiness-score: external grading surface that rewards markdown content negotiation as a Content-for-LLMs checkbox.
- patterns/accept-header-rewrite-to-markdown-route: canonical wiki pattern from this post's Next.js implementation.
- patterns/link-rel-alternate-markdown-discovery: canonical wiki pattern for the third discovery layer.
- patterns/dynamic-index-md-fallback: Cloudflare's URL-based alternative to the Accept-header approach; contrast-of-implementations for the same underlying concept.
- patterns/hidden-agent-directive-in-html: a fourth adoption technique not described in this post but complementary.