VERCEL 2026-04-21 Tier 3

Vercel — Making agent-friendly pages with content negotiation

Summary

Vercel's 2026-04-21 engineering post documents their production implementation of HTTP markdown content negotiation across vercel.com/blog and vercel.com/changelog, plus the parallel introduction of markdown sitemaps for agent-driven content discovery. The post is a how-to walkthrough with one canonical payload-reduction datum and concrete Next.js implementation snippets.

Mechanism: a next.config.ts rewrites rule inspects the Accept header on every incoming request; if it contains text/markdown, the request is routed via beforeFiles rewrite to a dedicated /md/:path* route handler. The route handler reads the slug, fetches the post from the CMS (rich text), converts it to markdown on the fly (code-block syntax markers preserved, heading hierarchy preserved, links preserved), and returns the markdown body with Content-Type: text/markdown. Browsers that send Accept: text/html, */* skip the rewrite and hit the normal page route.

Canonical payload datum: HTML version ~500 KB → markdown version ~3 KB = 99.37 % reduction in over-the-wire bytes for the same blog post URL. Vercel frames this in context-window terms: "smaller payloads mean they can consume more content per request and spend their budget on actual information instead of markup."

The post also introduces markdown sitemaps as a second primitive: instead of a flat XML sitemap.xml, serve a hierarchical *.md sitemap at paths like /blog/sitemap.md and /docs/sitemap.md that gives the agent a structured table of contents with human-readable titles, ordered by date (for blog) or nested by parent-child relationship (for docs). A recursive renderTocItems function preserves section hierarchy with indented markdown list nesting.

A third mechanism complements the header: a <link rel="alternate" type="text/markdown" title="LLM-friendly version" href="/llms.txt" /> tag in HTML <head> for agents that don't send the Accept header. Three layered discovery mechanisms in total: Accept header → markdown sitemap enumeration → link rel="alternate" fallback.

Key takeaways

  • 99.37 % payload reduction on a real blog post is the one canonical quantitative datum. The HTML version of a single Vercel blog post is ~500 KB; the markdown version of the same URL is ~3 KB. The post frames this as 5× more content per agent context window, but the relationship is orders of magnitude stronger: if the numbers hold, the bytes spent on one HTML fetch would cover roughly 160 markdown fetches of equivalent content. This is the second independent vendor measurement of markdown-content-negotiation payload savings, after Cloudflare's 2026-04-17 "up to 80 % token reduction" claim; the two numbers measure different things (Vercel = server-side over-the-wire bytes on one blog post; Cloudflare = client-side LLM-consumption tokens aggregated across docs) and don't conflict. Both support the thesis that HTML is extremely wasteful for LLM consumption. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Content-negotiation beats separate .md URLs because "no site-specific knowledge" is required. The post's architectural argument: "This works better than hosting separate .md URLs because content negotiation requires no site-specific knowledge. Any agent that sends the right header gets markdown automatically, from any site that supports it." An agent written for Cloudflare's /index.md URL scheme won't necessarily know to try /md/ for Vercel; an agent that sends Accept: text/markdown gets the right content from both without any per-site configuration. The Accept-header discovery composes across sites in a way URL-pattern conventions don't. This is a real architectural argument — not just a convenience framing. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Two-part Next.js implementation: rewrite rule + dedicated route handler. The next.config.ts rewrite rule uses beforeFiles (runs before static files) and matches the Accept header via has with value: "(.*)text/markdown(.*)", rewriting /blog/:path* to /blog/md/:path* (and the same for /changelog). The route handler at the destination (conceptually app/blog/md/[...slug]/route.ts) reads the slug, calls getMarkdownContent(slug) to convert CMS rich text to markdown on request, and returns the body with Content-Type: text/markdown. No static .md files on disk. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Markdown sitemap is a new primitive, distinct from markdown content negotiation. Content negotiation answers "get me this one URL as markdown"; a markdown sitemap answers "what URLs exist on this site, hierarchically, with human-readable titles", giving agents a structured, navigable index. Vercel ships two: a flat by-date list at /blog/sitemap.md and a hierarchical nested list at /docs/sitemap.md with parent-child relationships preserved via indented markdown. The recursive renderTocItems(items, indent) function is the canonical implementation: sitemap += `${indent}- [${item.title}](/${item.path})\n`, with `${indent}  ` (two additional spaces) passed to each recursive call. This contrasts with XML sitemaps, which are flat URL lists with no titles, no hierarchy, and no semantic grouping. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Three-layer discovery stack: Accept header → markdown sitemap → link rel="alternate" tag. Vercel's full agent-friendly-page architecture supports three independent discovery paths: (1) the primary Accept: text/markdown header, which content-negotiates any URL; (2) the markdown sitemap, which agents can fetch to enumerate content; (3) the <link rel="alternate" type="text/markdown" href="/llms.txt"> tag in HTML <head>, which lets an agent that fetched HTML discover the markdown alternative. Each covers a different agent capability gap: (1) requires the agent to know the header; (2) requires the agent to fetch the sitemap first; (3) requires the agent to parse the HTML <head> before giving up on HTML. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Sync across HTML and markdown via Next.js 16 remote cache + shared slugs. Operational caveat: the HTML version and markdown version of the same URL are generated from the same CMS source and cached via the Next.js 16 use cache remote cache, keyed by the shared slug, so when CMS content updates, both variants invalidate together. This sidesteps the rendering-parity failure mode flagged on concepts/markdown-content-negotiation ("if the markdown is a stripped-down pre-render, content can diverge from the HTML version over time unless the server renders both from one source of truth"). Vercel's implementation renders both from one source of truth. The post doesn't mention a Vary: Accept cache-variant header. A plausible reading is that, because the rewrite routes to a different URL (/md/*) under the hood, cache keys are naturally distinct, but the post does not confirm this. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Rich-text-to-markdown conversion preserves structure: code syntax highlighting, heading hierarchy, and links. The post highlights three specific structural preservations in the on-the-fly conversion: code blocks retain their language marker (for syntax highlighting in the agent's renderer); headings retain their hierarchy (h1 / h2 / h3); links remain functional. "The agent receives the same information as the HTML version, just in a format optimized for token efficiency." This is a subtle but load-bearing claim: a naïve HTML-to-markdown strip would lose code-language hints and collapse heading levels; the preservation here is deliberate. For CMS rich text as source, the conversion is a one-pass AST walk from CMS block types to markdown equivalents — not HTML-regex stripping. Sites whose content is already authored in markdown skip this conversion entirely: "if your content is already authored in markdown, you can serve it directly without a conversion step." (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Markdown sitemap route handler is generated by code but served statically. The route handler at /blog/sitemap.md calls getAllBlogPosts(), sorts by date descending, and renders a markdown list. Because Vercel sets export const dynamic = 'force-static';, the handler's output is statically built rather than regenerated per request, with revalidation tied to the CMS-update pipeline (not disclosed in detail). For docs, the recursive createTableOfContents('content') builds the tree at build time from the content directory layout. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Content negotiation is standard HTTP (RFC 9110) applied to text/markdown. "Content negotiation ... is a standard HTTP mechanism where the client specifies its preferred format via the Accept header, and the server returns the matching representation." The post does not invent a new mechanism; it applies a roughly 30-year-old HTTP mechanism, currently specified in RFC 9110, to a newer MIME type. text/markdown is a registered IANA media type (RFC 7763, published 2016). The agent-adoption dimension is new; the protocol machinery is not. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • "Many agents already send Accept: text/markdown" — the post's implicit empirical claim, though no agent list is provided. Cross-references: Checkly's February 2026 State of AI Agent Content Negotiation research tested 7 agents and found 3 (Claude Code, OpenCode, Cursor) send the header by default. Vercel's post doesn't cite that data, but the "many agents" framing is defensible on the Checkly evidence. The link rel=alternate fallback pattern is the hedge against the fact that most agents don't send the header yet. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)

  • Vercel's implementation covers /blog and /changelog, not the whole site. Scope of the rewrite rules in the posted next.config.ts snippet: beforeFiles: [markdownRewrite('/blog'), markdownRewrite('/changelog')] — two prefixes only. Other vercel.com paths (e.g., /pricing, /templates, marketing pages) are not covered by this scheme. vercel.com/docs is served by a separate subsystem that the post references via its markdown sitemap but does not describe the rewrite for. Limited per-section rollout is an architectural choice, not an oversight. (Source: sources/2026-04-21-vercel-making-agent-friendly-pages-with-content-negotiation)
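
The RFC 9110 point above can be made concrete. The rewrite rule described in this summary does a pattern match on the Accept header, whereas RFC 9110 also defines q-values that rank media ranges. The following is a minimal sketch of q-value-aware negotiation, not something the Vercel post shows; parseAccept, qualityFor, and prefersMarkdown are hypothetical names, and the policy of requiring an explicit text/markdown range is an assumption made for illustration.

```typescript
// Hypothetical helpers, not taken from the Vercel post.
interface MediaRange {
  type: string; // e.g. "text/markdown", "text/*", "*/*"
  quality: number; // RFC 9110 "q" parameter; defaults to 1 when absent
}

function parseAccept(header: string): MediaRange[] {
  const ranges: MediaRange[] = [];
  for (const part of header.split(',')) {
    const pieces = part.split(';').map((s) => s.trim());
    const type = (pieces[0] ?? '').toLowerCase();
    if (type === '') continue;
    let quality = 1;
    for (const param of pieces.slice(1)) {
      const [key, value] = param.split('=').map((s) => s.trim());
      if ((key ?? '').toLowerCase() !== 'q') continue;
      const parsed = Number.parseFloat(value ?? '');
      // Clamp to the 0..1 range; ignore unparseable values.
      if (!Number.isNaN(parsed)) quality = Math.min(1, Math.max(0, parsed));
    }
    ranges.push({ type, quality });
  }
  return ranges;
}

// Most specific matching range wins: exact type, then "major/*", then "*/*".
function qualityFor(ranges: MediaRange[], mediaType: string): number {
  const major = mediaType.split('/')[0] ?? '';
  for (const candidate of [mediaType, `${major}/*`, '*/*']) {
    const match = ranges.find((r) => r.type === candidate);
    if (match) return match.quality;
  }
  return 0;
}

// Assumed policy: serve markdown only when text/markdown is listed explicitly
// and ranked at least as high as text/html.
function prefersMarkdown(accept: string | null): boolean {
  if (!accept) return false;
  const ranges = parseAccept(accept);
  const explicit = ranges.find((r) => r.type === 'text/markdown');
  if (!explicit || explicit.quality === 0) return false;
  return explicit.quality >= qualityFor(ranges, 'text/html');
}
```

Under this policy, a header such as text/html, text/markdown;q=0.5 keeps HTML, whereas a plain "contains text/markdown" match would switch to markdown. Whether that stricter behaviour is desirable for agents is a design choice the post does not discuss.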

Architectural mechanism (Next.js implementation)

Two components, both in standard Next.js primitives:

1. next.config.ts rewrite rule (paraphrased from the posted snippet, which is slightly mangled by the raw extraction):

import type { NextConfig } from 'next';

function markdownRewrite(prefix: string) {
  return {
    source: `${prefix}/:path*`,
    has: [
      { type: 'header', key: 'accept', value: '(.*)text/markdown(.*)' },
    ],
    destination: `${prefix}/md/:path*`,
  };
}

const nextConfig: NextConfig = {
  async rewrites() {
    return {
      beforeFiles: [markdownRewrite('/blog'), markdownRewrite('/changelog')],
    };
  },
};

export default nextConfig;

Every incoming request hits this rule. If its Accept header contains text/markdown (regex match), the URL is rewritten to the /md/ subtree; otherwise, the rule does nothing and Next.js routes the request normally.
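
To make the routing decision easier to reason about, here is a small simulation of what the rule above does to a request path. It is a sketch only: Next.js performs the real matching, resolvePath and the constants are hypothetical names, and the sketch assumes the has value is treated as a pattern over the whole header value.

```typescript
// Hypothetical simulation of the rewrite rule; not Next.js internals.
const MARKDOWN_PREFIXES = ['/blog', '/changelog'];
const ACCEPT_PATTERN = /^(.*)text\/markdown(.*)$/;

function resolvePath(pathname: string, accept: string | null): string {
  // No markdown in the Accept header: the rule does nothing.
  if (accept === null || !ACCEPT_PATTERN.test(accept)) return pathname;
  for (const prefix of MARKDOWN_PREFIXES) {
    // Match the prefix itself or anything below it, but not "/blogroll".
    if (pathname === prefix || pathname.startsWith(`${prefix}/`)) {
      const rest = pathname.slice(prefix.length); // "" or "/some/slug"
      return `${prefix}/md${rest}`;
    }
  }
  return pathname; // Paths outside the covered prefixes are untouched.
}

console.log(resolvePath('/blog/my-post', 'text/markdown, text/html;q=0.9'));
console.log(resolvePath('/blog/my-post', 'text/html, */*'));
console.log(resolvePath('/pricing', 'text/markdown'));
```

The three calls print /blog/md/my-post, /blog/my-post, and /pricing respectively, which mirrors the scope limitation noted in the takeaways: only the listed prefixes are negotiated.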

2. Dedicated route handler at app/blog/md/[...slug]/route.ts:

import { notFound } from 'next/navigation';
import { getMarkdownContent } from '@/lib/content';

export async function GET(
  request: Request,
  { params }: { params: Promise<{ slug?: string[] }> }
) {
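  const { slug } = await params;
  // Await the lookup in case the CMS fetch is asynchronous (awaiting a plain value is harmless).
  const content = await getMarkdownContent(slug?.join('/') ?? 'index');
  // Unknown slug: respond with 404 instead of an empty markdown body.
  if (!content) notFound();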
  return new Response(content, {
    headers: {
      'Content-Type': 'text/markdown',
    },
  });
}

getMarkdownContent() does the CMS rich-text → markdown conversion on the fly. No static .md files on disk; no per-post build step; no duplicate content source.
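
The post does not show the body of getMarkdownContent(). As an illustration of the kind of one-pass walk from CMS block types to markdown described in the takeaways, here is a minimal sketch. The Block and Inline types are a hypothetical CMS schema invented for this example, not Vercel's, and only the three preserved structures (heading levels, links, code-block language markers) are modelled.

```typescript
// Hypothetical CMS rich-text model; Vercel's actual schema is not disclosed.
type Inline =
  | { kind: 'text'; value: string }
  | { kind: 'link'; href: string; children: Inline[] };

type Block =
  | { kind: 'heading'; level: 1 | 2 | 3; children: Inline[] }
  | { kind: 'paragraph'; children: Inline[] }
  | { kind: 'code'; language: string; source: string };

// Built at runtime so this example does not contain a literal fence line.
const FENCE = '`'.repeat(3);

function renderInline(nodes: Inline[]): string {
  return nodes
    .map((node) =>
      node.kind === 'text'
        ? node.value
        : `[${renderInline(node.children)}](${node.href})`
    )
    .join('');
}

function renderBlock(block: Block): string {
  switch (block.kind) {
    case 'heading':
      // Heading depth maps directly to the number of "#" characters.
      return `${'#'.repeat(block.level)} ${renderInline(block.children)}`;
    case 'paragraph':
      return renderInline(block.children);
    case 'code':
      // The language marker is kept on the opening fence.
      return `${FENCE}${block.language}\n${block.source}\n${FENCE}`;
  }
}

function richTextToMarkdown(blocks: Block[]): string {
  return blocks.map(renderBlock).join('\n\n') + '\n';
}
```

A real converter would also need to handle lists, images, tables, and character escaping, which this sketch leaves out.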

Markdown sitemap implementation

Blog (flat, date-sorted):

import { getAllBlogPosts } from '@/app/content';

export const dynamic = 'force-static';

export async function GET() {
  const posts = await getAllBlogPosts();
  const lines = posts
    .sort((a, b) => new Date(b.date).getTime() - new Date(a.date).getTime())
    .map((post) => `- [${post.title}](/blog/${post.slug}.md)`);
  const sitemap = `# Blog sitemap\n\n${lines.join('\n')}`;
  return new Response(sitemap, {
    headers: { 'Content-Type': 'text/markdown' },
  });
}

Docs (recursive, hierarchy-preserving):

import { createTableOfContents, type TocItem } from '@/lib/content';

function renderTocItems(items: TocItem[], indent = '') {
  let sitemap = '';
  for (const item of items) {
    sitemap += `${indent}- [${item.title}](/${item.path})\n`;
    if (item.children) {
      sitemap += renderTocItems(item.children, `${indent}  `);
    }
  }
  return sitemap;
}

export async function GET() {
  const tableOfContents = createTableOfContents('content');
  const sitemap = `# Documentation sitemap\n\n${renderTocItems(tableOfContents)}`;
  return new Response(sitemap, { headers: { 'Content-Type': 'text/markdown' } });
}

Two-space indent per recursion level preserves the parent-child relationship in the markdown output; agents that render markdown see a nested bullet list that mirrors the docs tree.
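
To see the nesting concretely, the recursive renderer can be run against a small sample tree. This is a self-contained sketch: the TocItem shape (title, path, optional children) and the sample entries are assumptions for illustration, since the post does not show the type definition or real docs paths.

```typescript
// Assumed shape of a table-of-contents entry; not shown in the post.
interface TocItem {
  title: string;
  path: string;
  children?: TocItem[];
}

function renderTocItems(items: TocItem[], indent = ''): string {
  let sitemap = '';
  for (const item of items) {
    sitemap += `${indent}- [${item.title}](/${item.path})\n`;
    if (item.children) {
      // Two extra spaces per level produce a nested markdown list.
      sitemap += renderTocItems(item.children, `${indent}  `);
    }
  }
  return sitemap;
}

// Hypothetical sample data.
const sampleToc: TocItem[] = [
  {
    title: 'Getting started',
    path: 'docs/getting-started',
    children: [
      { title: 'Installation', path: 'docs/getting-started/installation' },
    ],
  },
  { title: 'Deployments', path: 'docs/deployments' },
];

const docsSitemap = `# Documentation sitemap\n\n${renderTocItems(sampleToc)}`;
console.log(docsSitemap);
// # Documentation sitemap
//
// - [Getting started](/docs/getting-started)
//   - [Installation](/docs/getting-started/installation)
// - [Deployments](/docs/deployments)
```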

Three-layer agent discovery stack

| Layer | Mechanism | Prerequisites on agent |
|-------|-----------|------------------------|
| 1 | Accept: text/markdown request header | Must know to send the header |
| 2 | Markdown sitemap at /<section>/sitemap.md | Must fetch sitemap and parse markdown |
| 3 | <link rel="alternate" type="text/markdown" href="/llms.txt"> in HTML <head> | Must parse HTML <head> before processing body |

Each layer covers a different agent-implementation gap. The link rel="alternate" tag is the safety net for HTML-fetching agents that ignore both the header and the sitemap.
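
From the agent's side, layer 3 amounts to scanning the fetched HTML for the alternate link. The sketch below is one hedged way to do that with regular expressions; findMarkdownAlternate is a hypothetical helper, it only handles double-quoted attributes, and a production agent would more likely use a proper HTML parser.

```typescript
// Hypothetical agent-side helper; not from the Vercel post.
function findMarkdownAlternate(html: string): string | null {
  const linkTags = html.match(/<link\b[^>]*>/gi) ?? [];
  for (const tag of linkTags) {
    // Collect double-quoted attributes of this <link> tag.
    const attrs: Record<string, string> = {};
    const attrPattern = /([a-zA-Z-]+)\s*=\s*"([^"]*)"/g;
    let m: RegExpExecArray | null;
    while ((m = attrPattern.exec(tag)) !== null) {
      attrs[(m[1] ?? '').toLowerCase()] = m[2] ?? '';
    }
    if (attrs['rel'] === 'alternate' && attrs['type'] === 'text/markdown') {
      return attrs['href'] ?? null;
    }
  }
  return null; // No markdown alternative advertised.
}

const sampleHtml =
  '<html><head><link rel="stylesheet" href="/a.css">' +
  '<link rel="alternate" type="text/markdown" title="LLM-friendly version" href="/llms.txt" />' +
  '</head><body></body></html>';

console.log(findMarkdownAlternate(sampleHtml)); // "/llms.txt"
```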

Operational numbers

  • ~500 KB: HTML size of one representative Vercel blog post.
  • ~3 KB: markdown size of the same post.
  • 99.37 %: payload reduction.
  • ~160×: implied HTML-to-markdown size ratio. HTML is roughly two orders of magnitude larger than markdown for the same content.
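
The figures above are rounded, so they do not reproduce each other exactly. The short sketch below shows the arithmetic connecting them; the function names are hypothetical and the inputs are the rounded values quoted in this summary, not fresh measurements.

```typescript
// Percentage of bytes saved when serving markdown instead of HTML.
function reductionPercent(htmlBytes: number, markdownBytes: number): number {
  return (1 - markdownBytes / htmlBytes) * 100;
}

// How many times larger the HTML payload is than the markdown payload.
function sizeRatio(htmlBytes: number, markdownBytes: number): number {
  return htmlBytes / markdownBytes;
}

// Size ratio implied by a stated reduction percentage.
function ratioFromReduction(percent: number): number {
  return 1 / (1 - percent / 100);
}

console.log(reductionPercent(500, 3).toFixed(2)); // "99.40"
console.log(sizeRatio(500, 3).toFixed(1)); // "166.7"
console.log(ratioFromReduction(99.37).toFixed(1)); // "158.7"
```

Taken at face value, 500 KB and 3 KB give about 99.4 % and about 167×, while the stated 99.37 % implies about 159× (a markdown size nearer 3.15 KB). Both round to the ~160× quoted here.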

Caveats

  • One quantitative datum. The 500 KB → 3 KB measurement is stated for "the HTML version of this page" (where "this page" is the blog post itself). No distribution across the full blog corpus; no p50/p95 payload-size numbers. The 99.37 % figure may not generalise: blog posts with many images, many embedded snippets, or sparse text would have different ratios. The 500 KB baseline is also large for a text-centric blog post; most of the HTML size is likely inline bundled JavaScript, hydration state, and CSS, not literal blog content.
  • No agent-adoption data on the Vercel blog specifically. The post doesn't say "X % of traffic now sends Accept: text/markdown" or "Y agents are benefitting". No post-deployment adoption measurement.
  • No cache-key detail. Vary: Accept is not mentioned; Vercel sidesteps the problem by rewriting to a different URL under the hood (/md/*) so cache keys are naturally distinct — but the post doesn't explicitly confirm this design decision.
  • Narrow coverage. Only /blog and /changelog are covered by the rewrite snippet shown. /docs has markdown sitemaps but the rewrite mechanism for individual docs pages is not described. /pricing, /templates, and other marketing pages are not covered at all.
  • Rich-text-to-markdown conversion correctness not discussed. Tables with merged cells, footnotes, custom CMS block types, code blocks with line-number anchors — none of the hard edge cases are discussed. "The conversion preserves structure" is a one-line claim, not a case analysis.
  • No robots.txt / llms.txt integration detail. The link rel="alternate" tag points at /llms.txt in the example but the post doesn't show the /llms.txt content or relationship to the markdown sitemap.
  • Product-voice framing. The post closes with "see the knowledge base guide for an implementation reference", a soft marketing CTA. The body is genuinely architectural, but the intent is also to position Vercel as the right platform for agent-ready content.

Source