PATTERN Cited by 1 source
Snapshot sync from Postgres to repo¶
Pattern¶
Run a durable background orchestrator (e.g. Vercel Workflow) that transforms live source-configuration state (in Postgres) into a derived, versioned, immutable snapshot repository the agent's retrieval sandbox can load. The admin interface writes to Postgres; the orchestrator syncs; the agent reads the snapshot.
Canonical Vercel framing¶
From the Knowledge Agent Template pipeline:
"You add sources through the admin interface, and they're stored in Postgres. Content syncs to a snapshot repository via Vercel Workflow. When the agent needs to search, a Vercel Sandbox loads the snapshot."
(Source: sources/2026-04-21-vercel-build-knowledge-agents-without-embeddings)
Why this separation exists¶
Deployments of this pattern share a canonical three-layer shape:
- Live source-of-truth DB (Postgres).
- Derived snapshot repo — versioned, materialised.
- Per-request sandbox checkout of the snapshot.
The pattern names the producer that bridges the first two. The consumer side is patterns/bash-in-sandbox-as-retrieval-tool.
Four reasons the producer is a separate orchestrator, not inline with retrieval:
- Content materialisation is expensive. A GitHub repo clone, a YouTube transcript pull, an API pagination walk — none should happen on the agent's critical path.
- Source API rate limits. Admin DB load scales with users, while snapshot sync is bounded by the source APIs (GitHub, YouTube, etc.), whose rate limits are tighter.
- Failure decoupling. A source API outage shouldn't take the agent down; the agent keeps serving the last good snapshot.
- Version discipline. The snapshot repo has a history; the admin DB holds current state. You can't roll back the DB to debug a week-old agent answer, but you can reload a week-old snapshot.
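The failure-decoupling and version-discipline points can be illustrated with a minimal in-memory sketch. Everything in it is hypothetical (the `SnapshotStore` class, its method names, the integer version numbering); the source post does not describe how snapshots are versioned or published. The assumption is that a sync publishes a new immutable version only when it succeeds, so readers keep the last good version during an outage and can still load an older one for debugging.

```python
from types import MappingProxyType
from typing import Callable, List, Mapping


class SnapshotStore:
    """Hypothetical append-only store of immutable snapshot versions."""

    def __init__(self) -> None:
        self._versions: List[Mapping[str, str]] = []

    def publish(self, files: Mapping[str, str]) -> int:
        # Copy and freeze so a published version can never be mutated.
        self._versions.append(MappingProxyType(dict(files)))
        return len(self._versions)  # versions are numbered from 1

    def latest(self) -> Mapping[str, str]:
        return self._versions[-1]

    def load(self, version: int) -> Mapping[str, str]:
        if not 1 <= version <= len(self._versions):
            raise KeyError(f"no snapshot version {version}")
        return self._versions[version - 1]


def run_sync(store: SnapshotStore, build: Callable[[], Mapping[str, str]]) -> bool:
    """Publish a new version only if the build step succeeds."""
    try:
        files = build()
    except Exception:
        return False  # readers keep seeing the last good snapshot
    store.publish(files)
    return True


def failing_build() -> Mapping[str, str]:
    raise ConnectionError("source API outage (simulated)")


store = SnapshotStore()
run_sync(store, lambda: {"docs/intro.md": "v1 text"})
run_sync(store, lambda: {"docs/intro.md": "v2 text"})
ok = run_sync(store, failing_build)  # the third sync fails
```

After the failed third sync, `store.latest()` still returns the second version, and `store.load(1)` returns the first, which is the property a mutable "current state" table cannot offer.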
Mechanism shape¶
admin UI → writes → Postgres (live config)
                        │
                        │ (event or schedule)
                        ▼
                 Vercel Workflow
                        │
                        │ (fan out per source)
                        ▼
         ┌──────────────┼──────────────┐
         ▼              ▼              ▼
   GitHub clone    YouTube pull   Markdown sync
         └──────────────┼──────────────┘
                        │
                        │ (materialise into repo)
                        ▼
               snapshot repository
                        │
                        │ (pull at retrieval time)
                        ▼
                 Vercel Sandbox
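The fan-out and materialise steps in the diagram can be sketched in plain Python. This is not the Vercel Workflow API: the fetcher functions, the `sync_snapshot` name, and the one-directory-per-source layout are all assumptions made for illustration, and a thread pool stands in for whatever durable fan-out the orchestrator provides.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical fetchers: each returns {relative_path: content} for one source.
# Real ones would clone a repo, pull transcripts, or read Markdown files.
def fetch_github() -> Dict[str, str]:
    return {"README.md": "# Example repo\n"}

def fetch_youtube() -> Dict[str, str]:
    return {"talk.txt": "transcript text\n"}

def fetch_markdown() -> Dict[str, str]:
    return {"guide.md": "# Guide\n"}

FETCHERS: Dict[str, Callable[[], Dict[str, str]]] = {
    "github": fetch_github,
    "youtube": fetch_youtube,
    "markdown": fetch_markdown,
}

def sync_snapshot(
    fetchers: Dict[str, Callable[[], Dict[str, str]]],
    out_dir: Path,
    max_workers: int = 3,
) -> List[str]:
    """Fan out one task per source, then materialise results under out_dir."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn) for name, fn in fetchers.items()}
        # .result() re-raises any fetcher exception, aborting this sync.
        results = {name: fut.result() for name, fut in futures.items()}

    written: List[str] = []
    for name, files in sorted(results.items()):
        for rel_path, content in sorted(files.items()):
            target = out_dir / name / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content, encoding="utf-8")
            written.append(f"{name}/{rel_path}")
    return written

snapshot_dir = Path(tempfile.mkdtemp(prefix="snapshot-"))
written = sync_snapshot(FETCHERS, snapshot_dir)
```

The `max_workers` parameter is where a concurrency cap would live; the source post does not say what limit, if any, the template applies.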
What this pattern solves vs vector-index refresh¶
A vector-index refresh is:
- All-or-nothing. Re-embedding happens as a batch; partial updates are awkward.
- Opaque. A diff between index version N and N+1 isn't human-readable.
- Lossy. An index is a lossy transformation layered on top of the content; the snapshot at the repo level remains the ground truth.
A snapshot repo sync is:
- Incremental. Changed sources get re-pulled; unchanged sources stay.
- Diff-inspectable. The repo history shows what changed.
- The ground truth. The agent reads it directly; no transformation layer.
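The "incremental" and "diff-inspectable" properties can be made concrete with a small sketch that compares two snapshot versions by content hash. Hashing is only one possible change-detection approach and is an assumption here, as are the `manifest` and `diff_manifests` names; the source post does not say how the template detects changes. If the snapshot repository is a Git repository, its own history and diff tooling would provide the same information.

```python
import hashlib
from typing import Dict, List


def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def manifest(files: Dict[str, str]) -> Dict[str, str]:
    """Map each path to the SHA-256 hex digest of its content."""
    return {path: content_hash(body) for path, body in files.items()}


def diff_manifests(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify paths as added, removed, or changed between two versions."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }


v1 = manifest({
    "github/README.md": "# Repo\n",
    "youtube/talk.txt": "old transcript\n",
})
v2 = manifest({
    "github/README.md": "# Repo\n",
    "youtube/talk.txt": "new transcript\n",
    "markdown/guide.md": "# Guide\n",
})
changes = diff_manifests(v1, v2)
```

Here only `youtube/talk.txt` is reported as changed and `markdown/guide.md` as added, so a sync would need to rewrite just those two paths while `github/README.md` stays untouched.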
What's undisclosed (Vercel 2026-04-21 post)¶
- Trigger model. Event-driven on admin writes? Scheduled? Both?
- Change detection. ETags, last-modified, content-hash, full re-pull?
- Partial-failure handling. One source fails; does the whole sync abort, or does the snapshot advance with the failed source marked stale?
- Parallelism. Sync fans out per source; how many concurrent source pulls?
- Rollback. The post implies versioning is available; the rollback UX / API is not named.
- Large-source strategy. A 10-GB GitHub repository — does it clone in full every time, use sparse checkout, or differentially fetch?
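To make the partial-failure question concrete, the sketch below shows one of the two options named above: the snapshot advances, and a failed source keeps its previous content with a stale marker. This is not a description of what Vercel's template does, which the post leaves undisclosed; `SourceState`, `advance`, and the fetcher callables are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class SourceState:
    content: Optional[str]  # None if the source has never synced successfully
    stale: bool             # True if content was carried over after a failure


def advance(
    previous: Dict[str, SourceState],
    fetchers: Dict[str, Callable[[], str]],
) -> Dict[str, SourceState]:
    """Build the next snapshot; a failed source keeps old content, marked stale."""
    nxt: Dict[str, SourceState] = {}
    for name, fetch in fetchers.items():
        try:
            nxt[name] = SourceState(content=fetch(), stale=False)
        except Exception:
            old = previous.get(name)
            nxt[name] = SourceState(content=old.content if old else None, stale=True)
    return nxt


def broken_fetch() -> str:
    raise TimeoutError("simulated source API failure")


first = advance({}, {"github": lambda: "repo v1", "youtube": lambda: "talk v1"})
second = advance(first, {"github": lambda: "repo v2", "youtube": broken_fetch})
```

In `second`, the GitHub source has advanced to `repo v2` while the YouTube source still holds `talk v1` with `stale=True`. The alternative option, aborting the whole sync, would instead leave `first` as the latest snapshot.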
Composition with retrieval¶
The snapshot sync is the write path; retrieval is the read path. They share the repo. This is the classic materialised-view shape from database engineering applied to agent infrastructure: Postgres is the OLTP source of truth, and the snapshot repo is the derived view consumed by the agent's retrieval tools.
Seen in¶
- sources/2026-04-21-vercel-build-knowledge-agents-without-embeddings — canonical Vercel-stack instance; Vercel Workflow as orchestrator, Postgres as source of truth, snapshot repo as the agent-facing derived view.
Related¶
- concepts/snapshot-repository-as-agent-corpus — the consumer-side framing of the snapshot.
- concepts/filesystem-as-retrieval-substrate — the retrieval choice that consumes the snapshot.
- patterns/bash-in-sandbox-as-retrieval-tool — the read-path complement.
- systems/vercel-workflow — canonical orchestrator.
- systems/vercel-sandbox — canonical consumer.
- systems/vercel-knowledge-agent-template — canonical composition.