PATTERN Cited by 1 source
Runtime-provisioned per-tenant search index¶
Intent¶
Make a dedicated search index per tenant (agent, customer, session, language, region, …) a runtime-cheap primitive — created on first appearance, destroyed on tenant eviction, configured independently — rather than a deploy-time schema decision. The goal is to collapse the distance between "this tenant needs isolated retrieval" and "this tenant has isolated retrieval" from weeks-of-ops to one API call.
Problem¶
The default multi-tenant search shape — one shared index with tenant_id as a filter field — is easy to stand up, but under agent workloads it breaks down:
- Policy-not-structure isolation. One forgotten WHERE tenant_id = … leaks tenant A's data into tenant B's search results.
- Global index statistics. BM25 avgdl + idf, HNSW graph shape, chunk budgets, and reranker inputs are all influenced by the noisiest tenant's document shape, degrading quality for the others.
- Blast radius. Reindex, schema change, corruption all hit every tenant at once.
- Rigid configuration. Tokenizer (porter vs trigram), match mode (AND vs OR), fusion, reranking cannot vary per tenant.
- Tenant deletion = scan-and-purge. A one-click tenant deletion is structurally impossible; it's a batch job with verification.
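To make the index-statistics point concrete, here is a minimal sketch of how a shared BM25 avgdl drifts away from a small tenant's corpus. The document lengths and tenant names are made up for illustration; only the length-normalisation term of the standard BM25 formula is computed, with the common default b = 0.75.

```typescript
// Hypothetical document lengths (in tokens) for two tenants.
const tenantA: number[] = [100, 120, 80];       // short support tickets
const tenantB: number[] = [5000, 7000, 6000];   // long manuals

const mean = (xs: number[]): number =>
  xs.reduce((sum, x) => sum + x, 0) / xs.length;

// Shared index: one avgdl over every tenant's documents.
const sharedAvgdl = mean([...tenantA, ...tenantB]); // 3050

// Per-tenant index: avgdl reflects only tenant A's corpus.
const tenantAAvgdl = mean(tenantA); // 100

// BM25 length-normalisation term: 1 - b + b * (dl / avgdl).
const lengthNorm = (dl: number, avgdl: number, b = 0.75): number =>
  1 - b + b * (dl / avgdl);

// Tenant A's documents under each regime.
const normsShared = tenantA.map((dl) => lengthNorm(dl, sharedAvgdl));
const normsOwn = tenantA.map((dl) => lengthNorm(dl, tenantAAvgdl));
```

Under the shared avgdl, all of tenant A's documents look uniformly "short", so the normalisation term barely separates them; under tenant A's own avgdl the term spreads around 1 as intended.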
Pre-provisioning one index per tenant at deploy time fixes the isolation story but re-introduces a coordination problem — the set of tenants changes over time, re-deploys are expensive, and the cost model of a "provisioned forever" index per customer is untenable for thousands-of-customers or per-session granularity.
The one-to-one agent posture — which Durable Objects made cheap at the actor tier — makes the pre-provisioning approach especially untenable: one search index per durable-object session at tens-of-thousands-concurrent scale is not a deploy-time configuration.
Solution¶
Expose create() / delete() / list() / search() at the namespace level as a first-class platform primitive. Tenants get their own search index the way they get their own actor: on demand, at first request, cheaply.
Canonical wiki realisation: Cloudflare's ai_search_namespaces binding in the 2026-04-16 AI Search launch.
// wrangler.jsonc
{
  "ai_search_namespaces": [
    { "binding": "SUPPORT_KB", "namespace": "support" }
  ]
}
// In the SupportAgent's onChatMessage:
try {
  await this.env.SUPPORT_KB.create({
    id: `customer-${customerId}`,
    index_method: { keyword: true, vector: true }
  });
} catch { /* instance already exists */ }
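This gives create-if-absent semantics, a runtime lifecycle, and per-instance configuration. Note that the bare catch treats every failure as "instance already exists", so it also hides genuine creation errors. Deletion is env.SUPPORT_KB.delete("customer-abc123"), which purges all of that tenant's data.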
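A slightly stricter create-if-absent helper might look like the sketch below. Everything here is an assumption for illustration: the SearchNamespace interface, the "InstanceExistsError" error name, and the in-memory stand-in are hypothetical and are not the shape of the actual ai_search_namespaces binding, whose error types the source does not describe.

```typescript
// Hypothetical minimal namespace shape; NOT the real binding's API.
interface SearchNamespace {
  create(opts: { id: string }): Promise<void>;
  delete(id: string): Promise<void>;
  list(): Promise<string[]>;
}

// In-memory stand-in so the sketch runs without a platform.
class InMemoryNamespace implements SearchNamespace {
  private ids = new Set<string>();

  async create(opts: { id: string }): Promise<void> {
    if (this.ids.has(opts.id)) {
      const err = new Error(`instance ${opts.id} already exists`);
      err.name = "InstanceExistsError"; // assumed error marker
      throw err;
    }
    this.ids.add(opts.id);
  }

  async delete(id: string): Promise<void> {
    this.ids.delete(id);
  }

  async list(): Promise<string[]> {
    return Array.from(this.ids);
  }
}

// Returns true if the instance was newly created, false if it already
// existed. Any other failure is rethrown instead of being swallowed.
async function ensureInstance(ns: SearchNamespace, id: string): Promise<boolean> {
  try {
    await ns.create({ id });
    return true;
  } catch (err) {
    if (err instanceof Error && err.name === "InstanceExistsError") return false;
    throw err;
  }
}
```

Calling ensureInstance(ns, `customer-${customerId}`) at the top of a request handler keeps the first-appearance path and the steady-state path identical, while still surfacing quota or network failures.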
Structural requirements¶
For the pattern to be cheap enough to be runtime, the platform must supply:
- Atomic instance provisioning — no cluster-edit, no re-deploy, no index-template handshake.
- Atomic instance deletion — one call, bounded time, data gone.
- Unified storage and index — no external bucket / pipeline to wire up.
- Low per-instance base cost — thousands to millions of instances per account has to be viable.
- Composable queries across instances (patterns/cross-index-unified-retrieval) — so the per-tenant decomposition doesn't fragment the app.
2026-04-16 AI Search open-beta limits (illustrative of the cost model): 100 instances/account (Free), 5,000 instances/account (Paid). These are account-level caps on instances that are created and deleted at runtime, not counts fixed at deploy time.
Canonical shape — support agent (2026-04-16)¶
namespace: "support"
├── product-knowledge (shared, R2-backed)
├── customer-abc123 (per-tenant, managed storage)
├── customer-def456 (per-tenant, managed storage)
└── customer-ghi789 (per-tenant, managed storage)
- Shared instance: product docs, one-for-all, R2 as data source.
- Per-customer instances: resolution history (agent memory), built-in storage, created on first customer appearance.
- Cross-instance query fans across product-knowledge + the customer's own index in one call (patterns/cross-index-unified-retrieval).
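The fan-out over the shared instance plus one customer's instance can be sketched client-side as follows. This is only an illustration of the shape: the source says the platform performs the cross-instance query in one call and does not specify how results are merged. The Searcher type, the function name, and the use of reciprocal rank fusion (with the conventional constant k = 60) are assumptions chosen because rank-based merging avoids comparing raw scores from different indexes.

```typescript
interface Hit {
  id: string;
  instance: string;
}

// Hypothetical per-instance search: returns document ids, best first.
type Searcher = (query: string) => string[];

function fanOutSearch(
  searchers: Record<string, Searcher>,
  instanceIds: string[],
  query: string,
  limit: number,
): Hit[] {
  const k = 60; // conventional reciprocal-rank-fusion constant
  const scored: { hit: Hit; score: number }[] = [];

  // Only the named instances are queried; other tenants are never touched.
  for (const instance of instanceIds) {
    const search = searchers[instance];
    if (!search) continue; // unknown instance: skip
    search(query).forEach((id, rank) => {
      // rank is 0-based, so the best hit scores 1 / (k + 1).
      scored.push({ hit: { id, instance }, score: 1 / (k + rank + 1) });
    });
  }

  // Array.prototype.sort is stable, so ties keep instanceIds order.
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, limit).map((s) => s.hit);
}

// Toy searchers standing in for real instances.
const searchers: Record<string, Searcher> = {
  "product-knowledge": () => ["reset-guide", "billing-faq"],
  "customer-abc123": () => ["ticket-42"],
  "customer-def456": () => ["ticket-99"],
};

const hits = fanOutSearch(
  searchers,
  ["product-knowledge", "customer-abc123"],
  "password reset",
  3,
);
```

Because the instance list is built from the current customer's id, another tenant's index is structurally out of reach for the query rather than filtered out after the fact.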
Consequences¶
Pros¶
- Isolation is structural. Two tenants cannot collide in one result set because their data is in two indexes.
- Tenant deletion is one call. Data-residency, right-to-be-forgotten, customer-offboarding become trivial.
- Per-tenant configuration. Tokenizer, match mode, fusion, reranking all vary per tenant if the workload warrants.
- Per-tenant index statistics. BM25 parameters stay calibrated to the tenant's own corpus.
- Natural fit for one-to-one agents. Per-agent / per-session memory is structurally supported.
Cons / tradeoffs¶
- Cost model must support it. Per-instance base overhead must round to zero; otherwise fleet cost scales with tenant count rather than with actual usage.
- Cross-tenant analytics harder. Aggregate queries across all tenants now require fan-out; single-index shape gave a built-in global view.
- Discovery + lifecycle policy. Creation-on-demand needs a tenant-ID source of truth; stale-tenant GC needs a policy (time-based, ref-counted).
- Per-instance parameter drift. Giving each tenant their own tokenizer/match-mode can silently produce divergent ranking behaviours; governance needed.
- Warm-up. Cold per-tenant instances have no query history for relevance-learning (though hybrid retrieval is less sensitive to that than learning-to-rank).
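A time-based stale-tenant GC policy, one of the options mentioned above, could be sketched like this. The last-seen bookkeeping, the function name, and the idle threshold are hypothetical; the source does not prescribe where this state lives or how eviction is triggered.

```typescript
// Hypothetical bookkeeping: instance id -> last use, in epoch milliseconds.
type LastSeen = Map<string, number>;

// Returns the instance ids that have been idle longer than maxIdleMs.
// Ids in `keep` (for example a shared instance) are never selected.
function selectStale(
  lastSeen: LastSeen,
  nowMs: number,
  maxIdleMs: number,
  keep: Set<string>,
): string[] {
  const stale: string[] = [];
  lastSeen.forEach((seenAt, id) => {
    if (!keep.has(id) && nowMs - seenAt > maxIdleMs) stale.push(id);
  });
  return stale;
}

const lastSeen: LastSeen = new Map([
  ["product-knowledge", 0],
  ["customer-abc123", 1_000],
  ["customer-def456", 9_000],
]);

// At t = 10_000 ms with a 5_000 ms idle budget, only customer-abc123 is stale.
const toDelete = selectStale(lastSeen, 10_000, 5_000, new Set(["product-knowledge"]));
```

Each returned id would then be passed to the namespace's delete() call; keeping selection separate from deletion makes the policy easy to test and to dry-run.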
Seen in¶
- sources/2026-04-16-cloudflare-ai-search-the-search-primitive-for-your-agents — ai_search_namespaces binding; per-customer instance + shared product-knowledge instance; runtime create() / delete() / list() / search(); one-per-language feasibility.
Related¶
- concepts/per-tenant-search-instance — the structural unit this pattern provisions.
- concepts/one-to-one-agent-instance — the actor-tier counterpart; same economics bet extended to retrieval.
- concepts/agent-memory — canonical consumer.
- concepts/unified-storage-and-index — load-bearing platform property.
- systems/cloudflare-ai-search — canonical productised realisation.
- patterns/cross-index-unified-retrieval — composition primitive that keeps the decomposition ergonomic.
- patterns/upload-then-poll-indexing — write-side API shape that makes each instance write atomic.
- patterns/metadata-boost-at-query-time / patterns/native-hybrid-search-function — retrieval-tier patterns that compose inside each instance.