PATTERN

Runtime-provisioned per-tenant search index

Intent

Make a dedicated search index per tenant (agent, customer, session, language, region, …) a runtime-cheap primitive — created on first appearance, destroyed on tenant eviction, configured independently — rather than a deploy-time schema decision. The goal is to collapse the distance between "this tenant needs isolated retrieval" and "this tenant has isolated retrieval" from weeks-of-ops to one API call.

Problem

The default multi-tenant search shape — one shared index with tenant_id as a filter field — is easy to stand up, but under agent workloads it breaks down:

  • Policy-not-structure isolation. One forgotten WHERE tenant_id = … leaks tenant A's data into tenant B's search results.
  • Global index statistics. BM25 avgdl and idf, HNSW graph shape, chunk budgets, and reranker inputs are all skewed by the noisiest tenant's document shape, degrading quality for everyone else.
  • Blast radius. A reindex, a schema change, or corruption hits every tenant at once.
  • Rigid configuration. Tokenizer (porter vs trigram), match mode (AND vs OR), fusion, reranking cannot vary per tenant.
  • Tenant deletion = scan-and-purge. One-click tenant deletion is structurally impossible; it's a batch job with verification.
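The first bullet is easiest to see in code. A minimal sketch of the shared-index shape (all names here are hypothetical, not any real search API), where the tenant filter is just an optional argument a caller can forget:

```typescript
// Hypothetical in-memory stand-in for the shared-index shape:
// one index, tenant_id as a filter field.
type Doc = { tenantId: string; text: string };

const sharedIndex: Doc[] = [
  { tenantId: "A", text: "refund policy for tenant A" },
  { tenantId: "B", text: "refund policy for tenant B" },
];

// Isolation is policy, not structure: the caller must remember the filter.
function search(query: string, tenantId?: string): Doc[] {
  return sharedIndex.filter(
    (d) =>
      d.text.includes(query) &&
      (tenantId === undefined || d.tenantId === tenantId),
  );
}

search("refund", "A"); // scoped: one result, tenant A only
search("refund");      // one forgotten filter: both tenants' data in one result set
```

Nothing in the type system or storage layout distinguishes the two calls; the leak is a silent success, which is the core of the "policy-not-structure" problem.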

Pre-provisioning one index per tenant at deploy time fixes the isolation story but re-introduces a coordination problem — the set of tenants changes over time, re-deploys are expensive, and the cost model of a "provisioned forever" index per customer is untenable for thousands-of-customers or per-session granularity.

The one-to-one agent posture — which Durable Objects made cheap at the actor tier — makes the pre-provisioning approach especially untenable: one search index per durable-object session at tens-of-thousands-concurrent scale is not a deploy-time configuration.

Solution

Expose create() / delete() / list() / search() at the namespace level as a first-class platform primitive. Tenants get their own search index the way they get their own actor: on demand, at first request, cheaply.

Canonical wiki realisation: Cloudflare's ai_search_namespaces binding in the 2026-04-16 AI Search launch.

// wrangler.jsonc
{
  "ai_search_namespaces": [
    { "binding": "SUPPORT_KB", "namespace": "support" }
  ]
}
// In the SupportAgent's onChatMessage:
try {
  await this.env.SUPPORT_KB.create({
    id: `customer-${customerId}`,
    index_method: { keyword: true, vector: true }
  });
} catch { /* instance already exists */ }

Creation is idempotent, the lifecycle is runtime-managed, and configuration is per-instance. Deletion is a single call, env.SUPPORT_KB.delete("customer-abc123"), which purges all of that tenant's data.
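A rough in-memory mock makes those lifecycle semantics concrete. This follows the create()/delete()/list() surface named above but is an illustration only, not Cloudflare's implementation:

```typescript
// Mock of the namespace-binding surface described above. Illustrative only.
type IndexMethod = { keyword?: boolean; vector?: boolean };

class MockNamespace {
  private instances = new Map<string, { config: IndexMethod; docs: string[] }>();

  // Idempotent: creating an existing instance is a no-op, not an error.
  async create(opts: { id: string; index_method: IndexMethod }): Promise<void> {
    if (!this.instances.has(opts.id)) {
      this.instances.set(opts.id, { config: opts.index_method, docs: [] });
    }
  }

  // One call, bounded time, data gone: the instance and every document in it.
  async delete(id: string): Promise<void> {
    this.instances.delete(id);
  }

  async list(): Promise<string[]> {
    return [...this.instances.keys()];
  }
}
```

Usage mirrors the wrangler example: create on first appearance, delete on offboarding, with repeated create() calls across retries behaving as no-ops.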

Structural requirements

For the pattern to be cheap enough to be runtime, the platform must supply:

  1. Atomic instance provisioning — no cluster-edit, no re-deploy, no index-template handshake.
  2. Atomic instance deletion — one call, bounded time, data gone.
  3. Unified storage and index — no external bucket / pipeline to wire up.
  4. Low per-instance base cost — thousands to millions of instances per account have to be viable.
  5. Composable queries across instances (patterns/cross-index-unified-retrieval) — so the per-tenant decomposition doesn't fragment the app.

2026-04-16 AI Search open-beta limits (illustrative of the cost model): 100 instances/account (Free), 5,000 instances/account (Paid) — not per-deploy; runtime-varying.

Canonical shape — support agent (2026-04-16)

namespace: "support"
├── product-knowledge     (shared, R2-backed)
├── customer-abc123       (per-tenant, managed storage)
├── customer-def456       (per-tenant, managed storage)
└── customer-ghi789       (per-tenant, managed storage)
  • Shared instance: product docs, one-for-all, R2 as data source.
  • Per-customer instances: resolution history (agent memory), built-in storage, created on first customer appearance.
  • Cross-instance query fans across product-knowledge + the customer's own index in one call (patterns/cross-index-unified-retrieval).
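One plausible shape for that fan-out, sketched below; the multi-instance search signature, the Hit shape, and the score-sorted merge are assumptions for illustration, not the documented API:

```typescript
// Hypothetical fan-out helper: query the shared product index and the
// customer's own index, then merge by score.
type Hit = { instance: string; text: string; score: number };

type SearchFn = (instanceId: string, query: string) => Promise<Hit[]>;

async function supportSearch(
  search: SearchFn,
  customerId: string,
  query: string,
): Promise<Hit[]> {
  // Fan across the shared instance and the per-customer instance in parallel.
  const [shared, own] = await Promise.all([
    search("product-knowledge", query),
    search(`customer-${customerId}`, query),
  ]);
  // Naive score-sorted merge; real cross-index fusion (RRF, reranking)
  // would replace this sort.
  return [...shared, ...own].sort((a, b) => b.score - a.score);
}
```

The point of the per-tenant decomposition surviving this helper is that the caller still sees one ranked result list, even though isolation is structural underneath.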

Consequences

Pros

  • Isolation is structural. Two tenants cannot collide in one result set because their data is in two indexes.
  • Tenant deletion is one call. Data residency, right-to-be-forgotten, and customer offboarding become trivial.
  • Per-tenant configuration. Tokenizer, match mode, fusion, reranking all vary per tenant if the workload warrants.
  • Per-tenant index statistics. BM25 parameters stay calibrated to the tenant's own corpus.
  • Natural fit for one-to-one agents. Per-agent / per-session memory is structurally supported.

Cons / tradeoffs

  • Cost model must support it. Per-instance base overhead must round to zero, otherwise the fleet cost scales with tenant count.
  • Cross-tenant analytics harder. Aggregate queries across all tenants now require fan-out; the single-index shape gave a built-in global view.
  • Discovery + lifecycle policy. Creation-on-demand needs a tenant-ID source of truth; stale-tenant GC needs a policy (time-based, ref-counted).
  • Per-instance parameter drift. Giving each tenant their own tokenizer/match-mode can silently produce divergent ranking behaviours; governance needed.
  • Warm-up. Cold per-tenant instances have no query history for relevance-learning (though hybrid retrieval is less sensitive to that than learning-to-rank).
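The lifecycle-policy tradeoff above can be sketched as a simple time-based sweep. All names here are hypothetical, and a real deployment would more likely drive this from a cron trigger or a Durable Object alarm than from an in-memory map:

```typescript
// Hypothetical time-based GC for per-tenant instances. lastSeen would be
// updated on every tenant request; deleteInstance stands in for the
// platform's one-call delete. Illustrative policy only.
const TTL_MS = 30 * 24 * 60 * 60 * 1000; // evict tenants idle for 30 days

async function sweepStaleTenants(
  lastSeen: Map<string, number>, // tenantId -> last-request timestamp (ms)
  deleteInstance: (id: string) => Promise<void>,
  now: number = Date.now(),
): Promise<string[]> {
  const evicted: string[] = [];
  for (const [tenantId, seen] of lastSeen) {
    if (now - seen > TTL_MS) {
      await deleteInstance(`customer-${tenantId}`); // structural purge, one call
      lastSeen.delete(tenantId);
      evicted.push(tenantId);
    }
  }
  return evicted;
}
```

A ref-counted variant would swap the timestamp for an active-session count; either way the policy lives outside the index, which is why the pattern needs a tenant-ID source of truth.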
