Figma AI Search¶
Definition¶
Figma AI Search is Figma's AI-powered search feature (shipped at Config 2024), combining two query modes over Figma's design-file corpus:
- Visual search — query by screenshot, selected frame, or sketch (reverse-image-search-lineage): given a visual input, find similar existing designs.
- Semantic search — query by natural-language text against component names, descriptions, and files even when the searcher doesn't know the exact component terminology (embedding-based semantic retrieval, not keyword match).
Surfaced inside Figma's Actions panel — the team's consolidated home for AI features — with peek previews of results (to fit the narrower Actions-panel width) and CMD+Enter for full-screen result inspection (Source: sources/2026-04-21-figma-how-we-built-ai-powered-search-in-figma).
Why it exists¶
User research observed that 75% of all objects added to the Figma canvas come from other files — designers riff on existing work rather than starting from scratch. Previously, finding those prior designs required knowing organisational structure and authorship (what team / who / when) — a workflow tax that consumed creative flow and spawned "hundreds of messages in Slack with designers asking teammates for help."
Two research-identified blockers with the pre-AI-search state:
- Discovery altitude gap. Designers had to choose between searching low-level components in design-system libraries or opening entire files. The useful middle — a frame combining several components — was unsearchable.
- Thumbnail ambiguity. A thumbnail alone couldn't confirm whether a result held the right design or the latest version, so designers had to open each file to check rather than quickly scan results.
How it was built (Figma's own narrative arc)¶
June 2023 hackathon and the RAG motivation¶
A three-day company AI hackathon produced 20 projects, including a working prototype of design autocomplete — an AI assistant suggesting the next component (e.g. a "Get started" button mid-onboarding-flow). The engineering rationale for building AI search before shipping autocomplete was a RAG argument:
"We knew based on Retrieval Augmented Generation (RAG) that we can improve AI outputs with examples from search; if design autocomplete could find designs similar to what a designer was working on, it could better suggest the next component."
Autocomplete thus became the forcing function for AI search. Internal user-research on the autocomplete prototype revealed the 75%-from-other-files statistic, which redirected the priority: ship better search first; autocomplete waits.
This matches patterns/hackathon-to-platform with a twist — Figma's hackathon prototype (autocomplete) surfaced a different real user need (search), which got shipped first.
Use-case triage¶
Three core use cases, prioritised for creation over ideation:
- Frame lookup — find a specific design to edit / riff on. Shipped first.
- Frame variations — find different ways to design a particular piece of UI (prior explorations or existing approved-asset patterns).
- Broad inspiration — find thematically similar concepts and different approaches as creative jumping-off points.
Indexing problem and its partial solutions¶
Indexing everything was framed as cost-prohibitive. Multiple heuristics stack to shrink the corpus (see patterns/selective-indexing-heuristics):
- UI-frame shape heuristic — target top-level frames that look like UI designs, via common UI frame dimensions. Exception for non-top-level frames when they meet "the right conditions" (designers organise work in sections or nested frames).
- Duplicate collapsing — designers riff via duplicate-and-tweak; index only one of each near-duplicate.
- File-copy skipping — designers frequently copy files; skip indexing unaltered copies entirely.
- Quality signals (experimental) — designs marked ready for development and similar explicit signals as a ranking / filtering input. The post is explicit this is still experimental, not a shipped mechanism.
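The first two heuristics compose naturally as filters over candidate frames. A minimal sketch, assuming hypothetical frame fields, dimension whitelist, and hashing scheme (none of these are Figma's actual values):

```python
import hashlib

# Illustrative device-size whitelist for the UI-frame shape heuristic.
COMMON_UI_SIZES = {(375, 812), (390, 844), (1440, 900), (1920, 1080)}

def looks_like_ui(frame: dict) -> bool:
    """Shape heuristic: top-level frames with common UI dimensions."""
    return frame["top_level"] and (frame["width"], frame["height"]) in COMMON_UI_SIZES

def collapse_duplicates(frames: list[dict]) -> list[dict]:
    """Keep one frame per content fingerprint (duplicate-and-tweak riffs collapse)."""
    seen, kept = set(), []
    for f in frames:
        digest = hashlib.sha256(f["content"].encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(f)
    return kept

frames = [
    {"top_level": True, "width": 375, "height": 812, "content": "checkout-v1"},
    {"top_level": True, "width": 375, "height": 812, "content": "checkout-v1"},  # near-duplicate
    {"top_level": True, "width": 50, "height": 50, "content": "icon"},           # not UI-shaped
]
indexable = collapse_duplicates([f for f in frames if looks_like_ui(f)])
```

File-copy skipping would sit upstream of this (skip the whole file before enumerating frames).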
Edit-quiescence indexing¶
Figma indexes only after a file hasn't been edited for four hours (patterns/edit-quiescence-indexing). Stated wins:
- Unfinished work ("Graveyard" pages and WIP) stays out of results.
- System load is reduced.
The trade-off — a few-hour staleness window — is judged acceptable because designers searching for inspiration / reuse are not querying just-now state. This is an opinionated freshness-vs-quality choice, not a pure latency optimisation.
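The trigger itself is a simple debounce check. A sketch, assuming a per-file last-edit timestamp (the four-hour window is the post's stated figure; everything else is illustrative):

```python
from datetime import datetime, timedelta, timezone

QUIESCENCE = timedelta(hours=4)  # index only after 4h without edits

def ready_to_index(last_edit: datetime, now: datetime) -> bool:
    """Edit-quiescence trigger: a file becomes indexable once it has
    been untouched for the full quiescence window."""
    return now - last_edit >= QUIESCENCE

now = datetime(2026, 4, 21, 12, 0, tzinfo=timezone.utc)
still_editing = ready_to_index(now - timedelta(hours=1), now)   # False
gone_quiet = ready_to_index(now - timedelta(hours=5), now)      # True
```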
Evaluation infrastructure on the product canvas¶
The eval set was seeded from internal-designer interviews + analysis of how people used Figma's file browser. Example query shapes from the post:
- Simple: "checkout screen"
- Descriptive: "red website with green squiggly lines"
- Specific: "project [codename] theme picker"
To grade results, the team built a human-labeling tool inside Figma itself using the public plugin API:
- Results displayed on an infinite canvas (Figma's own primitive).
- Keyboard shortcuts for rapid correct / incorrect marking.
- Historical views showing whether the search model had improved between runs.
Canonical patterns/visual-eval-grading-canvas — the product's own data model (infinite canvas + plugin API) serves as the evaluation UI, cutting internal-tooling cost and giving designers a familiar surface for labeling.
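The correct / incorrect marks from the labeling tool support exactly the kind of run-over-run comparison the historical views describe. A minimal sketch with hypothetical labels (the metric and query set are illustrative, not Figma's):

```python
def precision(labels: list[bool]) -> float:
    """Fraction of one query's results marked correct by a human grader."""
    return sum(labels) / len(labels)

def run_score(run: dict[str, list[bool]]) -> float:
    """Mean precision across the eval queries for one model run."""
    return sum(precision(marks) for marks in run.values()) / len(run)

run_a = {"checkout screen": [True, False, False], "theme picker": [True, True, False]}
run_b = {"checkout screen": [True, True, False],  "theme picker": [True, True, True]}
improved = run_score(run_b) > run_score(run_a)
```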
Similarity-tier bar¶
Product research established a non-obvious constraint (concepts/similarity-tier-retrieval): users start from something closer or more similar, even when ultimately seeking diverse results. The product rule:
"if we couldn't prove we could find the needle in the haystack, designers wouldn't trust the feature for broader exploration."
So the eval bar is not "good diverse-result quality in isolation" — it's exact-match quality + near-similar quality + diverse-result quality, all above a trust threshold simultaneously.
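In other words, the bar is a conjunctive gate over tier scores. A minimal sketch, with an illustrative threshold and tier names (not Figma's numbers):

```python
TRUST_THRESHOLD = 0.8  # illustrative bar, not a figure from the post

def passes_bar(tier_scores: dict[str, float]) -> bool:
    """All tiers must clear the threshold simultaneously; excellence in
    one tier cannot compensate for failure in another."""
    return all(score >= TRUST_THRESHOLD for score in tier_scores.values())

fails = passes_bar({"exact": 0.6, "near": 0.9, "diverse": 0.95})   # exact-match too weak
clears = passes_bar({"exact": 0.85, "near": 0.9, "diverse": 0.82})
```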
Design and surface decisions¶
- Unified refinement interface across query types (created-by filters, file/author/recency metadata) rather than per-input-mode result pages.
- Rabbit holing interaction (click a result to dive deeper into that type) was explored and scrapped for simplicity.
- Actions panel as the AI-features home forced a narrower result surface → peek previews for quick inspection + CMD+Enter for full-screen drill-down.
Shipping principles (named)¶
Four post-ship principles from the Figma team:
- AI for existing workflows — apply AI to tasks users already do (file browsing, copying frames into the current file), not net-new workflows.
- Rapid iteration — continuously ship to staging + use insights from an internal beta to refine.
- Systematic quality checks — custom evaluation tools (infinite-canvas eval plugin) to monitor and improve result accuracy over time.
- Cross-disciplinary teamwork — product + content + engineering + research collaboration as a shipped-principle claim.
Known unknowns¶
The post is product-led, not systems-led. Silent on:
- Embedding model family (text + image).
- Vector store / ANN index identity, size, dimensionality.
- Corpus size (number of frames / files indexed).
- Retrieval latency, QPS, cost.
- Ranker architecture + training pipeline.
- Any NDCG / MRR / recall@K numbers.
- Language coverage.
- Hybrid retrieval (BM25 + vectors, as Dropbox Dash uses) unaddressed — the post doesn't state whether a lexical index participates.
(Updated 2026-04-21 — the infrastructure companion post answers most of these; see "Infrastructure" section below.)
Infrastructure¶
Answered by sources/2026-04-21-figma-the-infrastructure-behind-ai-search-in-figma (companion post to the product post above).
Embedding model — CLIP¶
- OpenAI CLIP (open source, arXiv:2103.00020) — a multimodal embedding model producing text and image embeddings in the same vector space. One vector index serves both query modes; the text query "cat" and an image of a cat produce numerically similar embeddings.
- Two model variants trained / fine-tuned:
- Designs search — fine-tuned on UI images from public Figma Community files. Used for frame-level search.
- Components search — "very similar" model, fine-tuned on publicly available Community UI kits. Used for Assets-panel search.
- Explicit: no private Figma files or customer data used for training.
- Rejected alternative: an early experiment embedded a textual JSON representation of the user's selection. Image embeddings produced better results and let screenshot-based queries share the same code path, so the JSON route was dropped.
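The shared text/image vector space is what lets one index serve both query modes: similarity reduces to a cosine comparison between embeddings regardless of which modality produced them. A toy sketch with made-up 3-d vectors (a real CLIP model outputs 512-d or larger embeddings):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical stand-in embeddings, not real model output.
text_cat  = [0.90, 0.10, 0.20]   # embedding of the text "cat"
image_cat = [0.85, 0.15, 0.25]   # embedding of a cat photo
image_ui  = [0.10, 0.90, 0.40]   # embedding of a UI screenshot

# The text query lands nearer the matching image than the unrelated one.
closer = cosine(text_cat, image_cat) > cosine(text_cat, image_ui)
```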
Storage¶
- DynamoDB — frame metadata and embeddings. Figma also runs its own RDS cluster but chose DynamoDB here because the workload "requires only a simple key-value store, writing and reading at high throughput. No transactions or foreign key relationships are required."
- S3 — thumbnail renders uploaded per indexable frame.
- OpenSearch k-NN — the vector-search index itself. OpenSearch was already deployed widely at Figma for traditional lexical search, so adding k-NN was the lower-friction option. Embeddings are written with additional metadata (frame name, file ID + name, containing project / team / org) to support faceted search (filters) alongside vector k-NN.
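The combination of a k-NN vector field plus metadata fields for faceting can be sketched as an OpenSearch index body. The knn_vector mapping type and index.knn setting are standard OpenSearch k-NN; the field names and dimension here are assumptions, not Figma's schema:

```python
# Illustrative OpenSearch k-NN index body: one vector field for
# similarity search, keyword/text fields for filters and facets.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "embedding":  {"type": "knn_vector", "dimension": 512},  # assumed dim
            "frame_name": {"type": "text"},
            "file_id":    {"type": "keyword"},
            "file_name":  {"type": "text"},
            "project_id": {"type": "keyword"},
            "team_id":    {"type": "keyword"},
            "org_id":     {"type": "keyword"},
        }
    },
}
```

With the opensearch-py client, a body like this would be passed to client.indices.create.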
Inference — SageMaker¶
- AWS SageMaker hosts the embedding models. Batched inference requests; input is a series of thumbnail URLs, output is a series of embeddings.
- Batch-size sweet spot: "past some threshold we started to see latency growing linearly with batch size, instead of a sublinear batching effect." Inference-time parallelism explicitly engineered for both image download and image resize / normalisation inside the container.
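The shape of the inference path — fixed-size batches, with the I/O-bound download and preprocess fanned out in parallel before the model call — can be sketched as follows. The batch size, worker count, and both stand-in functions are assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 32  # tuned empirically; past some threshold latency grows linearly

def fetch_and_preprocess(url: str) -> str:
    """Stand-in for downloading and resizing/normalising one thumbnail."""
    return f"tensor({url})"

def embed_batch(urls: list[str]) -> list[str]:
    """Parallelise the I/O-bound prep, then make one batched model call."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        tensors = list(pool.map(fetch_and_preprocess, urls))
    return [f"embedding({t})" for t in tensors]  # stand-in for the forward pass

def embed_all(urls: list[str]) -> list[str]:
    out = []
    for i in range(0, len(urls), BATCH_SIZE):
        out.extend(embed_batch(urls[i : i + BATCH_SIZE]))
    return out
```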
Indexing pipeline — 4 discrete queued jobs¶
The pipeline is decomposed into discrete jobs queueing each other, following patterns/pipeline-stage-as-discrete-job. Figma's stated rationale: "Separating the individual steps of the pipeline into discrete jobs gives us more precise control over batching and retry behavior."
- Identify + thumbnail. Run a headless server-side C++ build of the Figma editor in an async job to enumerate indexable frames (unpublished frames in a Figma file aren't otherwise enumerable). Persist frame metadata to DynamoDB. Render thumbnails and upload to S3. Enqueue next stage.
- Generate embeddings. Send batches of thumbnail URLs to the SageMaker endpoint. Persist embeddings (KV store). Enqueue next stage.
- Persist to index. Write embeddings + searchable metadata to OpenSearch k-NN. Terminal.
For published components, a similar pipeline fires on library publish, running against the components-search CLIP variant.
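The stage-enqueues-next-stage shape can be sketched with a plain queue. Stage names mirror the post; the queue, payloads, and handlers are illustrative (a real system would add per-stage batching and retry, which is the stated rationale for the decomposition):

```python
from collections import deque

queue = deque()   # (stage_name, payload) tuples
log = []          # records the order stages actually ran

def identify_and_thumbnail(payload):
    log.append(f"thumbnails for {payload['file']}")
    queue.append(("generate_embeddings", payload))   # enqueue next stage

def generate_embeddings(payload):
    log.append(f"embeddings for {payload['file']}")
    queue.append(("persist_to_index", payload))

def persist_to_index(payload):
    log.append(f"indexed {payload['file']}")          # terminal stage

HANDLERS = {
    "identify_and_thumbnail": identify_and_thumbnail,
    "generate_embeddings": generate_embeddings,
    "persist_to_index": persist_to_index,
}

queue.append(("identify_and_thumbnail", {"file": "design.fig"}))
while queue:
    stage, payload = queue.popleft()
    HANDLERS[stage](payload)  # retry/batch policy would hook in per stage here
```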
Query path — hybrid lexical + vector interleave¶
- Both indexes queried simultaneously — the existing lexical fuzzy-match index and the new OpenSearch k-NN vector index.
- Per-index scores aren't directly comparable → applied min-max normalization per index, boost exact lexical matches, then interleave by updated score. Pattern: patterns/hybrid-lexical-vector-interleaving.
- Worked example: "mouse" returns the icon titled "Mouse" as well as cursor-adjacent icons.
- Rollout intent: preserve existing lexical behaviour safely as vectors were added.
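The normalise-boost-interleave step can be sketched directly. A minimal version, assuming two score maps keyed by document title and an illustrative boost value (the exact merge and boost logic are assumptions):

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale one index's scores to [0, 1] so the two lists are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid(lexical: dict[str, float], vector: dict[str, float],
           query: str, boost: float = 0.5) -> list[str]:
    merged = {}
    for scores in (min_max(lexical), min_max(vector)):
        for doc, s in scores.items():
            merged[doc] = max(merged.get(doc, 0.0), s)
    for doc in merged:                        # boost exact lexical matches
        if doc.lower() == query.lower():
            merged[doc] += boost
    return sorted(merged, key=merged.get, reverse=True)

# The post's worked example: exact title match first, then vector neighbours.
ranked = hybrid(
    lexical={"Mouse": 12.0, "House": 10.0},
    vector={"Mouse": 0.91, "Cursor": 0.88, "Pointer": 0.80},
    query="mouse",
)
```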
Two cost optimisations that dominated¶
Net framing: compute cost was driven "not by embedding generation, but by identifying and thumbnailing meaningful designs within Figma files." The embedding model was the obvious suspect; the frame enumeration + thumbnail path turned out to be where the money went.
- Ruby → C++. Initial implementation serialized the whole Figma file as JSON and parsed that in Ruby — "extremely slow and memory intensive." Rewriting in C++, eliminating intermediate serialization, yielded "huge runtime improvements and memory reductions."
- GPU → CPU rendering via
llvmpipe. Moved thumbnailing off GPU on older AWS instances onto CPU-basedllvmpipesoftware rendering on newer instance types. CPU instance types are cheaper and the newer generation is faster — workload completes in less time for less money. Inverts the default "GPU is always cheaper for rendering" assumption for bulk thumbnail workloads. - (also flagged) Edit-quiescence debounce (4h) reduced processable data to 12% — ~8× load reduction (patterns/edit-quiescence-indexing).
- (also flagged) Cluster autoscaling on diurnal traffic.
OpenSearch index cost reductions¶
OpenSearch was the second-biggest cost (after frame enumeration + thumbnail). Two mitigations:
- Corpus size cut in half by removing from the indexable set: draft files, within-file duplicate designs, and unmodified file copies. This is the quantified form of patterns/selective-indexing-heuristics. Flagged as a product improvement too — "not surfacing duplicate designs within files is a nice user experience improvement."
- Vector quantization — by default OpenSearch k-NN stores each vector element as a 4-byte float; quantization compresses the representation for a "small reduction in nearest neighbor search accuracy."
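The post doesn't say which quantization scheme OpenSearch applies; as a generic illustration of the trade, here is scalar quantization of a float vector to 8-bit codes — 1 byte per element instead of 4, at the cost of a bounded reconstruction error:

```python
def quantize(vec: list[float], bits: int = 8) -> tuple[list[int], float, float]:
    """Map each float to an unsigned int code, keeping (lo, scale) so the
    vector can be approximately reconstructed at query time."""
    lo, hi = min(vec), max(vec)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels or 1.0
    return [round((v - lo) / scale) for v in vec], lo, scale

def dequantize(codes: list[int], lo: float, scale: float) -> list[float]:
    return [lo + c * scale for c in codes]

vec = [0.12, -0.55, 0.98, 0.0]
codes, lo, scale = quantize(vec)
approx = dequantize(codes, lo, scale)
# Per-element error is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(vec, approx))
```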
Two candidly-reported OpenSearch bugs¶
- Segment-replication replica non-determinism. End-to-end tests showed periodic non-determinism. Root cause: replica queries in OpenSearch returned different results than primary queries, traced to a "Reader cannot be cast to class SegmentReader" error in the delete path affecting replicas on clusters using segment replication. Figma partnered with the AWS OpenSearch team; the fix shipped upstream in k-NN PR #1808. (Source: sources/2026-04-21-figma-the-infrastructure-behind-ai-search-in-figma)
- _source slimming wipes embeddings on update. To save storage and query latency, Figma removed the embedding vector from OpenSearch's _source field. Consequence: because OpenSearch uses _source to reconstruct updated documents, an update to any field (e.g. file name) silently wiped the embedding off the document. Fix: on update, re-fetch the embedding from DynamoDB and re-inject it, preserving the _source optimisation on the read path. Canonical patterns/source-field-slimming-with-external-refetch.
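The re-fetch-and-re-inject fix can be sketched as follows. The store names, document shape, and update helper are illustrative, not Figma's code:

```python
# DynamoDB remains the source of truth for vectors; OpenSearch's _source
# deliberately omits them, so every update must re-inject the embedding.
dynamo = {"frame-1": [0.1, 0.2, 0.3]}
opensearch_doc = {"frame_name": "Checkout", "embedding": [0.1, 0.2, 0.3]}

def update_document(doc: dict, fields: dict, frame_id: str) -> dict:
    # Simulate _source-based reconstruction: the stored source lacks the vector.
    updated = {k: v for k, v in doc.items() if k != "embedding"}
    updated.update(fields)
    # The fix: re-fetch the embedding from the KV store and re-attach it.
    updated["embedding"] = dynamo[frame_id]
    return updated

new_doc = update_document(opensearch_doc, {"frame_name": "Checkout v2"}, "frame-1")
```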
Scaling driver — small teams force full-fleet indexing¶
Indexing cost matters more than marginal-user economics would suggest because of a workload shape:
"For even a single user to experience the search features as intended, all of their team's data must be indexed... Paradoxically, with even a small percentage of users onboarded, we'd quickly converge on having to index almost all teams at Figma — most of our teams are small and there are many of them!"
Net: backfill economics, not onboarding economics, drive the cost model — motivating every optimization above.
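A back-of-envelope model makes the paradox concrete. Assuming (hypothetically) that users are onboarded independently with probability p, a team needs full indexing whenever at least one member is onboarded:

```python
def teams_needing_index(p: float, team_size: int) -> float:
    """Probability a team of the given size has at least one onboarded user,
    i.e. the fraction of such teams whose data must be fully indexed."""
    return 1 - (1 - p) ** team_size

# With only 10% of users onboarded, ~41% of 5-person teams already need
# full indexing: indexing demand far outpaces the onboarded-user share,
# and it only grows with p.
fraction = teams_needing_index(0.10, team_size=5)
```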
Relationship to Dropbox Dash search¶
Dropbox systems/dash-search-index and Figma AI Search share the retrieval-quality-caps-downstream-AI-quality chain. Divergences:
- Corpus modality. Dash's primary modality is text documents/messages from third-party apps plus multimodal content via patterns/multimodal-content-understanding; Figma's is visual design frames — frames are first-class, text is a side signal.
- Labeling pipeline shape. Dash uses patterns/human-calibrated-llm-labeling + DSPy prompt optimisation on an LLM judge (see systems/dash-relevance-ranker); Figma uses human-only labeling on an in-product infinite-canvas surface (this article describes no LLM judge). Complementary data points on how to stand up an AI-search eval pipeline.
- Indexing-trigger policy. Dash ingests on content creation / update events per source; Figma adds an edit-quiescence buffer (patterns/edit-quiescence-indexing) — a freshness-vs-quality trade point Dash doesn't call out.
- Hybrid retrieval. Both run BM25-shape lexical + dense-vector hybrid; Dash uses a learned ranker on top across multiple signals (systems/dash-relevance-ranker), Figma uses min-max-normalised interleave with exact-match boost across two OpenSearch indexes (one lexical, one k-NN). Both are canonical concepts/hybrid-retrieval-bm25-vectors instances.
Future work (stated)¶
- Bring visual + semantic search to the Figma Community (public community content, not just per-team private).
- Design autocomplete ship (the originally-hackathon-prototyped feature that motivated the search investment).
See also¶
- patterns/selective-indexing-heuristics — the indexing-policy pattern Figma uses.
- patterns/edit-quiescence-indexing — the 4h-no-edit indexing trigger.
- patterns/visual-eval-grading-canvas — the in-product labeling surface.
- concepts/similarity-tier-retrieval — the cross-tier quality bar.
- concepts/relevance-labeling, concepts/vector-embedding, concepts/vector-similarity-search — upstream concepts.
- companies/figma — company overview.