
Gemma

Gemma is Google's family of open-weight LLMs (currently up to Gemma 3). First wiki mention: sources/2026-02-13-netflix-scaling-llm-post-training-at-netflix, which cites it as one of the modern model families supported by Netflix's internal Post-Training Framework, alongside Qwen3, Qwen3 MoE, and GPT-OSS.

Hosted on Databricks FMAPI (2026-05-22)

Gemma 3 12B is served on the Foundation Model APIs with implicit prompt caching enabled — part of the 2026-05-22 GA rollout of caching to the open-weights model catalog (Source: sources/2026-05-22-databricks-accelerating-llm-inference-with-prompt-caching-for-open-source-models).

Embedding model for Instacart Semantic IDs (2026-06-02)

The 2026-06-02 Semantic IDs: Product Understanding at Scale post (Source: sources/2026-06-02-instacart-semantic-ids-product-understanding-at-scale) discloses Gemma as the off-the-shelf embedding model for the discovery flavor of Instacart's two-flavor Semantic IDs design. The upstream pipeline:

1. Gemini Flash extracts structured attributes (product type, key ingredients, dietary tags, format) from product text and strips marketing copy.
2. Gemma (off-the-shelf, "general-purpose embedding model") embeds the cleaned representation.
3. The Gemma embedding feeds the RQ-VAE quantizer.
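
The last step of the pipeline above can be illustrated with a minimal residual-quantization sketch. It is not Instacart's implementation: a real RQ-VAE learns its encoder, decoder, and codebooks jointly, whereas this example only shows the lookup that turns an embedding into a tuple of code indices. The function names, the toy 2-D codebooks, and the stand-in embedding are all hypothetical; a real Gemma embedding would have far more dimensions.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def residual_quantize(
    embedding: Sequence[float],
    codebooks: Sequence[Sequence[Sequence[float]]],
) -> Tuple[List[int], List[float]]:
    """Quantize an embedding level by level.

    At each level the nearest codeword to the current residual is chosen,
    its index is recorded, and the codeword is subtracted so the next level
    only has to explain what is left over.
    """
    residual = list(embedding)
    codes: List[int] = []
    for codebook in codebooks:
        index = min(
            range(len(codebook)),
            key=lambda i: squared_distance(residual, codebook[i]),
        )
        codes.append(index)
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return codes, residual


# Hypothetical toy codebooks (assumption: hard-coded here, learned in practice).
TOY_CODEBOOKS = [
    [[1.0, 0.0], [0.0, 1.0]],                   # level 0: coarse split
    [[0.25, 0.0], [0.0, 0.25], [-0.25, 0.0]],   # level 1: refinement
]

# Stand-in for an embedding of a cleaned product representation.
toy_embedding = [0.9, 0.2]

semantic_id, leftover = residual_quantize(toy_embedding, TOY_CODEBOOKS)
print(semantic_id)  # [0, 1]
```

Because each level encodes only the residual of the previous one, items that share a leading index share a coarse cluster, and later indices refine within it. That prefix structure is what makes the resulting ID usable as a hierarchical identifier.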

The post tests the hypothesis that "a general-purpose model, given cleaner inputs, can capture nuances that a domain-specific model misses." LLM-based cluster evaluation supports it: Gemma-derived codebooks score higher than those from the in-house ESCI model on thematic coherence and lower on substitutability, which matches the intended discovery use case (homepage feeds, cross-selling, exploration). (See patterns/two-flavor-codebook-precision-vs-discovery and patterns/llm-attribute-extraction-before-embedding.)