Databricks AI Functions
Databricks AI Functions are SQL-callable LLM inference primitives
exposed as built-in functions (the canonical one is ai_query) inside
Databricks SQL / DataFrame / Structured Streaming. They run LLM calls
inline with table data — no separate model-serving infrastructure
required.
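As a rough illustration of what "inline with table data" means, the sketch below renders the basic two-argument form, ai_query(endpoint, request), as a SQL statement. The helper name, endpoint name, table, and column are all hypothetical and do not come from the ingested sources. The statement is only built as text here, not executed against a workspace.

```python
def build_ai_query_sql(endpoint: str, prompt: str, text_column: str, table: str) -> str:
    """Render a SELECT that applies ai_query to every row of `table`.

    Hypothetical helper for illustration; it only builds SQL text.
    """
    # This sketch does not attempt SQL escaping, so it rejects quotes outright.
    for value in (endpoint, prompt):
        if "'" in value:
            raise ValueError("single quotes are not handled in this sketch")
    return (
        f"SELECT {text_column},\n"
        f"       ai_query('{endpoint}', CONCAT('{prompt}', {text_column})) AS response\n"
        f"FROM {table}"
    )


sql = build_ai_query_sql(
    endpoint="my-chat-endpoint",       # hypothetical serving endpoint name
    prompt="Summarise this advisory: ",
    text_column="advisory_text",       # hypothetical column
    table="main.demo.advisories",      # hypothetical table
)
print(sql)
```

The model response comes back as an ordinary column, so it can be filtered, joined, or written to a table like any other query result.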
Stub page. Documented from two ingested sources so far; the operational profile and pricing details are not yet covered.
Capabilities cited in ingested sources
- ai_query for inference. A single SQL function that takes a model endpoint, a prompt, and (optionally) input columns, and returns the model's response as a column.
- Multimodal input. Image columns (e.g. rendered PDF pages stored in Unity Catalog Volumes) can be passed directly. The MapAid groundwater pipeline sends each scanned-page image through ai_query for classification, with no OCR prerequisite. See patterns/visual-first-document-extraction.
- Structured JSON output. ai_query enforces a response schema, so the team can capture site name / GPS / depth / yield as typed columns even when the underlying scans embed those fields in different formats. Canonicalised in concepts/schema-constrained-llm-output.
- In-SQL iteration. Because AI Functions are just functions inside SQL/DataFrames, prompt tuning and schema changes are iterated like any query refactor, with no separate inference service to deploy.
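To make the "typed columns" idea concrete, here is a minimal sketch of the contract a schema-constrained response satisfies, checked on the consuming side. It does not show how ai_query enforces the schema, which happens inside the platform. The field names and units (depth_m, yield_lps) are assumptions for illustration, since the source names only site name, GPS, depth, and yield.

```python
import json

# Hypothetical target schema: field name -> accepted Python type(s).
WELL_SCHEMA = {
    "site_name": str,
    "gps": str,
    "depth_m": (int, float),
    "yield_lps": (int, float),
}


def parse_well_record(raw: str) -> dict:
    """Parse one model response and check it against WELL_SCHEMA."""
    record = json.loads(raw)
    if not isinstance(record, dict):
        raise ValueError("response is not a JSON object")
    missing = WELL_SCHEMA.keys() - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for field, expected in WELL_SCHEMA.items():
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"{field}: unexpected type {type(value).__name__}")
    # Keep only the schema fields so every row has the same columns.
    return {field: record[field] for field in WELL_SCHEMA}


row = parse_well_record(
    '{"site_name": "Borehole A", "gps": "1.25,36.80", "depth_m": 45, "yield_lps": 1.5}'
)
```

When the schema is enforced at inference time, a check like this should never fail. Its value in the sketch is to show why downstream SQL can treat the output as ordinary typed columns.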
Architectural role
In the MapAid groundwater pipeline AI Functions appear in three distinct stages:
- Classification pass: ai_query over sampled page images to produce Dewey Decimal codes, geographic tags, and a water-relevance flag.
- Extraction pass: ai_query over per-page OCR'd text + schema-constrained output to emit JSON well/borehole records.
- Judge pass: ai_query against a separate judge model to score each classification on accuracy / completeness / consistency. See patterns/llm-judge-as-inline-pipeline-stage.
The same primitive playing all three roles is the point: SQL-native inference + structured output makes the pipeline an SQL/DataFrame job, not a custom inference service. See patterns/sql-native-multimodal-llm-inference.
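The three-stage shape can be sketched with one call primitive reused for every stage. Here call is a plain Python stand-in for ai_query, not the real function. The endpoint names, prompts, and page fields are hypothetical, and the fake model exists only so the sketch runs without a workspace.

```python
from typing import Callable

# Stand-in for ai_query: (endpoint, request) -> response text. Hypothetical.
ModelCall = Callable[[str, str], str]


def run_page(page: dict, call: ModelCall) -> dict:
    """Run classify -> extract -> judge for one page with the same primitive."""
    classification = call(
        "classifier-endpoint", f"Classify page image: {page['image_path']}"
    )
    extraction = call(
        "extractor-endpoint", f"Extract well records as JSON: {page['ocr_text']}"
    )
    # The judge uses a separate endpoint from the one that classified.
    verdict = call(
        "judge-endpoint", f"Score this classification: {classification}"
    )
    return {
        "page_id": page["page_id"],
        "classification": classification,
        "extraction": extraction,
        "judge": verdict,
    }


def fake_call(endpoint: str, request: str) -> str:
    """Deterministic fake model so the sketch is runnable offline."""
    return f"[{endpoint}] {len(request)} chars"


result = run_page(
    {"page_id": 1, "image_path": "/Volumes/demo/page_001.png", "ocr_text": "BH-7 depth 45 m"},
    fake_call,
)
```

Every stage is the same kind of call, so changing a prompt or swapping the judge endpoint is an edit to the job itself and does not require deploying a separate service.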
Seen in
- sources/2026-05-13-databricks-the-rosetta-stone-of-cps-clarotys-ai-powered-library: second canonical instance. Claroty's CSAF (Common Security Advisory Framework, JSON-formatted vulnerability advisories) → Delta-table ETL pipeline orchestrated by Lakeflow Jobs. "In this ETL, and in many more use cases, we use LLMs to enrich the data — from classification tasks and AI Functions like ai_query, using various Serving endpoints and MLflow to evaluate the answers we get from the LLM, using statistic metrics and LLM-as-a-judge, and monitor the cost." Same shape as the MapAid pipeline (ai_query for inline LLM enrichment + judge pass for evaluation), different domain (security advisories vs groundwater PDFs). The LLM-as-a-judge usage here is conservative-ternary (pass / fail / unknown) and explicitly framed against the absence of fully-labelled ground truth in "real-world CPS data". It composes with patterns/llm-judge-as-inline-pipeline-stage for the step-by-step reliability scoring and with concepts/vector-search-no-scale-to-zero for the cost-efficiency observation in bursty event-driven workloads. Endpoint heterogeneity is explicit: ai_query calls fan out across "various Serving endpoints", not a single foundation-model dependency.
- sources/2026-05-11-databricks-unlocking-the-archives: canonical wiki instance. Three uses (classify / extract / judge) inside one pipeline; multimodal page-image inputs; schema-constrained JSON outputs; iteration-without-separate-infra explicitly called out as the architectural value prop.
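One way to read the conservative-ternary judge is that anything other than a clean pass or fail is treated as unknown. The sketch below assumes that reading. The source names the three labels but does not describe the parsing rule, so the fallback to unknown is an assumption, as are the function and class names.

```python
from collections import Counter
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNKNOWN = "unknown"


def parse_verdict(raw: str) -> Verdict:
    """Map raw judge text to a verdict; unrecognised output becomes UNKNOWN."""
    label = raw.strip().lower()
    try:
        return Verdict(label)
    except ValueError:
        # Assumed conservative rule: never guess pass or fail from free text.
        return Verdict.UNKNOWN


def summarise(raw_outputs: list[str]) -> dict[str, int]:
    """Count verdicts so the unknown share can be monitored over time."""
    counts = Counter(parse_verdict(raw).value for raw in raw_outputs)
    return {verdict.value: counts.get(verdict.value, 0) for verdict in Verdict}


summary = summarise(["pass", " FAIL ", "probably fine", "unknown"])
```

Keeping unknown as its own bucket fits a setting without fully labelled ground truth, because ambiguous cases stay visible and are not forced into pass or fail.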
Related
- systems/databricks
- systems/databricks-foundation-model-api
- systems/databricks-model-serving
- systems/delta-lake
- systems/unity-catalog
- systems/lakeflow-jobs
- systems/mlflow
- systems/claroty-cps-library
- concepts/schema-constrained-llm-output
- concepts/multimodal-document-understanding
- concepts/llm-as-judge
- concepts/vector-search-no-scale-to-zero
- patterns/sql-native-multimodal-llm-inference
- patterns/visual-first-document-extraction
- patterns/llm-judge-as-inline-pipeline-stage