

Pydantic

Pydantic is a Python data-validation library that uses type hints to generate runtime validators and parsers. A BaseModel subclass with type-annotated fields yields (a) a JSON schema, (b) a parser that produces typed Python objects from JSON/dict inputs while raising clear ValidationErrors on schema violations, and (c) a serializer back to JSON.

In the LLM-tooling era Pydantic has become the de facto contract surface between LLM components: the application declares a Pydantic model for the expected output, passes the derived JSON schema to the model (via OpenAI's response_format / Anthropic's tool-use / generic structured-output prompting), and treats the returned dict as "parseable or fully incorrect" (Source: concepts/structured-output-reliability).
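The schema half of that contract is a one-liner in Pydantic v2. A minimal sketch, using a hypothetical Answer model (not from the source) to show the JSON schema an application would hand to the model API:

```python
from pydantic import BaseModel

class Answer(BaseModel):  # hypothetical model for illustration
    title: str
    confidence: float

# Derive the JSON schema that would be passed via response_format,
# tool definitions, or a structured-output prompt.
schema = Answer.model_json_schema()

print(schema["type"])      # "object"
print(schema["required"])  # ["title", "confidence"]
```

Both fields lack defaults, so both land in the schema's required list; the provider-side validator can therefore reject an incomplete generation before the application ever parses it.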

Why it matters for LLM pipelines

Free-form LLM text is unreliable as input to programmatic consumers: a missing brace, a stray comment, or a re-ordered key can each make the output unusable. Pydantic collapses the reliability-of-parsing problem into a single boundary:

from pydantic import BaseModel

class TranslationCandidate(BaseModel):
    text: str

class DrafterOutput(BaseModel):
    candidates: list[TranslationCandidate]

The LLM returns a JSON string; DrafterOutput.model_validate_json(...) either returns a typed object or raises a ValidationError. Downstream code never sees raw LLM text.
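A minimal sketch of that boundary, re-declaring the models so the snippet is self-contained; both payloads are invented for illustration:

```python
from pydantic import BaseModel, ValidationError

class TranslationCandidate(BaseModel):
    text: str

class DrafterOutput(BaseModel):
    candidates: list[TranslationCandidate]

# A well-formed payload parses straight into typed objects.
good = '{"candidates": [{"text": "Bonjour"}]}'
out = DrafterOutput.model_validate_json(good)
print(out.candidates[0].text)  # "Bonjour"

# A schema violation (wrong key, so required "text" is missing)
# raises ValidationError instead of leaking bad data downstream.
bad = '{"candidates": [{"txt": "Bonjour"}]}'
try:
    DrafterOutput.model_validate_json(bad)
    parsed = True
except ValidationError:
    parsed = False
print(parsed)  # False
```

Note the failure mode: the malformed payload is rejected at the boundary, so consumers handle exactly two cases, a fully typed object or an exception, with no partially-valid middle ground.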

Seen in

  • sources/2026-02-19-lyft-scaling-localization-with-ai — Lyft's AI localization pipeline uses Pydantic schemas as the contract between the Drafter and Evaluator agents (DrafterOutput, TranslationCandidate, EvaluatorOutput, CandidateEvaluation, Grade enum, best_candidate_index: int). "This ensures type safety, reliable parsing, and clear contracts between Drafter and Evaluator." Canonical wiki instance of Pydantic-as-LLM-contract-surface. See systems/lyft-ai-localization-pipeline.