
Oxla

Oxla is a C++-based distributed SQL query engine purpose-built for high-performance federated analytics, acquired by Redpanda and announced as a component of the Agentic Data Plane (ADP) on 2025-10-28. Positioned as ADP's SQL-engine piece alongside Redpanda Streaming (the log) and Redpanda Connect (the connectivity suite).

Source of announcement: Redpanda 2025-10-28 Governed autonomy.

Positioning

Announced across two coordinated 2025-10-28 Redpanda posts:

  • Introducing the Agentic Data Plane (Gallego, founder-voice) discloses Oxla's technical details (rpk oxla CLI, single binary, C++, PostgreSQL wire protocol, separated compute-storage, Iceberg-native) and early-preview timeline (mid-December 2025).
  • Governed autonomy (unsigned companion) frames Oxla's role in ADP as federated-query surface for agentic SQL.

Verbatim from the governance-framing companion post:

"Redpanda has acquired Oxla, a next-generation distributed SQL engine purpose-built for high-performance federated analytics. Oxla's C++-based engine will power low-latency, massively parallel, agentic SQL access across live streams and point-in-time data. This acquisition transforms the ADP into a truly agent-ready query platform, supporting materialized views for streaming transformations and federated queries spanning Apache Iceberg, Apache Kafka topics, and a broad suite of legacy data sources."

Verbatim from the product-launch post:

"Redpanda has acquired Oxla: a team obsessed with performance, correctness, and data catalogs. Oxla (rpk oxla) is a distributed query engine, single binary, written in C++, built for demanding Iceberg queries and merging real-time context with historical data. It is a PostgreSQL wire protocol engine with separated compute-storage, oriented to bring low-latency context management for agents looking to merge streams or large data sets, search, or simply filter aggregations in real time."

Architectural properties named across the two posts:

  1. C++-based single binary — performance substrate; contrasts with JVM-based query engines (Trino, Presto, Spark SQL) and Rust-based ones (DataFusion / InfluxDB IOx) on the memory-model and GC axes.
  2. Distributed / massively parallel — no node-count or cluster-shape disclosed in the announcement.
  3. PostgreSQL wire protocol — clients connect as if to Postgres; inherits the Postgres-driver ecosystem across every language.
  4. Separated compute and storage — the 2010s-canonical warehouse-engine property (Snowflake, BigQuery, Redshift RA3) carried forward into Oxla.
  5. Iceberg-native — "built for demanding Iceberg queries"; first-class Apache Iceberg consumption rather than federation-over-Iceberg-via-connector.
  6. Federated query surface — "federated queries spanning Apache Iceberg, Apache Kafka topics, and a broad suite of legacy data sources". Federates across live streams + point-in-time data in one query plane. See concepts/federated-vs-indexed-retrieval for the axis.
  7. Materialized views for streaming transformations — incremental-maintenance over streams; positions Oxla as continuous query engine, not just batch-SQL-over-Iceberg.
  8. rpk oxla — CLI-integrated into Redpanda's existing rpk tool (same family as rpk connect, rpk cloud).
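Property 7's incremental-maintenance shape can be sketched as a toy model: fold each stream event into a running aggregate so reads never rescan history. All names here are invented for illustration; Oxla's actual view machinery and refresh strategy are undisclosed in the announcement.

```python
# Toy incremental materialized view: maintains SUM(amount) GROUP BY key
# as stream events arrive, instead of rescanning history on each read.
# All class and method names are illustrative, not Oxla's API.
from collections import defaultdict

class IncrementalSumView:
    def __init__(self):
        self.totals = defaultdict(float)  # key -> running sum

    def apply(self, event):
        """Fold one stream event into the view (incremental maintenance)."""
        self.totals[event["key"]] += event["amount"]

    def read(self, key):
        """Reads are O(1): no rescan of the underlying stream."""
        return self.totals[key]

view = IncrementalSumView()
for ev in [{"key": "a", "amount": 3.0},
           {"key": "b", "amount": 1.5},
           {"key": "a", "amount": 2.0}]:
    view.apply(ev)

print(view.read("a"))  # 5.0
```

The contrast is with full refresh, which would recompute every group from the complete event history on each read — the "batch-SQL-over-Iceberg" posture the announcement distinguishes Oxla from.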

Agentic SQL role in ADP

Oxla's role in ADP is to give agents a single SQL surface as their universal data-access interface:

"Agents can reason over unbounded, real-time datasets with warehouse-grade precision using SQL as their universal interface and access context through lightweight MCP servers."

The architectural claim: SQL is the agent's universal interface for data, so an agent-native query engine should federate across whatever data modalities the enterprise has, rather than forcing data-movement into a warehouse.
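The federate-don't-move claim above reduces to one scan spanning live and historical sources under a single predicate. A toy sketch (data structures invented; a real engine would push the predicate down to each source rather than filter centrally):

```python
# Toy federated scan: one predicate evaluated over a live stream buffer and
# a historical table in a single call — a caricature of "one query plane"
# spanning streams and point-in-time data. All structures are invented.

def federated_scan(live_buffer, historical_table, predicate):
    """Yield matching rows from both sources, tagged with provenance."""
    for row in historical_table:
        if predicate(row):
            yield {**row, "_source": "historical"}
    for row in live_buffer:
        if predicate(row):
            yield {**row, "_source": "live"}

historical = [{"user": "u1", "clicks": 10}, {"user": "u2", "clicks": 3}]
live = [{"user": "u1", "clicks": 2}]

rows = list(federated_scan(live, historical, lambda r: r["user"] == "u1"))
print(len(rows))  # 2
```

The warehouse alternative would ETL `live` into `historical` first; the federated shape queries both in place, which is exactly where the undisclosed consistency model (see below) becomes load-bearing.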

Mechanism not disclosed

The 2025-10-28 announcement is vision + acquisition disclosure, not a technical post. Not disclosed:

  • Query planner architecture (cost-based vs rule-based; optimiser depth).
  • Federation mechanism (wire protocol over Kafka topics; push-down discipline for Iceberg; connector model for legacy sources).
  • Consistency model across federated sources (read-committed? snapshot-isolation? eventual-consistency bounded-staleness?).
  • Materialized-view refresh strategy (eager vs lazy; incremental vs full).
  • Benchmarks (no throughput / latency / node-count numbers).
  • Licensing / open-source stance post-acquisition.
  • Integration with Iceberg REST catalogs.
  • Interaction with Redpanda Iceberg Topics (which already register topics as Iceberg tables — presumably Oxla consumes them as tables).

Pre-acquisition

Oxla pre-existed Redpanda's acquisition as an independent vendor at oxla.com. The 2025-10-28 post is the first wiki mention; no prior Redpanda corpus reference.

Pre-acquisition positioning (public web): Oxla was marketed as a distributed OLAP database compatible with PostgreSQL wire protocol, designed for federated analytics workloads. Wiki does not yet have a pre-acquisition technical ingest; this page starts as a stub post-acquisition.

Relationship to siblings

  • Redpanda Iceberg Topics is the topic-to-table side of the Iceberg integration (streaming broker writes Iceberg-compatible tables). Oxla is plausibly the query-side piece — reads the tables Iceberg Topics write. Announcement doesn't walk this composition, but the architectural shape fits.
  • systems/apache-iceberg is one target of Oxla's federated query surface.
  • systems/apache-kafka topics are one target of Oxla's federated query surface — allowing SQL over live Kafka streams without Kafka Streams / Flink / ksqlDB intermediate materialisation.
  • "Legacy data sources" — unnamed; plausibly existing Postgres / MySQL / S3 / HDFS connectors via Redpanda Connect's catalog, but not disclosed.

Caveats

  • Zero mechanism depth in the canonical wiki source. The announcement is acquisition framing; all architectural claims are headline-altitude.
  • No node-shape / cluster-model disclosure. "Massively parallel" without parallelism discipline named.
  • Consistency-model silence on federated queries across streams + Iceberg + legacy sources. Non-trivial to reason about without disclosure.
  • Post-acquisition product trajectory unclear. Will Oxla remain standalone? Become ADP-only? Open-source pivot? All unspecified.
  • No benchmarks. "Low-latency, massively parallel, warehouse-grade precision" — zero numeric backing in the canonical wiki source.
  • First wiki ingest post-acquisition. Future ingests (Oxla engineering blog, if published under Redpanda; Redpanda Oxla technical deep-dives) will update this page.

Query manager rewrite (2026-01-27, Engineering Den)

First post-acquisition disclosure of Oxla internals: a rewrite of the query manager — "the component responsible for the lifecycle of currently-running queries" (Source: Redpanda 2026-01-27 Engineering Den).

Pre-rewrite pathologies verbatim:

  • "Queries could get stuck in 'finished' or 'executing' while still holding onto resources."
  • "Different parts of the system disagreed about what was actually happening."
  • "A query might show as scheduled in one place and finished in another."
  • "To avoid deadlocks, the old code gathered running queries, spawned async work per thread, and sometimes had to retry cancellation from a different thread entirely." — canonicalised on the wiki as concepts/async-cancellation-thread-spawn-antipattern.

Rewrite substrate: a deterministic state machine per query, with every transition logged and explicit teardown at terminal states — the composed pattern canonicalises Oxla's rewrite as the wiki instance of query-lifecycle-manager as state machine.

Verbatim core claim: "The new scheduler is built as a deterministic state machine. At any point, it's in a known state, handling a specific event, and transitioning predictably. Every transition is logged." + "When a query finishes, the scheduler and executors are torn down cleanly. Finished queries stay finished. Canceled queries are accounted for. Nothing hangs around quietly consuming resources anymore."
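The quoted shape — known states, explicit events, logged transitions, teardown at terminal states — admits a minimal sketch. The concrete states and events below are invented; the post discloses neither:

```python
# Minimal deterministic query-lifecycle state machine in the spirit of the
# rewrite: every transition is table-driven and logged, terminal states
# trigger explicit teardown, and illegal transitions are rejected rather
# than silently absorbed. States/events are invented for illustration.
from enum import Enum

class State(Enum):
    SCHEDULED = "scheduled"
    EXECUTING = "executing"
    FINISHED = "finished"
    CANCELED = "canceled"

TRANSITIONS = {
    (State.SCHEDULED, "start"):  State.EXECUTING,
    (State.SCHEDULED, "cancel"): State.CANCELED,
    (State.EXECUTING, "finish"): State.FINISHED,
    (State.EXECUTING, "cancel"): State.CANCELED,
}
TERMINAL = {State.FINISHED, State.CANCELED}

class QueryLifecycle:
    def __init__(self, query_id):
        self.query_id = query_id
        self.state = State.SCHEDULED
        self.log = []                 # every transition logged
        self.resources_held = True

    def dispatch(self, event):
        nxt = TRANSITIONS.get((self.state, event))
        if nxt is None:
            # "Finished queries stay finished": no event can move a
            # terminal query, and no two components can disagree on state.
            raise ValueError(f"{event!r} not valid in {self.state}")
        self.log.append((self.state, event, nxt))
        self.state = nxt
        if self.state in TERMINAL:
            self.resources_held = False  # explicit, immediate teardown

q = QueryLifecycle("q1")
q.dispatch("start")
q.dispatch("finish")
print(q.state, q.resources_held)  # State.FINISHED False
```

Contrast with the pre-rewrite pathologies: a single transition table is the state's sole owner, so a query cannot "show as scheduled in one place and finished in another", and cancellation is just an event dispatched to the machine rather than async work retried from another thread.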

Validation frame: "~25,000 queries on one- and three-node clusters" without reproducing prior pathologies; no throughput or latency benchmarks — reliability-first testing. Production rollout "within days" of 2026-01-27.

Debuggability payoff verbatim: "Bugs still happened, as they always do with new code, but they were much easier to track down. Being able to trace state transitions made fixes straightforward instead of exploratory." — issues "fixed in days instead of weeks".

Mechanism depth gap: the post is short engineering-diary voice (~600 words); no state diagram, no enumeration of states or events, no concurrency-model disclosure, no cancellation-protocol depth beyond "event dispatched to the state machine".
