Redpanda

Redpanda (originally Vectorized, rebranded 2021) is a streaming-platform company whose flagship product is a C++ rewrite of a Kafka-API-compatible broker built on the thread-per-core Seastar framework. The company blog (redpanda.com/blog) covers a mix of product announcements, benchmarks, tutorials, and occasional first-principles technical explainers on streaming-broker internals.

Tier classification

Tier 3 on the sysdesign-wiki. The blog is a mix of:

  • Product PR and launch announcements ("Redpanda 24.3 extends...", "Announcing Redpanda Cloud") — skip unless they disclose real architectural content.
  • Consultative / industry tutorials ("What is real-time data processing", "Create a real-time analytics pipeline") — skip unless they cover distributed-systems internals, scaling trade-offs, or production incidents.
  • First-principles substrate explainers (e.g. this batch-tuning series by James Kinley) — ingest; these are the Kafka-API substrate posts that the wiki's Redpanda + Kafka coverage depends on.
  • Company-culture / hackathon / sales posts — skip.

Apply the generic tier-1 filter: skip unless the post explicitly covers distributed-systems internals, scaling trade-offs, infrastructure architecture, production incidents, storage / networking / streaming design.

Key systems

  • systems/redpanda-cloud-topics-metastore — Disclosed 2026-06-09: a custom Raft-replicated LSM-tree key-value store that maps Kafka offsets to L1 object-storage byte ranges for Cloud Topics. Pluggable persistence: SSTables on object storage (write-through cached), manifest via Raft for fast failover, WAL = Raft log. Enables Whole Cluster Restore and read-replica bootstrap from object storage alone (RPO ≤ 10 min default flush).
  • systems/redpanda-sql — GA in BYOC AWS, 2026-05-27: the productised face of Oxla (acquired 2025-10-28) as the third pillar of the Redpanda Data Platform. Postgres-wire MPP query engine running inside the BYOC cluster, in the customer's VPC, querying live topics + Iceberg cold tier in place via the Iceberg Topics substrate. Four GA properties: in-cluster + Postgres-wire + transparent-two-tier-bridge + ad-hoc-not-predefined. Activation is three steps with no broker restart. Closes the analytical-compute gap in BYOC's compliance / no-egress story; the data-residency invariant (concepts/byoc-data-ownership-for-iceberg) now extends from storage to analytical compute. Headline workload: agent-driven query fan-out (humans serial / agents parallel); explicit ksqlDB foil at the ad-hoc vs predefined axis; reframes the Redpanda Data Platform from streaming vendor into complete data-platform vendor ("one architecture, one operational model, one vendor"). GCP BYOC + BYOVPC + Self-Managed are on the roadmap.
  • systems/redpanda — the streaming broker itself. Kafka-API-compatible, C++ / Seastar / Raft.
  • systems/redpanda-shadowing — 25.3 broker-native cross-region DR feature: byte-for-byte, offset-preserving hot-standby clone of a source cluster in a second region. Replaces MirrorMaker2 and the prior Redpanda Migrator for Redpanda-to-Redpanda DR. RPO/RTO in seconds, bounded by client-timeout settings.
  • systems/redpanda-cloud-topics — GA in Redpanda Streaming 26.1 (2026-03-30 deep-dive; beta in 25.3 preview): a per-topic object-storage-native topic class within a single cluster, with metadata via Raft in-broker (through a placeholder batch) and data straight to S3 / ADLS / GCS. Write path uses a Cloud Topics Subsystem that batches records across all partitions and topics for a short window ("e.g., 0.25s or 4 MB") into a single L0 file, then a background Reconciler rewrites L0 into per-partition, offset-sorted L1 files optimised for historical reads. Read path branches on a per-partition Last Reconciled Offset. Eliminates cross-AZ replication bandwidth cost for latency-tolerant workloads (observability, compliance, model training). Positioned against Confluent's Kora + WarpStream multi-cluster shape. Canonicalises patterns/object-store-batched-write-with-raft-metadata + patterns/background-reconciler-for-read-path-optimization. 2026-05-05 production-tuning retrofit (per the Little's Law in practice post): the initial write pipeline as shipped per the architecture deep-dive was throughput-pinned at ~1 RPS per producer connection because the upload phase's ~100× latency multiplier vs the prior NVMe-Raft replication was not absorbed by enough in-flight concurrency. Fix: an extra concurrency-buffer queue at the upload phase (patterns/concurrency-buffer-stage-for-high-latency-io), sized via Little's Law (Concurrency = Throughput × Latency), with order restoration after the slow stage and producer ack held until metadata is replicated. OMB-validated at GB/s scale "without needing to change any producer configurations." Also canonicalises patterns/pipelined-produce-with-position-guarantee as the pre-existing Redpanda technique the upload-queue retrofit composes on top of.
  • systems/redpanda-operator — the Kubernetes Operator for Redpanda cluster lifecycle management. As of v25.1.x (May 2025) a single unified operator serving both Redpanda Cloud (internal fleet) and customer Self-Managed deployments (patterns/unified-operator-for-cloud-and-self-managed); canonical wiki instance of a vendor retreating from a bundled-GitOps-dependency (FluxCD) and adopting a version-aligned compatibility scheme (operator version = Redpanda core version).
  • systems/redpanda-connect — the ~300-connector Kafka-Connect-alternative integration layer, open-sourced from Benthos. Canonical MCP-tool surface as of 2025-04-03 via rpk connect mcp-server.
  • systems/redpanda-connect-dynamic-plugins — Beta (2025-06-17, v4.56.0, Apache 2.0) dynamic-plugin framework in Redpanda Connect: plugins run as separate OS subprocesses communicating with the host over gRPC on a Unix domain socket, breaking the previous Go-only, compile-into-the-binary constraint. Go and Python SDKs at launch; canonical wiki instance of patterns/grpc-over-unix-socket-language-agnostic-plugin and the patterns/compiled-vs-dynamic-plugin-tradeoff.
  • systems/redpanda-connect-oracle-cdc — sixth-engine Oracle CDC input (oracledb_cdc) in Redpanda Connect v4.83.0 (2026-04-09, enterprise-gated). Rides on Oracle LogMiner; canonical wiki instance of in-source checkpointing (fourth offset-durability class), precision-aware NUMBER mapping via ALL_TAB_COLUMNS + Schema Registry, and Oracle Wallet auth (canonical first wiki instance of file-based credential store). Completes the Redpanda Connect CDC family to six source-database engines (Postgres / MySQL / MongoDB / Spanner / MSSQL / Oracle). Competitive foil: single Go binary vs JVM + Kafka Connect cluster + Debezium.
  • systems/redpanda-byoc — Bring Your Own Cloud deployment model. Data plane runs inside the customer's VPC; Redpanda operates the control plane. Canonical tenet: Data Plane Atomicity — no runtime dependency on externalised services in the write path.
  • systems/redpanda-cloud — Dedicated managed-cluster peer to BYOC: Redpanda runs both control plane and data plane inside Redpanda's infrastructure on the customer's chosen hyperscaler (AWS/GCP/Azure). Canonical wiki instance of cell-based architecture at the streaming-broker altitude — each customer gets an isolated cluster with no external-metadata critical-path dependencies. 99.99% availability SLA / ≥99.999% measured SLO on multi-AZ; replication-factor ≥3 enforced; NVMe-primary + object-storage-tiered secondary; redundant Kafka API / Schema Registry / HTTP Proxy; feedback-control-loop-monitored phased rollouts; continuous chaos + load testing; one customer-elected critical-path exception — GCP Private Service Connect (or AWS PrivateLink equivalent). Canonical production-incident retrospective: the 2025-06-12 GCP-outage retrospective (sources/2025-06-20-redpanda-behind-the-scenes-redpanda-clouds-response-to-the-gcp-outage) — fleet of "hundreds of clusters" survived the global GCP outage with a single materially-affected cluster (staging in us-central-1, ~2h replacement-node latency).
  • systems/redpanda-agents-sdk — 2025-04-03 preview toolkit for enterprise AI agents: Python SDK (durable execution, OpenTelemetry, Pydantic/OpenAI-agents ergonomics) + rpk connect mcp-server + rpk connect agent. "Ruby-on-Rails experience for agents."
  • systems/redpanda-agentic-data-plane — 2025-10-28 productization (Gallego founder-voice announcement) packaging Redpanda streaming + Oxla query engine + systems/redpanda-connect 300+ connectors + a new global governance/observability layer as "a unified runtime and control plane that safely exposes enterprise data to AI agents". First shipped governance feature: Remote MCP + authentication + authorization for OBO workloads with IdP integration (patterns/on-behalf-of-agent-authorization).
  • systems/oxla — acquired 2025-10-28: C++ distributed query engine with PostgreSQL wire protocol, separated compute-storage, and Iceberg-native workload targeting. rpk oxla CLI integration. Early preview mid-December 2025. Positioned as the SQL-filter-then-model-summarize substrate for agent context management.
  • systems/redpanda-iceberg-topics — topic-level integration with Apache Iceberg: a single logical entity that is both a Kafka-protocol topic and an Iceberg table. GA in Redpanda 25.1 (2025-04-07) across AWS, Azure, and GCP with nine disclosed GA-grade properties (custom hierarchical bucketed partitioning, built-in DLQ, Iceberg-spec schema evolution, automatic snapshot expiry, REST catalog sync via OIDC+TLS, transactional writes, automatic table discovery, object-store catalog fallback, tunable workload management). Canonical wiki instance of the streaming-broker-as-lakehouse-Bronze-sink + broker-native catalog registration patterns; Bronze tier of a Medallion-architected lakehouse without external ETL.
  • systems/openmessaging-benchmark — the open-source benchmark framework Redpanda uses (with tc-injected inter-broker latency) for multi-region stretch-cluster performance testing.
  • systems/redpanda-connect — Redpanda's Kafka-Connect alternative integration layer, shipping a family of per-engine CDC input connectors (postgres_cdc, mysql_cdc, mongodb_cdc, gcp_spanner_cdc) as the flagship source class. Canonical wiki differentiator vs Debezium: parallel snapshot of a single large table.
  • systems/openssl — the validated cryptographic module substrate Redpanda embeds for broker-level FIPS 140 compliance via the redpanda-fips + redpanda-rpk-fips packages (self-managed RPM / Debian only at publication).
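The Cloud Topics write path described in the list above batches records from all topics and partitions until a short window or a size limit is reached, then writes one L0 file. A minimal Python sketch of that size-or-time trigger follows, using the "0.25s or 4 MB" example figures quoted above. The class, method, and record shape are hypothetical, and this sketch only checks the window when a record arrives; the real subsystem would presumably also flush on a timer.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

MAX_BYTES = 4 * 1024 * 1024  # "4 MB" example limit from the text
MAX_AGE_S = 0.25             # "0.25s" example window from the text

Record = Tuple[str, int, bytes]  # (topic, partition, payload): hypothetical shape


@dataclass
class L0Batcher:
    """Hypothetical accumulator shared by all topics and partitions."""
    opened_at: Optional[float] = None
    size: int = 0
    records: List[Record] = field(default_factory=list)

    def add(self, record: Record, now: float) -> Optional[List[Record]]:
        """Buffer one record; return the batch to upload if a limit was hit."""
        if self.opened_at is None:
            self.opened_at = now  # the window starts at the first buffered record
        self.records.append(record)
        self.size += len(record[2])
        if self.size >= MAX_BYTES or now - self.opened_at >= MAX_AGE_S:
            return self.flush()
        return None

    def flush(self) -> List[Record]:
        """Hand back everything buffered (one L0 file's worth) and reset."""
        batch, self.records = self.records, []
        self.opened_at, self.size = None, 0
        return batch


batcher = L0Batcher()
first = batcher.add(("orders", 0, b"a"), now=0.00)   # buffered
second = batcher.add(("clicks", 3, b"b"), now=0.10)  # buffered
batch = batcher.add(("orders", 1, b"c"), now=0.30)   # window exceeded, flushes
```

Records from different topics and partitions land in the same batch, which is why the text describes a separate Reconciler that later rewrites L0 into per-partition L1 files.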

Key patterns / concepts

Recent articles

  • 2026-06-18 — Adaptive write request scheduling in Redpanda's Cloud Topics — Describes the write-request scheduler that dynamically adjusts upload parallelism across CPU shards using a buddy-allocator algorithm. Key disclosure: per-shard batching costs $120K/year in PUT requests for a 5-broker cluster; the adaptive scheduler reduces this to ~$3,750/year at low load while scaling throughput under heavy load.
  • 2026-06-09 — Cloud Topics: the Metastore — Part of the Cloud Topics deep-dive series. Discloses the metastore: a custom Raft-replicated LSM-tree key-value store that maps Kafka offsets to L1 object-storage byte ranges. Key architectural disclosure: pluggable persistence layers — SSTables on object storage (write-through cached), manifest replicated via Raft for fast failover, WAL replaced by the Raft log itself. Also details how the metastore enables Whole Cluster Restore from object storage alone and read-replica bootstrap without source-cluster connectivity. Default flush interval = 10 min → RPO ≤ 10 min. Extends systems/redpanda-cloud-topics with the metadata tier architecture that prior posts (architecture, Little's Law, L0 GC) left as hand-waves.
  • 2026-06-02 — How OmniNode uses Redpanda to scale AI agent workflows — Tier-3 guest post by OmniNode founder Jonah Gray on the migration of OmniNode's multi-agent coordination bus from Redis Streams to Redpanda. Tier-3 inclusion gate: passes scope on the contract-driven topic-naming discipline disclosure (a reusable, broker-agnostic pattern), not on the Redpanda angle per se. Migration trigger framed as coordination, not throughput: "we outgrew Redis Streams not because of throughput, but because coordination itself became difficult." At 5 → 12 repos / 100+ event types, the team needed consumer groups, partition-level parallelism, durable replay semantics, topic introspection, and programmatic provisioning — capabilities the Kafka model offers natively. Redpanda chosen for the single-binary affordability: "Kafka-API compatibility in a single binary" lets the broker boot in a single container, fit into an 8 GB development profile, and use the same compose file in local dev, CI, dev containers, and the homelab runtime — "if the broker is operationally heavy, teams eventually stop running it locally. They fake the bus, mock topic creation, or maintain a second development path that doesn't actually validate topic identity." This is the wiki's first canonical disclosure of broker affordability as a discipline-enabling property.
The post's load-bearing architecture content is OmniNode's, not Redpanda's: the contract-driven topic-naming discipline (per-node contract.yaml declares subscribe_topics: / publish_topics:; topic names follow regex-validated shape onex.{kind}.{producer}.{event}.v{N}; StrEnum-backed canonical registry); the named bug class (silent wiring failure: routing-complete vs routing_complete accepted by the broker as different topics, both sides green, no events flow, "the silence was the failure mode"); the disclosed solution topology — single ContractTopicExtractor parser running in three independent places (CI / runtime boot / post-boot validation) per patterns/single-extractor-multi-call-site; a deliberately narrow-scope provisioner (creates missing topics only, no reconciliation of partition counts / replication / retention — "creation is contract-owned; reconciliation is a different problem"). Second strand: cheapest-capable model routing (concepts/cheapest-capable-model-routing) with auto-escalation on quality failure (patterns/auto-escalation-on-quality-failure) and an audit-grade routing receipt (concepts/routing-receipt) per decision (model / tokens / cost / compliance-pass). Disclosed week-of metrics: 75% of tokens never left the building (four on-prem hosts at zero marginal cost), $3.37 cloud spend avoided vs $2.43 actually spent, 1.3% of delegations escalated to a stronger model. Architectural through-line connecting both strands: single source of truth, validated, with no second hidden copy — "the same discipline that keeps topic names from drifting — one canonical source, validated, with no second hidden copy — is what lets me hand work to the cheapest model without hoping it went well. The decision is a contract. The receipt is the evidence. Neither lives in someone's head." Created: 1 source (sources/2026-06-02-redpanda-how-omninode-uses-redpanda-to-scale-ai-agent-workflows)
+ 2 systems (systems/omninode, systems/redis-streams) + 5 concepts (concepts/topic-name-as-coordination-surface, concepts/silent-wiring-failure, concepts/contract-yaml-as-bus-surface, concepts/regex-plus-enum-validation, concepts/cheapest-capable-model-routing, concepts/routing-receipt) + 3 patterns (patterns/contract-driven-topic-provisioning, patterns/single-extractor-multi-call-site, patterns/auto-escalation-on-quality-failure). Extended: systems/redpanda (single-binary affordability framing in Seen-in), systems/kafka (the API OmniNode codes against — wiki's first canonical "100-topic / 12-repo coordination surface" disclosure), systems/redis (Redis Streams scale ceiling). Caveats: broker-side scale numbers (partition counts, replication factor, multi-AZ topology) undisclosed; contract.yaml schema beyond the bus-surface block undisclosed; reconciliation acknowledged as next gap and unbuilt; routing-receipt storage substrate undisclosed; quality-bar calibration mechanism only sketched.
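The regex-plus-enum validation described in this entry can be sketched as follows. This is a hedged illustration, not OmniNode's code: the regex, the `evt` kind, the `router` producer, and the function names are all hypothetical, since the post's actual pattern and registry contents are not reproduced here. A plain `str`-mixin `Enum` stands in for `StrEnum`, which needs Python 3.11 or later.

```python
import re
from enum import Enum

# Hypothetical regex for the shape onex.{kind}.{producer}.{event}.v{N}.
# Hyphens are deliberately excluded so routing-complete fails the shape check.
TOPIC_RE = re.compile(r"^onex\.[a-z]+\.[a-z0-9_]+\.[a-z0-9_]+\.v[0-9]+$")


class Topic(str, Enum):
    """Stand-in for the canonical registry; the member is an invented example."""
    ROUTING_COMPLETE = "onex.evt.router.routing_complete.v1"


REGISTERED = {t.value for t in Topic}


def topic_problems(name: str) -> list:
    """Return the reasons a topic name is rejected (empty list means valid)."""
    problems = []
    if TOPIC_RE.match(name) is None:
        problems.append("bad shape")
    if name not in REGISTERED:
        problems.append("not in registry")
    return problems


def unmatched_subscriptions(subscribed, published) -> list:
    """Topics someone listens on that nobody publishes: the silent-wiring case."""
    return sorted(set(subscribed) - set(published))


good = topic_problems("onex.evt.router.routing_complete.v1")
bad = topic_problems("onex.evt.router.routing-complete.v1")
silent = unmatched_subscriptions(
    subscribed=["onex.evt.router.routing-complete.v1"],
    published=["onex.evt.router.routing_complete.v1"],
)
```

The broker would accept both spellings as distinct topics, so the check has to run before anything reaches the broker, which matches the entry's point about running one extractor in CI, at boot, and after boot.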

  • 2026-05-27 — Redpanda SQL is GA: the query engine that skips the pipeline — General Availability of Redpanda SQL for Redpanda BYOC AWS customers on consumption-based plans. The productisation of Oxla (the C++ MPP query engine acquired 2025-10-28) into the third pillar of the Redpanda Data Platform — "Streaming, Connect, and SQL". The acquisition → mid-December 2025 preview → 2026-05-27 GA arc is complete; Oxla brand surfaces only in the activation flow. Four GA properties: (1) In-cluster, in-VPC (concepts/in-cluster-streaming-sql / patterns/in-vpc-query-engine-on-streaming-substrate) — runs on the same BYOC infrastructure as the brokers and Iceberg storage, "every query accesses data in-place". Closes the analytical-compute gap in BYOC compliance / no-egress story. (2) Postgres wire protocol (concepts/postgres-wire-protocol-as-streaming-sql-surface) — "It's just Postgres." Connect with psql, DBeaver, DataGrip; no new drivers, no new SDK, no new query language. Same ecosystem-inheritance move Redpanda made on the broker side via Kafka wire protocol. (3) Transparent two-tier query bridge (concepts/two-tier-stream-iceberg-query-bridge / patterns/transparent-hot-cold-tier-query) — single SQL statement reads transparently across the live broker tier and the Iceberg Topics cold-tier Parquet files; engine plans the unified read path. Substrate-dependent on Iceberg Topics' simultaneous-write property. (4) MPP execution: "Massively Parallel Processing" C++ engine, same implementation language as Redpanda Streaming.
Five workload classes: streaming-app debugging (SELECT * FROM orders WHERE status = 'failed' AND timestamp > NOW() - INTERVAL '30 minutes' "in seconds"), real-time operational analytics (fraud / recommendations / leaderboards), ad-hoc analytics, compliance / data-residency-bound queries, and the headline agent-driven query fan-out (concepts/agent-driven-query-fan-out) — "hundreds of queries simultaneously: comparing time windows, validating patterns, exploring hypotheses in parallel." Explicit foil against ksqlDB at the ad-hoc vs predefined axis: "ksqlDB is a handy tool, but it requires you to decide what questions you're going to ask before the events arrive." Activation: three steps, no broker restart from the Redpanda Console cluster overview page. GA scope narrow: AWS BYOC consumption-plan only; GCP BYOC + BYOVPC: "coming soon"; Self-Managed: 2H FY27. Reframes the Redpanda Data Platform from a streaming vendor into a complete data-platform vendor: "One architecture. One operational model. One vendor." — the positioning answer to Confluent's Kora + Flink + Tableflow and to Kafka + ETL + Snowflake. No quantitative numbers (no latency p-values, no throughput benchmarks, no comparison vs Snowflake / Databricks / Trino). Mechanism depth of the two-tier read-path bridge intentionally light; ksqlDB explicitly named as the foil; SQL feature coverage / cross-tier transactionality / DML / DDL semantics not disclosed. Created: 1 source (sources/2026-05-27-redpanda-redpanda-sql-is-ga-the-query-engine-that-skips-the-pipeline)
+ 1 system (systems/redpanda-sql) + 5 concepts (concepts/in-cluster-streaming-sql / concepts/two-tier-stream-iceberg-query-bridge / concepts/postgres-wire-protocol-as-streaming-sql-surface / concepts/agent-driven-query-fan-out / concepts/ad-hoc-vs-predefined-streaming-sql) + 2 patterns (patterns/in-vpc-query-engine-on-streaming-substrate / patterns/transparent-hot-cold-tier-query). Extended: systems/oxla (GA productisation face); systems/redpanda (Redpanda SQL section); systems/redpanda-byoc (Redpanda SQL on BYOC face — extends data-residency from storage to analytical-compute tier); systems/redpanda-iceberg-topics (in-cluster SQL query surface as fourth access path); concepts/zero-etl-operational-analytical (second canonical instance — streaming-broker-substrate variant); concepts/iceberg-topic (SQL-readable face).

  • 2026-05-19 — Cloud Topics: Level Zero garbage collection — Part 1 of 2 (~1,100 words) on the decision mechanism for reclaiming Cloud Topics' L0 objects (the temporary, mixed-partition object-storage files produced by the cross-partition write batch). Sequel to the 2026-03-30 Cloud Topics architecture deep-dive (sources/2026-03-30-redpanda-under-the-hood-redpanda-cloud-topics-architecture) and continuing the scale-test-driven mechanism-disclosure series after the 2026-05-05 Little's Law (sources/2026-05-05-redpanda-littles-law-in-practice-with-cloud-topics) write-pipeline retrofit. Structured as a guided rejection of distributed reference counting ("this framing belies an ocean of complexity") followed by the disclosed mechanism: a coarse-grained logical timestamp (the cluster epoch) embedded in every L0 object ID at creation; per-partition sliding-window epoch tracking in a dedicated replicated state machine in the partition's Raft log ([max_applied_epoch, previous_applied_epoch, min_epoch_lower_bound]); clusterwide safe-to-GC epoch M = min(M(p)) computed lazily from per-partition watermarks piggybacked on Redpanda's existing periodic metadata-dissemination service. Architectural slogan: "No central index, no shared state, and no coordinated updates." Monotonicity makes stale observations always conservative-safe ("once we prove some M is safe, it never becomes unsafe. Every epoch < M is gone forever. Or until int64 rollover") — eventual consistency of M(p) dissemination is enough.
3 new canonical wiki concepts + 3 new patterns: (concepts/cluster-epoch — the named monotonic-counter primitive embedded in object IDs; concepts/epoch-based-distributed-gc — the GC technique itself, structurally distinct from reference counting, with the explicit reference-counting-rejection framing; concepts/sliding-window-epoch-tracking — the per-partition RSM relaxation that decouples window-advance from safe-watermark-advance for leadership-change tolerance) + (patterns/epoch-stamp-on-object-id-for-gc — embed the epoch in the durable identifier so the GC decision is local-only; patterns/per-partition-rsm-for-gc-tracking — embed the GC state machine in the shard's existing Raft log so it inherits durability, fencing, and atomicity-with-write-path for free; patterns/lazy-aggregate-from-monotonic-local-state — compute global min/max aggregates of per-shard monotonic watermarks via existing dissemination, with monotonicity making stale observations always conservative-safe). Extends 2 pages: systems/redpanda-cloud-topics (+full new "L0 garbage collection (2026-05-19)" section with verbatim quotes, the architecture-at-a-glance ASCII diagram, the reference-counting-rejection framing, the three-field RSM table, the cluster-epoch primitive, the lazy-aggregation property, and an explicit list of what Part 2 of 2 will cover; +9 tags; +new source; +9 related links); concepts/garbage-collection (+third canonical instance alongside Magic Pocket blob GC and LSM tombstone GC; +new "Reference counting vs epoch-based — the canonical distributed-GC choice" section with the side-by-side trade-off table; +new source; +9 related links). Canonical load-bearing disclosures: (1) Reference counting explicitly rejected for distributed L0 GC — verbatim "Redpanda does not track L0 objects this way." (2) Cluster epoch as load-bearing primitive — verbatim "The cluster epoch is a monotonically increasing counter that we embed in every L0 object ID at creation time. 
Since the epoch is updated periodically and only ever increases, any given epoch E must eventually age out of the cluster. Once we have reconciled every object created in epoch E, it stands to reason that any L0 object with that epoch can be safely deleted." (3) Per-partition M(p) aggregated to M — verbatim "Given partitions P={p0,…pn} and safe epochs Ms = min(M(p) over p in P), it follows that M = min(Ms) is safe to GC by epoch monotonicity and the definition of min." (4) Single-epoch tracking rejected on leadership-change failure mode — verbatim "If partition leadership moves to a node with a stale epoch cache, we'll reject every new write until cache expiry, which could be minutes away. Not ideal." (5) Sliding window in dedicated RSM — verbatim "Each Cloud Topic partition maintains this sliding epoch window through a dedicated replicated state machine embedded in the partition's Raft log." (6) Window slide ≠ safe-watermark advance — verbatim "the window itself slides forward as soon as a new epoch appears, [but] we only advance the safe epoch once we're sure all the L0 data up to that point has been reconciled into L1." (7) Piggyback existing metadata dissemination — verbatim "we can piggyback this information on an existing periodic metadata-dissemination service internal to Redpanda." (8) Monotonicity gives stale-data tolerance for free — verbatim "If a node is temporarily operating on stale metadata, that's fine. A nice side effect of epoch monotonicity is that once we prove some M is safe, it never becomes unsafe. Every epoch < M is gone forever. Or until int64 rollover." (9) Architectural slogan — verbatim "No central index, no shared state, and no coordinated updates." (10) Part 2 forward-reference: "Stay tuned for part 2, where we discuss how the garbage collector's design enables us to continually delete thousands of L0 objects without any locally persistent state, explicit coordination, or wasted work."
Cross-source continuity: third post in the Cloud-Topics-mechanism-disclosure series — 2026-03-30 architecture deep-dive (write/read paths) → 2026-05-05 Little's-Law write-pipeline retrofit → 2026-05-19 L0 GC decision mechanism. Together they form the most complete public mechanism description of Cloud Topics on the wiki.
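The epoch-based GC decision summarised in this entry reduces to a minimum over per-partition watermarks plus a local comparison against the epoch stamped into each object ID. A minimal Python sketch follows. The object-ID format and all function names are hypothetical, and deleting only epochs strictly below M is the conservative reading of "Every epoch < M is gone forever", taken here as an assumption.

```python
from typing import Dict, Iterable, List


def make_object_id(epoch: int, suffix: str) -> str:
    """Hypothetical L0 object ID with the cluster epoch embedded as a prefix."""
    return f"{epoch:020d}-{suffix}"


def epoch_of(object_id: str) -> int:
    """Recover the epoch from the ID alone; no index lookup is needed."""
    return int(object_id.split("-", 1)[0])


def cluster_safe_epoch(per_partition_safe: Dict[str, int]) -> int:
    """M = min over partitions of M(p), the clusterwide safe-to-GC epoch."""
    return min(per_partition_safe.values())


def deletable(object_ids: Iterable[str], safe_epoch: int) -> List[str]:
    """Assumption: only objects with epoch strictly below M are reclaimed."""
    return [oid for oid in object_ids if epoch_of(oid) < safe_epoch]


objects = [make_object_id(e, f"obj{e}") for e in (3, 4, 5, 6)]

fresh_view = {"p0": 7, "p1": 5, "p2": 9}  # up-to-date M(p) values
stale_view = {"p0": 7, "p1": 4, "p2": 9}  # p1's watermark has not propagated yet

m_fresh = cluster_safe_epoch(fresh_view)
m_stale = cluster_safe_epoch(stale_view)
gone_fresh = deletable(objects, m_fresh)
gone_stale = deletable(objects, m_stale)
```

Because each M(p) only ever increases, a stale view can only yield a smaller M, so a node acting on old metadata deletes a subset of what a fresh view would allow and never more.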

  • 2026-05-05 — Little's Law in practice with Cloud Topics — Cloud Topics performance-tuning retrospective (~750 words) and the first wiki source ingest where Little's Law appears as the load-bearing architectural lens for a fix, not as a side-note diagnostic. Sequel to the 2026-03-30 Cloud Topics architecture deep-dive (sources/2026-03-30-redpanda-under-the-hood-redpanda-cloud-topics-architecture): where the architecture post described the L0-file / placeholder-batch / Reconciler primitives, this post discloses a write-pipeline issue that surfaced during internal benchmarking. Substituting object-storage upload (~1 s worst-case) for the prior NVMe-Raft replication phase (~10 ms) was a 100× latency multiplier; per-connection throughput collapsed from ~100 RPS to ~1 RPS by Little's Law. The fix: a single extra queueing stage at the upload phase, sized to provide in-flight concurrency for the slow stage, with producer-order restoration after the queue before metadata replication. Validated on OpenMessaging Benchmark at GB/s scale "without needing to change any producer configurations."
3 new canonical wiki concepts + 2 new patterns: (concepts/littles-law — first wiki canonical page for the foundational queueing-theory law in its application-friendly form Throughput = Concurrency / Latency (worded in the post as "Throughput = Latency * Concurrency"), promoting it from passing references in concepts/queue-length-vs-wait-time and concepts/latency-rises-before-throughput-ceiling to a first-class wiki concept; concepts/storage-bottleneck-migration — the meta-observation that I/O bottlenecks migrate predictably across storage hardware generations (HDD → NVMe → object storage) and each migration invalidates the design assumptions of systems built for the previous era — Redpanda's verbatim "Just as older systems had to confront the introduction of high-performance storage, Redpanda isn't immune either"; concepts/queue-depth-as-latency-hiding-mechanism — the specialisation of Little's Law in which a queue is added to a pipeline stage not to absorb temporary arrival-rate bursts but specifically to inflate the in-flight concurrency at a high-latency stage, distinct from burstiness-absorbing queues on the sizing-discipline + operational-signal axes) + (patterns/concurrency-buffer-stage-for-high-latency-io — the named architectural pattern: insert a queueing stage immediately upstream of a high-latency I/O step, sized via Little's Law for in-flight concurrency, with order restoration after the slow stage; broker-internal so client API contracts are preserved; patterns/pipelined-produce-with-position-guarantee — the pre-existing per-connection pipelining technique disclosed in this post as Redpanda's "early-design" technique that the upload-queue retrofit composes on top of: release the next request after position is guaranteed but before completion, preserving Kafka idempotent-producer ordering via sequence-numbered handoffs).
Extends 4 pages: systems/redpanda-cloud-topics (+full new "Little's Law write-path retrofit" section with verbatim quotes, the updated write-path diagram showing the upload concurrency-buffer queue + order-restoration step, and the candid retrospective lesson; +9 tags; +6 related links); concepts/queue-length-vs-wait-time (cross-link to the new concepts/littles-law canonical page + new Seen-in entry for the latency-hiding-queue altitude application of the wait-time-vs-length axis — Cloud Topics' upload queue is sized for T × W so steady-state queue length is expected, making wait-time the only signal that distinguishes "running as designed" from "saturation, applying backpressure"); concepts/batching-latency-tradeoff (new Seen-in entry contrasting batching as fixed-cost amortisation vs the upload queue as variable-latency hiding — orthogonal cost-axis responses both used in Cloud Topics); systems/redpanda (sources list extended). Canonical load-bearing disclosures: (1) First wiki application of Little's Law as the load-bearing architectural lens, stated in the post's engineering shorthand (dimensionally, Throughput = Concurrency / Latency, which is what the 10 ms → 100 RPS example in (2) computes) — verbatim "the latency of the upload phase could be up to 100x slower than the replication phase. This makes it easy to feel the implication of Little's Law, which equates to Throughput = Latency * Concurrency." (2) Per-connection throughput ceiling worked example: pre-Cloud-Topics "if replication takes, for example, 10ms, then the system can only process 100 requests per second per connection" — the 100 RPS-per-connection figure is the per-connection ceiling Little's Law sets when concurrency-per-connection is 1. (3) Pipelined-produce-with-position-guarantee disclosure as Redpanda's early-design technique: "after a request's position in the pipeline has been guaranteed, all of its dependencies have been resolved, and the next queued request can be processed before previous requests have been replicated. This allows pipelined processing of produce requests and was a significant improvement over the early design."
(4) Two-stage Cloud Topics write storage reaffirmation: object-storage upload phase + metadata replication phase, with the upload queue inserted "during the upload phase of the write path, before requests are processed by the replication layer." (5) Order-preservation discipline: "Once the data is uploaded, we preserve the ordering from the producer and release it into the replication layer […] We still hold the producer acknowledgment until the metadata is fully replicated across the cluster" — Kafka idempotent-producer + acks=all contracts preserved by the post-queue order-restoration + producer-ack-on-metadata-durable discipline, so the broker absorbs the substrate change without client-side knob churn. (6) OMB-validated GB/s outcome: "Running OpenMessaging Benchmark with this change was the key to unlocking throughput, and allowed us to easily push through to the GB/s scale we were targeting without needing to change any producer configurations." No client-side configuration changes — the substrate change is fully broker-internal. (7) Candid retrospective on testing discipline: "we had been so focused on building functionality that we hadn't been focused on pushing real-world workloads through the system. Thanks to the entire Cloud Topics team, our carefully thought-out implementation readily accepted the fixes we needed, and what could have been a serious architectural oversight became an insightful process." This is a candid post-hoc disclosure that the initial Cloud Topics implementation described in the 2026-03-30 architecture post would have been throughput-pinned without the upload-queue retrofit, and only an OMB run discovered it before GA. Cross-source continuity: Direct sequel to the 2026-03-30 architecture deep-dive — the 2026-03-30 post described the substrate primitives, this post discloses the production-tuning fix that made them deliver target throughput.
Together they constitute the first complete public mechanism description of how Cloud Topics handles the produce path. Sibling to the [[sources/2026-04-21-redpanda-me-and-my-shadow-link-disaster-recovery-replication-made-easy|2026-04-21 Shadow Linking deep-dive]] in tone — both are short, candid, performance-engineering retrospectives that disclose scale-test-driven fixes (Shadowing's 2.5 GiB/s validation; Cloud Topics' GB/s OMB run).
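
The per-connection ceiling in disclosure (2) can be checked with the standard statement of Little's Law, concurrency = throughput × latency, which rearranges to throughput = concurrency / latency. The sketch below is only that arithmetic. The function names are illustrative, and the 1 s upload latency is an assumption obtained by applying the quoted "up to 100x" factor to the quoted 10 ms example.

```python
def max_throughput(concurrency: float, latency_s: float) -> float:
    """Little's Law rearranged: throughput = concurrency / latency."""
    return concurrency / latency_s


def required_concurrency(target_rps: float, latency_s: float) -> float:
    """Little's Law: in-flight requests needed to sustain target_rps."""
    return target_rps * latency_s


replication_latency_s = 0.010                    # the post's 10 ms example
upload_latency_s = replication_latency_s * 100   # assumed: the "up to 100x" worst case

# One in-flight request per connection caps a connection at 100 requests/s.
per_connection_rps = max_throughput(1, replication_latency_s)

# With the slower upload phase on the critical path, the same connection drops to ~1 request/s.
per_connection_rps_upload = max_throughput(1, upload_latency_s)

# In-flight uploads needed to recover the original 100 requests/s.
uploads_in_flight = required_concurrency(per_connection_rps, upload_latency_s)
```

Under these assumptions, about 100 uploads must be in flight to hide the upload latency, which is the role the upload queue plays in the write path described above.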

  • 2026-04-21 — Me and my shadow (link!): Disaster recovery replication made easy — Redpanda unsigned mechanism + performance + reciprocal-architecture deep-dive (~2,500 words) on Shadow Linking — the feature the 25.3 launch post introduced at preview altitude, now walked at mechanism altitude with scale-test numbers, reciprocal-cluster architecture, failover granularity, and link-deletion safety. 5 new canonical wiki pages: 4 concepts + (concepts/parallel-broker-replication-tasks — each broker in the shadow cluster runs replication tasks reading directly from source brokers, inheriting Redpanda's shared-nothing runtime to produce linear throughput scaling with broker count; concepts/replication-lag-message-count — broker-native lag measurement in message count, from which wall-clock RPO is derived via RPO = lag / throughput; concepts/reciprocal-active-passive-clusters — both-clusters-source-and-shadow topology via two parallel shadow links, with each topic still single-writer by construction; concepts/per-topic-granularity-failover — DR primitive is failover(topic, link) and failover(link), matching app-level outage scope; concepts/topic-prefix-namespacing-convention — a_ / b_ prefix encoding origin cluster into topic+consumer-group names)

  • 2 patterns (patterns/reciprocal-active-passive-via-parallel-shadow-links — the two-parallel-shadow-links architecture with topic-prefix discipline, schema-registry primary-site asymmetry, and bidirectional failover; patterns/topic-level-granular-dr-failover — sub-link DR granularity matching app-level-outage scope, composing with patterns/always-be-failing-over-drill discipline to make small-cadence DR drills feasible). Extends 6 pages: systems/redpanda-shadowing (+9 new sections: Mechanism / Performance / Five replication axes / Failover granularity / Link-deletion safety / Reciprocal active-passive / Hardware cost vs MM2 / Observability surface / Failover runbook; +22 tags; +11 related links); concepts/offset-preserving-replication (scale-validation Seen-in entry — 2.5 GiB/s / 4 ms RPO); concepts/broker-internal-cross-cluster-replication (per- broker-task implementation mechanism + three-cluster-vs-two- cluster hardware cost disclosure); concepts/rpo-rto (new SLA-vs-measured-case section — the 25.3 SLA is "measured in a few seconds"; this post's scale-test gives ~4 ms average — two orders of magnitude better); concepts/mirrormaker2-async-replication (two new sections: Three-cluster hardware cost at 1 GiB/s worked comparison, Duplicate-message risk as the MM2 fidelity gap from consume-and-reproduce restart semantics); patterns/hot-standby-cluster-for-dr (new Seen-in entry canonicalising per-topic failover + reciprocal active-passive as refinements); patterns/offset-preserving-async-cross-region-replication (scale-validation Seen-in entry + two extension-pattern pointers); systems/redpanda (Shadowing bullet extended with 2026-04-21 mechanism-deep-dive pointer). 
Canonical load-bearing disclosures: (1) 2.5 GiB/s / 2.5 M msg/s / <10k msg total-cluster lag → ~4 ms effective RPO average — first per-feature Shadow Linking scale-test number, two orders of magnitude better than the 25.3 "few seconds" SLA; (2) three-cluster hardware cost for MM2 (source + sink + Kafka Connect cluster) vs Shadow Linking's two-cluster shape, made explicit at 1 GiB/s scale with the "no additional hardware is needed" framing; (3) per-topic failover primitivefailover(topic, link) matches app-level outage scope while failover(link) matches region-level, turning what was one DR tool into two; (4) reciprocal active-passive architecture via two parallel shadow links with a_ / b_ topic prefix convention encoding origin cluster into the name itself; (5) schema-registry primary-site asymmetry — the one constraint in the reciprocal topology because both sites would write to the same _schemas topic; (6) link-deletion safety invariant"You can only delete a shadow link once all of the flows are failed over and there are no active replication flows" — a guardrail against operator-error cleanup races; (7) source cluster is unaware of the link — config lives entirely on the shadow side, unilateral DR setup; (8) shadow topics are read-only to regular producers until failover — broker- enforced split-brain prevention; (9) MM2 duplicate-message fidelity gap — MM2 re-produces-after-restart can introduce duplicates at new destination offsets; broker-internal offset- preserving replication does not have this shape; (10) three- surface observability — Prometheus metrics, Redpanda Console GUI, rpk + REST APIs. Cross-source continuity: direct mechanism + performance sequel to the 25.3 launch post — the 25.3 is the what + why, this is the how + how fast + and also for active-active; paired reading. Tier-3 pass — substrate-mechanism content with genuine novel scale + architecture disclosures, not launch PR. 
Architecture density ~60-70% of the body; the closing CTA paragraph (demo / docs / Slack) is restrained (~one paragraph). Caveats: unsigned 2.5 GiB/s benchmark (single data point, no hardware/region/cluster-size disclosure); no walkthrough of the replication-task internals (scheduling model, per-task-per- partition mapping, write-into-log-layer mechanism all deferred to docs); reciprocal active-passive is not active-active multi-writer (each topic still has a single writer — the owning cluster); "Not all properties are replicated" for topic configs with excluded set not named in the body; schemas are off by default (footgun if forgotten at link creation). URL rule compliance: raw url: field is verbatim https://www.redpanda.com/blog/shadow-linking-disaster-recovery-replication-made-easy (filename slug truncated with hash suffix); source page url:
and the body ## Source section both use the verbatim URL; the raw-markdown link uses the actual hash-suffixed filename.
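
The ~4 ms effective RPO in disclosure (1) follows from the RPO = lag / throughput relation given for concepts/replication-lag-message-count. A minimal sketch of that derivation, using the scale-test figures quoted above (lag taken at its 10k-message upper bound). The function name is illustrative.

```python
def rpo_seconds(lag_messages: float, throughput_msgs_per_s: float) -> float:
    """Wall-clock RPO derived from message-count lag: RPO = lag / throughput."""
    if throughput_msgs_per_s <= 0:
        raise ValueError("throughput must be positive")
    return lag_messages / throughput_msgs_per_s


# Scale-test figures from the post: <10k messages of lag at 2.5 M msg/s.
rpo_ms = rpo_seconds(10_000, 2_500_000) * 1000   # roughly 4 ms
```

Because the lag figure is an upper bound, the computed value is a ceiling on the effective RPO for that test, not a measured average.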

  • 2026-04-14 — Openclaw is not for enterprise scaleRedpanda unsigned rhetorical-voice governance essay (~1,200 words) arguing that Claude-Code-class local coding agents ("Openclaw" category stand-in) work for personal dev laptops but fail at enterprise scale because the sandbox doesn't solve the underlying credential-holding, audit, and egress-control problems. Opens with a HackerNews comment re-framing the sandbox-for-agents problem as "giving your dog a stack of important documents, then being worried he might eat them, so you put the dog in a crate, together with the documents" — a memorable framing the post carries through as its architectural thesis. Load-bearing canonicalisation: the closing formula Gateway + Audit trail + Token vault + Sandboxed compute = Agents in production as the minimum architectural bar for enterprise agent deployment. Each component solves a failure mode the others can't: (1) Gateway ( central proxy choke point) — single choke point for agent egress, observability, rate limits, kill switch — "turn it off for a single service or set of services for your entire digital workforce at once". (2) Audit log + transcripts"why and how, not just what", with "inputs, outputs, tool calls, token usage, and the agent's reasoning chain" captured; adds agentic performance review as a new use case for the durable event log audit envelope. (3) Token vault (new canonical concept) — out-of-band credential broker that mints short-lived scoped tokens per operation. The agent never holds the credentials; "Don't give the dog your keys." Canonical OBO substrate for user-auth-only systems (Salesforce, ServiceNow) — "You can't build a real multi-tenant agent without this." (4) Sandboxed compute with gateway-only egress (new canonical pattern) — sandboxes are "right" (LLMs need Unix composability for tool-output post-processing) provided egress is choke-pointed at the gateway and auth comes from out-of-band agent-identity metadata, not files inside the sandbox. 
Redpanda-specific mechanism: agi CLI (new canonical system) — "agentic gateway interface", a dynamic self-describing CLI inside the sandbox that mediates agent→gateway calls while preserving Unix-workflow composability. "Yes, the name is a play on that AGI." First wiki mention; demonstration-altitude, not shipping product. Threat-model-at-scale argument: "If you're a developer running it on a dedicated machine with limited access and scope, the threat model is manageable [...] The problem shows up when organizations try to scale that model. When the IT team decides 'just run it in a VM' for each department. When someone decides the sandbox is sufficient governance for production use. It isn't." Canonicalises sandbox-adequate-for-personal-use-breaks-at-enterprise-scale as the structural argument for the four-component stack. 4 new canonical pages: concepts/token-vault + patterns/four-component-agent-production-stack + patterns/agent-sandbox-with-gateway-only-egress + systems/redpanda-agi-cli. Extends 6 pages: patterns/central-proxy-choke-point (kill-switch added as canonical choke-point capability; agent-workforce-scale instance added); patterns/agentic-access-control ("Don't give the dog your keys" framing + token-vault substrate reinforcement); patterns/on-behalf-of-agent-authorization (token-vault named as OBO substrate for user-auth-only systems); patterns/durable-event-log-as-agent-audit-envelope (transcripts + A/B agent evaluation as new use cases); concepts/audit-trail (transcripts + reasoning-chain as why-and-how audit shape); concepts/short-lived-credential-auth (per-operation minting canonicalised via token-vault). Tier-3 borderline include on pattern-crystallisation + new-system grounds — zero production numbers, zero mechanism depth on the four components, but crystallises prior governance patterns into a quotable architectural formula and introduces the agi CLI as a distinct system. 
Cross-source continuity: sequel to 2025-10-28 ADP launch + companion governance-framing post; safety-side companion to 2025-04-03 Gallego autonomy essay; sibling to 2026-02-10 Akidau talk-recap (four-component stack compresses six of Akidau's eight axes). Caveats: rhetorical-voice essay not architecture deep-dive; "Openclaw" is a product-family stand-in (not a real product, myclaw.ai is a rhetorical placeholder); token-vault protocol / software not named; agi CLI is a "demonstration", no repo / license / availability; kill-switch trigger UX not walked; sandbox escape + prompt injection explicitly out of scope.

  • 2026-04-09 — Oracle CDC now available in Redpanda Connect — Redpanda unsigned launch post (~900 words) announcing the oracledb_cdc input in Redpanda Connect v4.83.0 (enterprise-gated). Adds Oracle as the sixth source-database engine in Redpanda's per-engine CDC family (Postgres / MySQL / MongoDB / Spanner / MSSQL / Oracle). Four load-bearing architectural disclosures: (1) rides on Oracle LogMiner — the Oracle Enterprise Edition redo-log-mining utility — canonicalised as concepts/oracle-logminer-cdc, sibling to Postgres logical replication / MySQL binlog / MongoDB oplog / Spanner change streams / SQL Server change tables. No additional Oracle licensing required beyond Enterprise Edition. (2) In-source checkpointing — "Restarts resume from a checkpointed position stored in Oracle itself, with no external cache required, no re-snapshot, and no gaps." Fourth canonical offset-durability class on the wiki alongside server-owned Postgres slots, consumer-owned external stores (MySQL / MongoDB), and transactional-row storage (Spanner). Oracle and Spanner both live inside the source DB but differ on atomicity with data — Spanner's progress commits transactionally with each row, Oracle's lives in a separate checkpoint table. (3) Precision-aware NUMBER mapping via Oracle's ALL_TAB_COLUMNS data-dictionary view — integers from NUMBER(p, 0) → int64, decimals from NUMBER(p, s) with s > 0 → json.Number. Composed with schema_registry_encode for typed Avro encoding in Schema Registry. Automatic mid-stream schema-drift detection: new columns detected automatically; dropped columns reflected after connector restart. Canonical seventh schema-evolution axis on the wiki (contrast the 2026-03-05 Iceberg-output registry-less axis — this one is registry-with-data-dictionary-as-source-of-truth). (4) Oracle Wallet auth — canonical first wiki instance of file-based credential store. 
Two wallet formats: cwallet.sso (auto-login, no password) and ewallet.p12 (PKCS#12, password via wallet_password config field which is redacted from logs and config dumps). SSL enabled automatically. Second canonical instance of Bloblang-interpolated multi-table routing — now at the CDC-source-to-topic-per-table position (first instance was the 2026-03-05 Iceberg-output sink-side). topic: ${! meta("table_name").lowercase() }. Competitive framing against Debezium on Kafka Connect verbatim: "No JVM, no Kafka Connect cluster, no separate workers. Just Redpanda Connect doing what it does best." 8 canonical new pages: source + 4 systems (redpanda-connect-oracle-cdc, oracle-database, oracle-logminer, oracle-wallet) + 4 concepts (oracle-logminer-cdc, in-source- cdc-checkpointing, precision-aware-type-mapping, file-based- credential-store). Extends 7 pages: concepts/change-data-capture (sixth engine + fourth offset- durability class), concepts/external-offset-store (fourth row added in comparison table), concepts/schema-evolution (seventh axis), patterns/cdc-driver-ecosystem (ecosystem now six engines), patterns/bloblang-interpolated-multi-table-routing (second instance at CDC-source position), systems/redpanda-connect (new Oracle CDC section + Seen-in), systems/debezium (named competitive foil). Tier-3 borderline include on vocabulary-canonicalisation grounds — fills gaps the prior five-engine CDC ingests left open. Zero production numbers (no throughput / latency / snapshot-duration figures; contrast 2025-11-06 MSSQL launch which disclosed ~40 MB/s vs ~14.5 MB/s). 
Undisclosed: LogMiner operational caveats (supplemental logging, archive-log rate, continuous-mining deprecation, primary-overhead); snapshot-boundary SCN mechanism; checkpoint- table name/schema/write-cadence; Oracle topology scope (RAC, Data Guard, Multitenant, Standard Edition); parallel-snapshot- of-large-table claim absent (vs 2025-03-18 Postgres + MongoDB differentiator); LOB / LONG / XMLTYPE / JSON-column handling; UPDATE/DELETE before-after-image semantics. Cross-source continuity: sixth-engine extension of the 2025-03-18 CDC connectors post + 2025-11-06 25.3 MSSQL launch; auth/compliance-substrate companion to the 2025-05-20 FIPS post and 2026-03-05 Iceberg-output OAuth2 canonicalisation.
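
The precision-aware NUMBER mapping and the per-table topic routing described in this entry can be sketched in a few lines. This is an illustration of the two rules as stated in the entry, not the connector's implementation. The function names are hypothetical, and columns with an unspecified or negative scale are rejected because the post does not say how they are handled.

```python
from typing import Optional


def map_oracle_number(precision: Optional[int], scale: Optional[int]) -> str:
    """Pick a target type for NUMBER(p, s) using the rule stated in the post.

    precision is carried for completeness; the stated rule depends only on scale.
    """
    if scale is None or scale < 0:
        # Assumption: these cases are not covered by the post, so refuse to guess.
        raise ValueError("mapping for this NUMBER shape is not described in the post")
    return "int64" if scale == 0 else "json.Number"


def topic_for(table_name: str) -> str:
    """Python analogue of the Bloblang route: ${! meta("table_name").lowercase() }."""
    return table_name.lower()


order_id_type = map_oracle_number(10, 0)   # integer column
amount_type = map_oracle_number(12, 2)     # decimal column
orders_topic = topic_for("ORDERS")
```

Keeping decimals as json.Number rather than a float avoids silently losing precision before the value reaches the typed Avro encoding step.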

  • 2026-04-02 — Supercharging Redpanda Streaming with profile-guided optimizationRedpanda engineering deep-dive (unsigned). Mechanism-level companion to the 2026-03-31 Redpanda 26.1 launch post's one-line PGO disclosure ("Profile-Guided Optimization (PGO) delivers 10-15% efficiency improvement on small message batches"). Unpacks the clang PGO two-phase compilation and LLVM BOLT post-link alternative (systems/llvm-bolt), framed by top-down microarchitecture analysis (TMA) via Linux perf stat --topdown. Measured wins on the canonical small-batch regression benchmark: ~50% p50 latency, up to 47% p999 latency, 15% CPU reactor utilisation reduction. TMA data verbatim: baseline 51% frontend-bound ("definitely on the higher end, even for database or distributed applications") reduced to 37.9% after PGO — with 6 percentage points moving to retiring (useful work) and 7 to backend-bound ("resolving one bottleneck often reveals the next"). PGO mechanisms: hot-cold splitting + basic-block reordering + profile- driven inlining, all targeting [[concepts/instruction-cache- locality|i-cache locality]]. BOLT heatmap visualisation confirms the hot-code-packed layout ("all hot functions are packed tightly at the start of the binary"). PGO vs BOLT trade-off: Redpanda evaluated both and chose PGO citing stability — "PGO is a proven and widely deployed technology ... outstanding BOLT bugs, we decided to stick with PGO." Disclosed LLVM bug llvm-project#169899 as the decisive datum — first wiki-canonical non-Meta BOLT brittleness disclosure, contrasting Meta's fleet-scale success via Strobelight → BOLT + CSSPGO. BOLT performance "similar to PGO. Most of the time, it came in just slightly behind"; combining both adds "another small bump in performance". Substrate: feedback-directed optimisation (FDO) family — canonicalised as umbrella. Instrumented vs sampling profile trade-off canonicalised. Composes with batching-under-saturation to explain the 15%-CPU → 47%-p999 amplification. 
Tier-3 on-scope decisively — unusual for Redpanda's launch-/ marketing-heavy Tier-3 corpus; genuine engineering deep-dive with microarchitecture rigor, hardware-counter before/after data, and an explicit PGO-vs-BOLT trade-off analysis that discloses a concrete LLVM bug. Cross-source continuity: mechanism-level companion to the 2026-03-31 26.1 launch post's one-line PGO bullet; extends BOLT coverage from Meta's fleet success (2025-03-07 Strobelight post) to the non-Meta brittleness perspective; sibling to patterns/measurement-driven-micro-optimization at the C++ binary-layout altitude (JVM / JDK-Vector-API sibling at the Java-vectorisation altitude). 9 new canonical pages (source + 6 concepts [PGO, LLVM BOLT post-link optimiser, TMA, frontend- vs-backend-bound, hot-cold splitting, instrumented vs sampling profile, i-cache locality, feedback-directed optimisation] + 3 patterns [PGO for frontend-bound application, TMA-guided target selection, feedback-directed optimisation fleet pipeline] + 2 systems [LLVM BOLT, Clang]) + 2 extensions (systems/meta-bolt-binary-optimizer with non-Meta brittleness disclosure; systems/redpanda with new 26.1 PGO section).

  • 2026-03-31 — Redpanda 26.1 delivers the industry's first adaptable streaming engineRedpanda unsigned product-launch post for Redpanda 26.1 / Redpanda One (R1). Headline: General Availability of Cloud Topics (previewed beta in 25.3). Framed as "the industry's first adaptable streaming engine: a single, multi-modal platform that lets you toggle between single-digit-millisecond performance and massive-scale efficiency at the topic level." Load-bearing canonicalisations (7 new pages): (1) pass-through write to object store — the architectural primitive behind Cloud Topics (created 2026-04-24; source cited here). (2) Group-Based Access Control (GBAC) — IdP-group-mapped authorization replacing per-user ACL micromanagement. "Instead of micromanaging permissions for every single 'Bob' and 'Alice'… you can now map roles or permissions to groups provided centrally by an Identity Provider (IdP)." (3) Ranked rack preferences for Leader Pinning — turns leader-pinning from single- location-hint into ordered-fallback-list, "deterministically list which regions and AZs should host your partition leaders. It turns leader placement from a game of chance into a strategic advantage." Extends leader pinning (canonicalised in 2025-02-11 stretch-cluster post). (4) Schema Registry contexts"namespace your schemas, making it easy to isolate environments, perform complex migrations, and manage multi- team registries." (5) Ghost- node ejection — automatic cleanup of stale cluster-state references to nodes that have left, cluster-membership-altitude analogue of concepts/explicit-teardown-on-completion. (6) Diskless / disk-lite hybrid streaming — Redpanda's coined architectural vocabulary explicitly positioned against WarpStream-class fully-diskless shapes: "diskless isn't riskless. By moving everything off of local disks in favor of object storage and external databases, these systems sacrifice data integrity, availability, and core Kafka features (like transactions) on the altar of cost." 
(7) IdP-group-mapped authorization + Cross-region read via object storage — the new AWS-only Cross-Region Remote Read Replicas extending Remote Read Replica cross-region. "By accessing topics through S3 instead of the network, you can serve reads in AWS regions worldwide without putting any workload on the production cluster, at a much lower cost than using a 'stretch' cluster." Key cost-figure disclosure: "You get over 90% lower networking costs and the economics of a diskless system, while keeping the battle-hardened reliability of a local-disk broker… No broken transactions. No metadata lag. No external control plane dependencies. Just efficient, Raft-consistent streaming." The 90% figure extends the unquantified 25.3 "virtually eliminates" framing to a vendor-disclosed percentage. Global data plane shape — the post pairs ranked rack preferences (write-path) with Cross-Region RRR (read-path) as the stretch-cluster-and-MM2 replacement for global-data-plane shaping: "Your data travels for business, not pleasure. Stop letting it run up expenses on the scenic route and fly direct." Release-note roundup (five additional features): Custom schema metadata"attach arbitrary metadata properties to your schemas, turning Redpanda Schema Registry into a first-class citizen in your data governance and observability stack" (extends concepts/schema-registry from version-tracking to governance substrate); JSON improvements for Iceberg Topics"JSON sub-schemas, nullable fields, and other enhancements for translating complex JSON structures into a clean bronze layer in Apache Iceberg" (third round of Iceberg Topics enhancements after 25.1 GA + 25.3 BigLake); Profile-Guided Optimization at 10-15% small-batch efficiency (pre- canonicalised via 2026-04-02 PGO deep-dive); FIPS 140-3 compliance (extends [[concepts/fips-140-validated-cryptographic-module|140-2 coverage]] from 2025-05-20 FIPS post); automatic ghost-node ejection (ingress listed above). 
Confluent positioning (repeated from 25.3): "Contrast this with Confluent, where you may need a mix of Kora-powered Confluent Cloud clusters (standard/dedicated or Freight) and the separate Confluent WarpStream engine (BYOC) to satisfy different requirements." "One engine, one API, and zero unnecessary clusters. R1, ftw!" Tier-3 borderline include on vocabulary-canonicalisation grounds. Launch-voice throughout ("One engine. Every workload. No more sprawl.", "Sacrificing brains for budget", "security for grown-ups"), but six primitives missing from prior wiki coverage + the Cloud Topics GA graduation + concrete 90% cost figure + the disk-lite-vs- diskless differentiation vocabulary passes the test. Would fail in isolation; passes because it rides on already- canonicalised 25.3 Cloud Topics pair + 25.2 leader-pinning corpus + 25.2 FIPS post + 25.1 Iceberg Topics GA. Cross-source continuity: direct sequel to 25.3 preview (beta → GA, plus 90% figure + disk-lite vocabulary + WarpStream-displacement framing); direct architectural sibling to the 2026-03-30 Cloud Topics architecture deep-dive published the day before with the mechanism detail this post compresses into marketing framing; extension of 2025-02-11 stretch clusters (leader pinning → ranked rack preferences; RRR → Cross-Region RRR); extension of 2025-05-20 FIPS (140-2 → 140-3). Caveats: no production numbers beyond the 90% vendor claim; zero mechanism depth on GBAC, ranked rack preferences, Schema Registry contexts, ghost-node ejection (all one-sentence PR framings); Cross-Region RRR is AWS-only at launch; Redpanda One / R1 branding introduced without elaboration on marketing-bundle-vs-release-number relationship. URL compliance: raw url https://www.redpanda.com/blog/26-1-r1-cloud-topics (verbatim from raw file; filename slug redpanda-261-delivers-the-industrys-first-adaptable-streamin is hash-suffix-truncated, not authoritative).

  • 2026-03-30 — Under the hood: Redpanda Cloud Topics architecture — architecture deep-dive on Cloud Topics following its GA in Redpanda Streaming 26.1. First detailed public description of the five primitives that make Cloud Topics work: a Cloud Topics Subsystem that batches in-memory across all partitions/topics ("e.g., 0.25 seconds or 4 MB"), an L0 file uploaded as a single PUT to S3/GCS/ADLS, a placeholder batch replicated via Raft to each involved partition's log carrying only the object-storage pointer, a background Reconciler that rewrites L0 files into L1 files (per-partition, offset-sorted, much larger), and a per-partition Last Reconciled Offset watermark routing reads between L0 and L1. Three new canonical concept pages: concepts/placeholder-batch-metadata-in-raft, concepts/l0-l1-file-compaction-for-object-store-streaming, concepts/last-reconciled-offset. Two new canonical pattern pages: patterns/object-store-batched-write-with-raft-metadata, patterns/background-reconciler-for-read-path-optimization. Architectural canonicalisation: the log-as-truth framing, previously applied at agent-interaction altitude (2025-10-28), is now instantiated inside the broker's own storage architecture — the Raft log of pointers is truth, S3 bytes are addressable cache. Caveats: no absolute latency numbers, no net-cost quantification (eliminated cross-AZ cost replaced by PUT cost + Reconciler egress), no Reconciler placement disclosure, no failure-mode discussion, no cache architecture.
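
A minimal sketch of how the batching thresholds, the placeholder records, and the Last Reconciled Offset routing described in this entry fit together. It is a simplification under stated assumptions: class and field names are hypothetical, one placeholder is emitted per payload rather than per placeholder batch, the byte-range fields are assumed, and treating the watermark as inclusive is a guess the post does not confirm.

```python
from dataclasses import dataclass, field

MAX_BATCH_BYTES = 4 * 1024 * 1024   # "4 MB" threshold quoted in the post
MAX_BATCH_AGE_S = 0.25              # "0.25 seconds" threshold quoted in the post


@dataclass
class Placeholder:
    """Pointer-only record that would be replicated via Raft (fields assumed)."""
    partition: str
    object_key: str
    byte_offset: int
    length: int


@dataclass
class L0Batcher:
    """Accumulates payloads across partitions into one pending L0 file."""
    opened_at: float
    records: list = field(default_factory=list)   # (partition, payload) pairs
    size: int = 0

    def add(self, partition: str, payload: bytes) -> None:
        self.records.append((partition, payload))
        self.size += len(payload)

    def should_flush(self, now: float) -> bool:
        aged = bool(self.records) and (now - self.opened_at) >= MAX_BATCH_AGE_S
        return self.size >= MAX_BATCH_BYTES or aged

    def flush(self, object_key: str):
        """Return the L0 blob (one PUT) and the placeholders pointing into it."""
        blob, placeholders, pos = bytearray(), [], 0
        for partition, payload in self.records:
            placeholders.append(Placeholder(partition, object_key, pos, len(payload)))
            blob += payload
            pos += len(payload)
        self.records, self.size = [], 0
        return bytes(blob), placeholders


def read_tier(offset: int, last_reconciled_offset: int) -> str:
    """Route a read: reconciled offsets come from L1, newer ones from L0."""
    return "L1" if offset <= last_reconciled_offset else "L0"
```

The object key passed to flush is a caller-supplied placeholder name. How Redpanda names L0 objects is not disclosed in the post.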

  • 2026-03-05 — Introducing Iceberg output for Redpanda ConnectRedpanda unsigned launch post (~1,000 words) announcing the iceberg output connector for Redpanda Connect shipped in v4.80.0 (enterprise-gated). A declarative sink that writes streaming data to Apache Iceberg tables from a YAML pipeline via the Iceberg REST Catalog API. Positioned as the non-Kafka-source companion to the pre-existing broker-native Iceberg Topics feature — fills the gap for HTTP webhooks, Postgres CDC, GCP Pub/Sub, and other non-Kafka sources that need in-stream transformation (PII stripping, flattening, type routing) before landing in the lakehouse. Three architectural canonicalisations: (1) concepts/registry-less-schema-evolution — infers table schema from raw JSON; no Schema Registry required; verbatim "best of both worlds" between chained SMT brittleness and all-string dirty-data tables. Adds sixth axis to concepts/schema-evolution. (2) concepts/data-driven-flushing — flush only when data is present; inverts Kafka-Connect-era timer-driven default. Mitigates the concepts/small-file-problem-on-object-storage and quiet-source compute waste. (3) patterns/bloblang-interpolated-multi-table-routingtable and namespace config fields support Bloblang interpolation ('events_${!this.event_type}'). One pipeline → N tables. Canonical inversion of "configuration hell". Plus one new architectural pattern: patterns/sink-connector-as-complement-to-broker-native-integration — explicit two-shape comparison table against Iceberg Topics ("Zero-ETL convenience vs Integration flexibility") — the two paths are complementary, not competing. REST-catalog integration matrix: Polaris, systems/aws-glue, systems/unity-catalog, systems/google-biglake, Snowflake Open Catalog. OAuth2 token exchange + per-tenant REST catalog isolation at 0.1 vCPU per-pipeline density. 
Scope limits (v4.80.0): append-only only (upserts on roadmap — material for CDC UPDATE/DELETE); schema-inference mechanism depth undisclosed; no benchmarks; enterprise-gated license. Tier-3 borderline include as lean ingest on vocabulary-canonicalisation grounds — 4 new concepts (registry-less-schema-evolution, data-driven-flushing, small-file-problem-on-object-storage, bloblang) + 2 new patterns (bloblang-interpolated-multi-table-routing, sink-connector-as-complement-to-broker-native-integration) + 2 new systems (redpanda-connect-iceberg-output, apache-polaris stub) fill definitional gaps. 8 canonical new pages: source + 2 systems + 4 concepts + 2 patterns. Extends 7 pages: systems/redpanda-connect (new Iceberg output section), systems/redpanda-iceberg-topics (new sink-connector-complement Seen-in entry), concepts/schema-evolution (sixth axis entry), concepts/iceberg-catalog-rest-sync (REST catalog as sink-connector integration surface Seen-in entry), patterns/streaming-broker-as-lakehouse-bronze-sink (sink-connector-altitude variant Seen-in entry), patterns/broker-native-iceberg-catalog-registration (sink-connector counterpart Seen-in entry), companies/redpanda (this page). No existing-claim contradictions — strictly additive.

  • 2026-02-10 — How to safely deploy agentic AI in the enterprise — Tyler Akidau talk-recap (Redpanda CTO, originator of Google Dataflow / Apache Beam) from Dragonfly's Modern Data Infrastructure Summit. Marketing-adjacent reprise of the 2025-10-28 ADP launch framing, ~3.5 months later, aimed at lay enterprise-architect audience. Two load-bearing canonicalisations: (1) D&D alignment framing — human workers hired into lawful-good quadrant; AI agents default to the chaotic column ("at best 'chaotic good' — because you don't know what you don't know"); governance + auditing infrastructure is the mechanism that moves agents leftward toward lawful. (2) Eight-axis enterprise-agent-infrastructure checklist — context building + maintenance / context querying / authentication / governance / auditing / replay and validation / routing / multi-agent coordination. Akidau's load-bearing claim: six of eight are streaming problems (context querying + authentication stay outside streaming's remit). Two new canonical patterns: patterns/dynamic-routing-llm-selective-use (use AI where it wins, route to cheaper ML/heuristics otherwise — fraud-detection worked example: ML scans ~99% normal traffic, LLM investigates the ~1% flagged cases) + patterns/multi-agent-streaming-coordination (streaming broker as decoupled coordination substrate for multi-agent systems; inherits decoupled-services + durability + fan-in + fan-out from microservices-over-Kafka lineage). Agent-anatomy-=-streaming-platform-anatomy framing extends concepts/streaming-as-agile-data-platform-backbone to the agent-substrate altitude. Metadata-only-audit-insufficient framing extends patterns/durable-event-log-as-agent-audit-envelope — classical systems audit logs byte-count + timestamp metadata, but agents require full-input + full-output capture to make inferences. Closing honest caveat: "streaming can help solve a lot of agentic AI challenges, it's not your answer for everything. You still need authN/authZ, a multi-modal catalog of contextual data (not just streaming data), querying, and a durable execution for workflows". Tier-3 borderline include on rhetorical-framing + eight-axis-enumeration + two-new-patterns grounds. 5 canonical new pages: source + 2 concepts + 2 patterns. Extends 8 pages: concepts/autonomy-enterprise-agents + concepts/streaming-as-agile-data-platform-backbone + concepts/governed-agent-data-access + patterns/durable-event-log-as-agent-audit-envelope + patterns/cdc-fanout-single-stream-to-many-consumers + patterns/snapshot-replay-agent-evaluation + concepts/audit-trail + systems/redpanda-agentic-data-plane. 
Cross-source continuity: talk-recap companion to the 2025-10-28 ADP launch pair ( Gallego productization + governance-pattern naming); sibling to 2025-06-24 streaming-backbone essay (data-substrate half; this Akidau post extends to agent-substrate half); risk-side dual of Gallego autonomy essay (capability side). First wiki footprint for Akidau as a Redpanda-era talk speaker (prior Akidau work on the wiki is via Dataflow / Beam / MillWheel streaming-model primitives).
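
The dynamic-routing pattern from the fraud-detection example (cheap ML over all traffic, LLM only on the flagged minority) reduces to a threshold split. The sketch below assumes a caller-supplied scoring function and threshold. Both are hypothetical stand-ins, since the talk recap names no model or cut-off.

```python
from typing import Callable, Dict, Iterable, List


def route_events(
    events: Iterable[dict],
    anomaly_score: Callable[[dict], float],
    threshold: float,
) -> Dict[str, List[dict]]:
    """Send only events scoring at or above threshold to the costlier LLM path."""
    routed: Dict[str, List[dict]] = {"ml_only": [], "llm_review": []}
    for event in events:
        lane = "llm_review" if anomaly_score(event) >= threshold else "ml_only"
        routed[lane].append(event)
    return routed


# Toy data: 1 of 100 payments is unusually large. The scorer is a stand-in heuristic.
payments = [{"id": i, "amount": 20.0} for i in range(99)] + [{"id": 99, "amount": 9_500.0}]
routed = route_events(payments, lambda e: e["amount"] / 10_000.0, threshold=0.9)
```

In this toy run, 99 events stay on the cheap path and one is escalated, mirroring the ~99% / ~1% split in the worked example.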
  • 2026-01-27 — Engineering Den: Query manager implementation demo — First post in Redpanda's new Engineering Den series; ~600-word post-acquisition disclosure from the Oxla team on their rewrite of the query manager — "the component responsible for the lifecycle of currently-running queries". Old manager suffered from ambiguous state (queries stuck in finished or executing while still holding resources; different parts of the system disagreed about what was happening) and a pathological cancellation path — canonicalised verbatim as async-cancellation-thread-spawn anti-pattern: "To avoid deadlocks, the old code gathered running queries, spawned async work per thread, and sometimes had to retry cancellation from a different thread entirely." Rebuilt as a deterministic state machine with every transition logged and explicit teardown at terminal states. Verbatim core claim: "The new scheduler is built as a deterministic state machine. At any point, it's in a known state, handling a specific event, and transitioning predictably. Every transition is logged." Composed pattern canonicalised as patterns/state-machine-as-query-lifecycle-manager. Tested on ~25,000 queries across 1- and 3-node clusters without reproducing the prior pathologies; no throughput / latency numbers — reliability-first validation frame. Debuggability payoff verbatim: "Bugs still happened ... but they were much easier to track down. Being able to trace state transitions made fixes straightforward instead of exploratory." — issues "fixed in days instead of weeks". Production rollout "within days" of post. 6 new canonical wiki pages: source + 5 concepts (concepts/deterministic-state-machine-for-lifecycle, concepts/state-transition-logging, concepts/query-lifecycle-manager, concepts/async-cancellation-thread-spawn-antipattern, concepts/explicit-teardown-on-completion) + 1 pattern (patterns/state-machine-as-query-lifecycle-manager). 
Extends systems/oxla with first post-acquisition mechanism disclosure (prior Oxla canonicalisation was acquisition-framing from 2025-10-28 ADP launch). Tier-3 borderline include on first-post-acquisition-Oxla-internals-disclosure grounds + reliability-doctrine-canonicalisation grounds — short engineering-diary voice with no state diagram, no code snippets, no benchmark depth. Caveats: "deterministic" claimed not shown (no TLA+ / model-check); cancellation protocol not fully detailed; 25K-query sample is modest (1- and 3-node clusters only); failure-modes of the new manager not enumerated; series kickoff promises more depth in future Den posts. No existing-claim contradictions — strictly additive on Oxla's wiki page. First canonical wiki use of the "state machine as lifecycle manager" pattern at query-engine altitude; related-but-distinct instances already exist at consensus-request altitude (concepts/two-phase-completion-protocol) and workflow altitude (concepts/fault-tolerant-long-running-workflow).
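
The state-machine shape described in this entry can be sketched as follows. This is a minimal illustration only: the post shows no state diagram or code, so the state names, event names, and the `resources` field are all assumptions, not Oxla's implementation.

```python
import logging
from enum import Enum

log = logging.getLogger("query_manager")


class State(Enum):
    # Hypothetical states; the source post does not enumerate them.
    QUEUED = "queued"
    EXECUTING = "executing"
    CANCELLING = "cancelling"
    FINISHED = "finished"
    CANCELLED = "cancelled"
    FAILED = "failed"


TERMINAL = {State.FINISHED, State.CANCELLED, State.FAILED}

# (current state, event) -> next state. Any pair not listed is rejected,
# so a query can never drift into an ambiguous state.
TRANSITIONS = {
    (State.QUEUED, "start"): State.EXECUTING,
    (State.QUEUED, "cancel"): State.CANCELLED,
    (State.EXECUTING, "complete"): State.FINISHED,
    (State.EXECUTING, "error"): State.FAILED,
    (State.EXECUTING, "cancel"): State.CANCELLING,
    (State.CANCELLING, "stopped"): State.CANCELLED,
    (State.CANCELLING, "error"): State.FAILED,
}


class Query:
    def __init__(self, query_id):
        self.query_id = query_id
        self.state = State.QUEUED
        self.resources = {"memory", "threads"}  # hypothetical held resources
        self.history = [State.QUEUED]

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"illegal event {event!r} in state {self.state.name}")
        nxt = TRANSITIONS[key]
        # Every transition is logged, which is what makes bugs traceable.
        log.info("query=%s %s --%s--> %s", self.query_id, self.state.name, event, nxt.name)
        self.state = nxt
        self.history.append(nxt)
        if nxt in TERMINAL:
            self._teardown()
        return nxt

    def _teardown(self):
        # Explicit teardown at terminal states: nothing stays held afterwards.
        self.resources.clear()
```

Because the transition table is the single source of truth, a query in `FINISHED` that still holds resources is unrepresentable in this sketch, which is the class of bug the entry attributes to the old manager.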
  • 2026-01-13 — The convergence of AI and data streaming, Part 1: The coming brick walls. Peter Corless industry-commentary post (~2,100 words) adapted from the author's AI-by-the-Bay talk. Part 1 of a four-part series; promises Parts 2-4 on adaptive LLM strategies, AI observability/evaluation, and real-time streaming + AI respectively. Names three "brick walls" for frontier AI: (1) ethically-sourced public training data exhaustion (Epoch AI 2024 S-curve thesis; petabyte ceiling vs zettabyte-scale global data production; 180 ZB generated / 200 ZB stored in 2025, CAGR 78%; Yottabyte Era projected 2028-2030); (2) training-cost growth (~260% annually, projected >$1B per frontier model by 2027 per Epoch AI, data-centre energy 2× by 2030 per Nature April 2025); (3) batch-training boundary ("regardless of their dense or MoE architectures, they're still all batch trained"). MoE vs Dense frontier-LLM landscape with concrete parameter-count disclosures: GPT-4 = 8 × 220B (George Hotz 2023 leak), Gemini MoE since 1.5 (Feb 2024), Grok MoE since Grok-1, Anthropic Claude = Dense Transformer holdout. GPT-1 → GPT-5 scaling curve: 117M → ~50T parameters = 5 orders of magnitude in 8 years; GPT-5 400K-token context window. Brick-wall-companion observations: embedding-dimension diminishing returns past 1,536 dims (cites Supabase pgvector); model drift over time verbatim "each answer is a special snowflake, and those snowflakes can melt over time" — cites arXiv 2307.09009 + GPT-5.1 < GPT-5.0 regression on some evals; RLHF as offline batch fine-tuning pipeline (cites arXiv 2307.15217). Names RAG + MCP as the two inference-time real-time-data access mechanisms that do not cross the batch-training boundary. Frames the data scientists vs data engineers organisational silo (cites Jesse Anderson's Data Teams) as the socio-technical pre-requisite to architectural convergence. Running gag: the "d20 test" image-generation prompt as a hallucination-failure-mode evaluation opener — only Gemini 3.0 Thinking passed (inconsistently); ChatGPT 5.x, Midjourney, Meta AI, Grok, Claude, Google Veo all fail. Also cites $1.5T global AI spend in 2025. 7 new canonical wiki pages (source + 6 concepts: concepts/frontier-model-batch-training-boundary, concepts/llm-training-data-exhaustion, concepts/llm-model-drift, concepts/dense-transformer, concepts/rlhf-offline-batch, concepts/s-curve-limits, concepts/embedding-dimension-diminishing-returns, concepts/retrieval-augmented-generation). Extends concepts/mixture-of-experts (new Frontier-LLM MoE landscape section with GPT-4 8×220B + Gemini + Grok + Claude disclosures), concepts/llm-hallucination (new Seen-in for d20-test framing + hallucination-orthogonal-to-scaling claim), systems/transformer (new Seen-in with 117M→50T scaling curve + 400K-token context + MoE/Dense variant landscape), plus this page. Tier-3 borderline include on vocabulary-canonicalisation grounds — industry-commentary voice, no production numbers from shipping Redpanda system, streaming-specific payoff explicitly deferred to Parts 2-4. Passes on the frontier-LLM vocabulary (batch-training boundary + data exhaustion + MoE landscape + model drift + RLHF-as-batch) being genuinely missing from prior wiki coverage; canonicalises framing the wiki will compose subsequent ingests (Parts 2-4) against. Companion to 2025-06-24 streaming-backbone essay — the data-substrate framing; this post is the why frontier models need it framing at industry-altitude. Companion to Gallego 2025-04-03 autonomy essay and the 2025-10-28 ADP launch as the agent-substrate framing.
Caveats: Hearsay primary sources (Hotz-leak GPT-4 numbers, "estimated" GPT-5 parameter counts); Epoch AI projections are interpretive; embedding-dimension ceiling single-sourced to a Supabase post; arXiv 2307.09009 drift magnitude is contested in the literature; private-data ethics transition narrated not structurally analysed; MoE landscape omits Mixtral / DeepSeek / Qwen / Llama-MoE; RLHF mechanism not walked; d20 test is a conversation-opener gag not a rigorous eval. Series Parts 2-4 deferred.

  • 2025-12-09 — Streaming IoT and event data into Snowflake and ClickHouse. Unsigned vendor-tutorial post (~2,400 words) framing a reference IoT pipeline: Redpanda → Redpanda Connect → both Snowflake (short-term real-time) and ClickHouse (long-term columnar archive) simultaneously. Marketing voice with heavy Redpanda product-promotion + how-to config examples, but substantial canonical architectural core on the ClickHouse MergeTree + Snowflake Snowpipe Streaming substrate. Canonical new wiki pages (9): source + 7 concepts (time-partitioned MergeTree + native TTL policies + DETACH PARTITION archival + granule-level min-value skip + append-only tamper resistance + Snowflake MATCH_RECOGNIZE sessionization + hot-cold per-column codec split) + 2 patterns (patterns/time-partitioned-mergetree-for-time-series + patterns/clickhouse-plus-snowflake-dual-storage-tier). Inverted storage-tier framing for compliance-sensitive workloads: Snowflake for streaming access logs + financial triggers (governance matters), ClickHouse for long-term compressed retention (compression wins). Canonical MergeTree schema example (telemetry_events with PARTITION BY toYYYYMM(timestamp) + TTL INTERVAL 12 MONTH DELETE + CODEC(ZSTD) on value column). Specific Snowpipe Streaming batching recommendations (500–1,000 records low-latency, 10,000+ bulk, 1,000-at-most for time-series; byte_size: 0; period 10–30 s for real-time dashboards vs 1–5 min for less frequent). schema_evolution off-as-performance-optimisation framing inverts the default "always turn on" recommendation — canonicalised on concepts/schema-evolution as the fifth evolution axis. MATCH_RECOGNIZE worked example for ≤ 10-second same-IP click sessionization. Redpanda Connect gap disclosure: no dedicated ClickHouse output connector; use generic sql_raw / sql_insert processors — contrasts with first-class snowflake_streaming. Broker vs multiplexing named as the two fan-out primitives for the dual-tier pattern. 9 new canonical pages + 8 extensions. Tier-3 borderline include on architectural-density grounds (mergetree internals + codec tiering + MATCH_RECOGNIZE are load-bearing despite marketing voice). Companion to 2025-10-02 Snowpipe Streaming benchmark — that post canonicalised the benchmark; this post canonicalises the batch-tuning guidance and the dual-tier architecture that composes it with ClickHouse.
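
The "≤ 10-second same-IP click sessionization" rule mentioned in this entry can be illustrated outside SQL. The sketch below is not Snowflake's MATCH_RECOGNIZE syntax and is not taken from the post; it assumes the rule means "consecutive clicks from the same IP belong to one session while the gap between them is at most 10 seconds", and the `(ip, timestamp)` input shape is hypothetical.

```python
from collections import defaultdict


def sessionize(clicks, max_gap_s=10):
    """Group clicks into per-IP sessions.

    clicks: iterable of (ip, ts_seconds) pairs in any order.
    Returns a list of (ip, [ts, ...]) sessions, ordered by IP then time.
    """
    by_ip = defaultdict(list)
    for ip, ts in clicks:
        by_ip[ip].append(ts)

    sessions = []
    for ip in sorted(by_ip):
        current = []
        for ts in sorted(by_ip[ip]):
            # A gap larger than max_gap_s closes the running session.
            if current and ts - current[-1] > max_gap_s:
                sessions.append((ip, current))
                current = []
            current.append(ts)
        if current:
            sessions.append((ip, current))
    return sessions
```

For example, `sessionize([("a", 0), ("a", 5), ("a", 20), ("b", 3)])` splits the clicks from `"a"` into two sessions because the 15-second gap exceeds the threshold.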

  • 2025-12-02 — Operationalize Redpanda Connect with GitOps. Tutorial-voice unsigned post (~2,000 words) canonicalising the end-to-end Argo CD + Helm + Kustomize deployment shape for Redpanda Connect on Kubernetes. Walks through both deployment modes side by side: Standalone (single pipeline, config baked into Helm values, deployed via Argo CD multi-source Application with chart from charts.redpanda.com pinned at targetRevision: 3.1.0 + values from customer's repo) + Streams (multiple pipelines from Kubernetes ConfigMaps, deployed via Kustomize wrapping the Helm chart: configMapGenerator for hashed ConfigMap names + helmCharts for chart inflation; kustomize.buildOptions: --enable-helm --load-restrictor LoadRestrictionsNone as Argo CD precondition). Streams-mode REST API (/version, /ready, /streams, /metrics) canonicalises the runtime-API vs GitOps source-of-truth anti-pattern — GitOps-compatible "as long as it's used by automation that derives its desired state from Git", anti-pattern "only when humans or external systems modify pipelines through the API without updating Git." Every production operation expressed as a Git commit: scaling (replicaCount: 1 → 3), adding pipelines (new files in config/), updating pipelines (edit YAML → Kustomize produces new hash → rolling restart via ConfigMap hash rollout), decommissioning (scale to zero or argocd app delete). Observability deployed as parallel Argo CD Application — kube-prometheus-stack (Prometheus + Alertmanager + Grafana + K8s dashboards) + Prometheus service monitor + Redpanda Connect Grafana dashboard — Redpanda Connect exposes Prometheus-compatible metrics natively "without custom exporters or sidecars". Closing product-roadmap laundry list (automatic linting + policy / compliance checks + developer portal + external secrets + template catalog + resource limits + multi-cluster) signals what Redpanda believes a mature Redpanda-Connect GitOps platform needs.
Companion GitHub repo: redpanda-data-blog/redpanda-connect-the-gitops-way. 4 canonical new wiki pages: 3 concepts (concepts/standalone-vs-streams-mode, concepts/configmap-hash-rollout, concepts/runtime-api-vs-gitops-source-of-truth) + 2 patterns (patterns/argocd-multi-source-helm-plus-values, patterns/kustomize-wraps-helm-chart) + 1 system (systems/kustomize). Extends 7 pages: systems/redpanda-connect (new GitOps deployment section + frontmatter + Seen-in + Related), systems/argocd (multi-source + Helm+Kustomize + runtime-API-tension sections), concepts/gitops (canonical application-tier Seen-in), systems/helm (Kustomize-composition canonical Seen-in), systems/kubernetes + systems/prometheus + systems/grafana (frontmatter sources). Tier-3 borderline include on vocabulary-canonicalisation grounds — tutorial-voice pedagogy with architecture density ~30-40% concentrated in the standalone/streams comparison table + Argo CD multi-source Application spec + Kustomize-wraps-Helm with --enable-helm precondition + Streams-mode REST API anti-pattern framing. Zero production numbers (no fleet sizes, no latencies, no customer references), no operator-path comparison (the 2025-05-06 K8s guide covers that), no mention of Topic / User CRDs for GitOps-compatible topic provisioning beyond name-check, no external-secrets-manager integration demonstrated. Canonical wiki counterpart to the 2025-05-06 A guide to Redpanda on Kubernetes (Operator path) — this post is the Helm + Argo CD path.
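
The ConfigMap hash rollout named in this entry works because the ConfigMap's name is derived from its content: editing a pipeline file yields a new name, the workload's reference changes, and a rolling restart follows. A minimal sketch of that idea is below. The hashing scheme (SHA-256 truncated to 10 hex characters) is an assumption for illustration and is not Kustomize's actual algorithm; `hashed_configmap_name` is a hypothetical helper.

```python
import hashlib


def hashed_configmap_name(base_name, files):
    """Return a content-addressed ConfigMap name.

    files: dict mapping file name -> file content (both str).
    The same content always yields the same name; any edit yields a new one.
    """
    digest = hashlib.sha256()
    for name in sorted(files):  # sort so dict ordering does not change the hash
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(files[name].encode("utf-8"))
        digest.update(b"\0")
    return f"{base_name}-{digest.hexdigest()[:10]}"
```

A pod template that references the old name no longer matches after an edit, which is what turns a Git commit into a rollout without any imperative restart step.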

  • 2026-01-06 — Build a real-time lakehouse architecture with Redpanda and Databricks. Tech-talk recap post (unsigned, ~1,100 words) summarising the joint Redpanda + Databricks tech talk "From Stream to Table" with speakers Matt Schumpert (Redpanda) + Jason Reed (Databricks, formerly on Netflix's data team). Walks the historical arc Hadoop-era data lakes → governance sprawl → Iceberg (Netflix-originated) → file-based-catalog era → REST catalog standardisation → Redpanda Iceberg Topics → Unity Catalog governance hub. Two load-bearing slogans canonicalise wiki-already-covered primitives at joint-vendor altitude: Schumpert — "The goal of this partnership is to remove the artificial line between real-time data and analytical data." + Redpanda unsigned — "the stream is the table" / "Streaming data is analytics-ready by default." Jason Reed supplies the Netflix-origin disclosure + consumer-side corroboration "The data shows up already structured, already governed, and already queryable." Three-system labour division verbatim: "Redpanda delivers real-time performance and reliability at scale. Iceberg provides an open, transactional table format optimized for analytics. Unity Catalog adds governance, optimization, federation, and lifecycle management across the entire system." Unity-Catalog-specific integration disclosure verbatim (Redpanda registers tables, manages schema updates, deletes tables, handles full lifecycle). Zero net-new concepts / patterns / systems — every primitive named is already canonicalised on the wiki (Iceberg + REST catalog + Iceberg topic + Unity Catalog + Bronze-sink pattern + broker-native-catalog-registration pattern all pre-exist). Value is at the joint-vendor-framing + historical-arc + Netflix-origin-disclosure altitudes. 0 new pages, 10 extensions (6 Seen-in additions + 4 frontmatter sources). Tier-3 borderline include on historical-framing + Netflix-origin + joint-vendor grounds; architecture content ~50% of body; zero production numbers.

  • 2025-11-06 — Redpanda 25.3 delivers near-instant disaster recovery and more. Redpanda 25.3 release preview post covering four headline features across three architectural axes. Four load-bearing canonicalisations the wiki had previously gapped: (1) Shadowing: "a fully functional, hot-standby clone of your entire Redpanda cluster — topics, configs, consumer group offsets, ACLs, schemas — the works!" — architecturally distinct from both MirrorMaker2 and the prior Redpanda Migrator ("No MirrorMaker 2 or Redpanda Migrator connectors are used under the hood"). Three structural properties: broker-internal (not Kafka Connect-based); offset-preserving (byte-for-byte, with source-identical offsets — removes MM2's offset-translation-map client-failover cost); asynchronous. RPO/RTO in seconds ("limited only by timeout settings for producers and consumers"). Canonical pattern: patterns/offset-preserving-async-cross-region-replication (composed with hot-standby cluster for DR). (2) Cloud Topics (beta) — per-topic storage-substrate choice within a single cluster: record data goes "straight through and written to cost-effective object storage (S3/ADLS/GCS) while topic metadata is managed in-broker — replicated via Raft for high availability". "Virtually eliminates the cross-AZ network traffic associated with data replication" — the feature's load-bearing cost claim, canonicalised as concepts/cross-az-replication-bandwidth-cost. Motivated by the latency-critical vs latency-tolerant workload distinction (payments/trading/cybersecurity vs observability/compliance/model-training). Positioned against Confluent's "Kora-powered … standard/dedicated … Freight … plus separate Confluent WarpStream engine (BYOC)" multi-cluster shape. Canonical pattern: patterns/per-topic-storage-tier-within-one-cluster.
(3) Iceberg Topics + Google BigLake metastore — Redpanda 25.3 adds GCP's managed lakehouse catalog to the REST catalog sync axis, completing the set with Unity Catalog / Snowflake Open Catalog (Polaris) / AWS Glue / BigLake. BigQuery now discovers streaming-produced Iceberg tables without CREATE EXTERNAL TABLE DDL; Dataplex provides governance. Complements the prior file-based-catalog shape from the 2025-05-13 BYOC beta post. (4) MSSQL CDC for Redpanda Connect: microsoft_sql_server_cdc extends the Redpanda Connect CDC family to five source-database engines (Postgres / MySQL / MongoDB / Spanner / SQL Server). Rides on MSSQL's native change tables. Available in Redpanda Connect 4.67.5 (enterprise). Vendor benchmark: ~40 MB/s ingest + 3:15 initial snapshot on a 5M-row table vs ~14.5 MB/s / 8:04 for an unnamed alternative. Fits CDC driver ecosystem framing. 11 new canonical wiki pages: source + 5 systems (systems/redpanda-shadowing, systems/redpanda-cloud-topics, systems/redpanda-connect-mssql-cdc, systems/microsoft-sql-server, systems/google-biglake) + 4 concepts (concepts/offset-preserving-replication, concepts/broker-internal-cross-cluster-replication, concepts/cross-az-replication-bandwidth-cost, concepts/latency-critical-vs-latency-tolerant-workload) + 3 patterns (patterns/offset-preserving-async-cross-region-replication, patterns/hot-standby-cluster-for-dr, patterns/per-topic-storage-tier-within-one-cluster). Extends 9 pages: concepts/mirrormaker2-async-replication (new Shadowing-displacement section + Seen-in), concepts/rpo-rto (new seconds-RPO streaming shape section + Seen-in), concepts/change-data-capture (MSSQL fifth-engine Seen-in), concepts/iceberg-catalog-rest-sync (BigLake as fourth managed REST catalog), patterns/cdc-driver-ecosystem (MSSQL extension), patterns/tiered-storage-to-object-store (Cloud Topics as per-topic-granularity variant), systems/redpanda (new 25.3 section), systems/redpanda-connect (MSSQL CDC section), systems/redpanda-iceberg-topics (BigLake section), systems/google-bigquery (BigLake-as-REST-catalog-alternative section). Tier-3 borderline include on vocabulary-canonicalisation grounds — launch/announcement voice, zero production numbers beyond the vendor MSSQL benchmark, ambiguous GA/beta status for Shadowing, but four vocabulary primitives genuinely missing from prior wiki coverage (offset-preserving replication, broker-internal cross-cluster replication, cross-AZ replication bandwidth cost, latency-critical vs latency-tolerant workload classification) plus two net-new features-as-systems (Shadowing, Cloud Topics) plus one net-new CDC engine (SQL Server). Architecture content ~50-60% of body. Cross-source continuity: companion to 2025-02-11 HA stretch-clusters (Shadowing extends the Redpanda DR axis from the two-point stretch/MM2 dichotomy to a three-point stretch/Shadowing/MM2 axis); companion to 2025-03-18 CDC connectors (MSSQL extends the Redpanda Connect CDC engine family from four to five); companion to 2025-04-07 Iceberg Topics GA (BigLake extends the REST-catalog axis from three managed catalogs to four).
Caveats: launch-voice; Shadowing mechanism under-specified (wire protocol, conflict resolution, DR-drill mechanics, reverse-replication for failback — all elided); Cloud Topics latency profile undisclosed; cross-AZ-cost claim unquantified; MSSQL CDC benchmark alternative unnamed ("alternative hosted Kafka + CDC service"); MSSQL CDC topology scope not enumerated (Always On AG, mirroring, log shipping unstated); BigLake integration mechanism unwalked; Confluent foil comparison doesn't disclose Kora's own tiered storage capabilities; 25.3 release date not given ("coming soon"); unsigned (Redpanda default attribution).

  • 2025-10-28 — Governed autonomy: The path to enterprise Agentic AI. Companion governance-framing post published the same day as Gallego's Introducing the Agentic Data Plane launch; unsigned, shorter (~850 words), marketing-voice restatement of the ADP vision focused on the governance substrate. Two canonical new wiki patterns filling governance-pattern-name gaps the 2025-10-28 launch-post sibling left implicit: (1) Agentic Access Control (AAC) — verbatim: "ADP embeds Agentic Access Control (AAC), an evolution of modern access control concepts tailored to the needs of an agentic workforce. Agents never hold long-lived credentials. Every prompt, action, and output is auditable, replayable, and policy-checked before and after I/O, empowering enterprises to grant AI agents fine-grained, temporary access to sensitive data without losing oversight." Three load-bearing properties: no-long-lived-credentials + per-call-policy-before-and-after-I/O + fine-grained-temporary-access. Composition of three pre-canonicalised substrates (concepts/short-lived-credential-auth, concepts/audit-trail, per-call policy enforcement) specialised for the agent audience. Complements the pre-wired OBO authorization pattern — OBO is the who-is-the-caller mechanism; AAC is the what-policy-applies-to-the-call mechanism. (2) Durable event log as agent audit envelope — verbatim: "The ADP treats every agent interaction as a first-class durable event: prompts, inputs, context retrieval, tool calls, outputs, and actions are captured for analysis, compliance, and replay." Six event classes named (prompt + input + context retrieval + tool call + output + action); one log with N views (audit + lineage + replay + SLO + tracing). Applies log-as-truth at the agent-interaction altitude. A2A protocol first-named alongside MCP as open standards (not unpacked). 3 new canonical wiki pages: source + 2 patterns (AAC + durable-event-log-as-envelope).
Extends 10 pages: systems/redpanda-agentic-data-plane (re-sourced as dual-sourced from both 10-28 posts with companion-pair framing), systems/oxla (dual-sourced), systems/redpanda + systems/redpanda-connect + systems/redpanda-byoc + systems/redpanda-agents-sdk + systems/model-context-protocol (frontmatter sources), concepts/autonomy-enterprise-agents + concepts/governed-agent-data-access + concepts/data-plane-atomicity + concepts/digital-sovereignty + concepts/short-lived-credential-auth + concepts/audit-trail + concepts/data-lineage + concepts/log-as-truth-database-as-cache + patterns/mcp-as-centralized-integration-proxy (all with new Seen-in entries canonicalising the governance-altitude framing). Tier-3 borderline include on vocabulary-canonicalisation grounds — architecture density ~30% on short body; passes because AAC + event-log-as-audit-envelope + ADP + Oxla are vocabulary gaps the pre-wired sibling post didn't fully close. Caveats: zero AAC mechanism depth (no IdP / token-exchange / policy-engine); audit + lineage conflated as "unified audit and lineage envelope" at vision altitude; exactly-once-across-tool-chains asserted without mechanism; replay-for-compliance silent on LLM non-determinism; no byline. Cross-source continuity: dual-post launch pair with Introducing the Agentic Data Plane (Gallego-signed founder-voice productization + Oxla acquisition + four-layer composition + three-shift narrative + OBO-IdP) — together the two posts bracket ADP's canonical wiki definition from architecture + acquisition disclosure (Gallego post) to governance-pattern-naming + audit-envelope architectural claim (this post).
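
The three AAC properties named in this entry (no long-lived credentials, policy checks before and after I/O, every step recorded) can be sketched generically. The post discloses no mechanism, so everything below is hypothetical: the `Credential` shape, the policy callables, and the event names are illustrative assumptions and not Redpanda's API.

```python
from dataclasses import dataclass, field


@dataclass
class Credential:
    agent_id: str
    scope: str         # hypothetical: the single resource this grant covers
    expires_at: float  # short-lived by construction

    def valid(self, now):
        return now < self.expires_at


@dataclass
class AuditLog:
    events: list = field(default_factory=list)

    def record(self, kind, **details):
        # Append-only: every step of the interaction becomes an event.
        self.events.append({"kind": kind, **details})


def guarded_call(cred, resource, tool, audit, now, pre_policy, post_policy):
    """Run tool(resource) only if policy passes before and after the I/O."""
    audit.record("tool_call", agent=cred.agent_id, resource=resource)
    if not cred.valid(now) or not pre_policy(cred, resource):
        audit.record("denied", agent=cred.agent_id, resource=resource)
        raise PermissionError(f"{cred.agent_id} may not access {resource}")
    output = tool(resource)
    if not post_policy(cred, output):
        audit.record("output_blocked", agent=cred.agent_id, resource=resource)
        raise PermissionError("output rejected by post-I/O policy")
    audit.record("output", agent=cred.agent_id, resource=resource)
    return output
```

The post-I/O check is the part that distinguishes this shape from plain request authorization: a call that was allowed can still have its result withheld, and both outcomes land in the same log.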

  • 2025-10-28 — Introducing the Agentic Data Plane. Founder-voice productization follow-up to Gallego's 2025-04-03 autonomy essay. Names the commercial shape of enterprise autonomy as the Agentic Data Plane (ADP), "a unified runtime and control plane that safely exposes enterprise data to AI agents", composing four layers: (A) streaming (existing Redpanda broker for HITL + durable model replay + observability); (B) query engine — newly-acquired Oxla, a C++ distributed query engine with PostgreSQL wire protocol + separated compute-storage + Iceberg-native (early preview mid-December 2025); (C) 300+ connectors (systems/redpanda-connect) rebadged as ADP integration layer; (D) net-new global policy + observability layer. Governance-first framing inverts typical agent marketing verbatim: "The fear from CIOs is not the code of the agent itself, it is governance. In simple terms, it is access controls: can I trust that data is accessed by the right things? And observability: when things go wrong, can I understand what happened?" — canonicalised as concepts/governed-agent-data-access (two-axis design surface). First shipped governance feature: Remote MCP + authentication + authorization for OBO (on-behalf-of) workloads with IdP integration — canonicalised as patterns/on-behalf-of-agent-authorization. Structural foil verbatim: "the new digital workforce often interacts with systems created in the API era of root-token permissions, with all-or-nothing as the norm." Three-shift architectural narrative: compute-storage separation → lakehouse → agentic data plane. Open-protocols commitment: MCP, A2A, PostgreSQL wire, durable log (Kafka), Iceberg. Things shipped: Remote MCP + OBO, knowledge-based agent templates (Git/Jira/GDrive), declarative Agent Runtime, Redpanda Streaming for HITL. Things acquired (rolling integration): Oxla. Things doubled down on: governance (access controls + observability).
5 new canonical wiki pages: source + 2 systems (systems/redpanda-agentic-data-plane, systems/oxla) + 1 concept (concepts/governed-agent-data-access) + 1 pattern (patterns/on-behalf-of-agent-authorization). Extends 6 pages: systems/redpanda (new ## Agentic Data Plane (2025-10-28 productization) section), systems/redpanda-agents-sdk (productization-into-ADP section + ADP as product-tier-above-SDK framing), systems/model-context-protocol (frontmatter + related extended for ADP-era MCP usage with OBO), patterns/mcp-as-centralized-integration-proxy (frontmatter extended), concepts/autonomy-enterprise-agents (new productization section + ADP-as-commercial-packaging framing), companies/redpanda (this entry). Tier-3 borderline include on vocabulary-canonicalisation grounds — marketing-heavy launch post, zero production numbers, but three wiki-load-bearing canonicalisations (ADP-as-product-shape, Oxla-as-system, governed-agent-data-access concept + OBO pattern). Caveats: launch-marketing voice; Oxla mechanism-depth thin (planner/executor/catalog model undisclosed); OBO disclosed as product-line-item not mechanism (token flow, consent vocabulary, downstream-system integration surface not disclosed); A2A protocol named but not described; no competitive comparison with Databricks Unity AI Gateway / AWS Bedrock Agents / Snowflake Cortex. Gallego-signed ("handcrafted by a hooman. .alex").

  • 2025-10-02 — Real-time analytics at scale: Redpanda and Snowflake Streaming. Vendor benchmark of a 9-node Redpanda + 12-node Redpanda Connect → single Snowflake table pipeline via the snowflake_streaming output connector. Headline: 3.8 billion 1 KB AVRO messages at 14.5 GB/s, P50 ≈ 2.18 s / P99 ≈ 7.49 s end-to-end — exceeds Snowflake's documented 10 GB/s per-table ceiling by 45%. Disaggregated latency attribution: 86% of the P99 budget (~6.44 s) is in the Snowpipe-Streaming upload / register / commit path, not in Redpanda read or transport. Four canonical tuning insights: (1) AVRO over JSON = ~20% throughput uplift (patterns/binary-format-for-broker-throughput); (2) count-based batch triggers beat byte-size triggers on the hot path because byte-size requires per-message size computation (patterns/count-over-bytesize-batch-trigger); (3) build_parallelism tuned to (cores − small reserve) — 40 on 48-core nodes — as the Snowpipe-Streaming commit-path latency knob (concepts/build-parallelism-for-ingest-serialization); (4) Snowpipe-Streaming channels are the per-table parallelism unit controlled by channel_prefix × max_in_flight with a hard ceiling of 10,000 channels per table — exceeding it surfaces as "the Snowpipe API screaming at us" (concepts/snowpipe-streaming-channel). Decisive scaling dimension: intra-node input/output parallelism via the broker primitive — running many parallel pipelines within one Connect process to saturate per-node resources, canonicalised as patterns/intra-node-parallelism-via-input-output-scaling. Control-group (Redpanda → drop sink) ceiling 15.1 GB/s at 8.38 ms P99; Snowflake commit added ~1 min wall-clock and ~7.5 s P99. Public-internet transport; PrivateLink would reduce further. Borderline-case include on architectural-disclosure grounds: real operational numbers (cluster topology, P50/P99 latencies, per-step attribution) and four first-party tuning findings at mechanism depth.
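
The headline ratios and two of the tuning rules in this entry are simple arithmetic, reproduced below as a check. The helper names, the 8-core reserve, and the example prefix count are assumptions chosen for illustration; only the input figures (7.49 s, 6.44 s, 14.5 GB/s, 10 GB/s, 48 cores, 40, 10,000 channels) come from the entry.

```python
MAX_CHANNELS_PER_TABLE = 10_000  # hard ceiling quoted in the entry


def build_parallelism(cores, reserve):
    """Cores minus a small reserve, as in the '40 on 48-core nodes' finding."""
    return max(1, cores - reserve)


def channel_count(prefixes, max_in_flight):
    """Channels opened against one table: distinct prefixes x max_in_flight."""
    return prefixes * max_in_flight


def within_channel_ceiling(prefixes, max_in_flight):
    return channel_count(prefixes, max_in_flight) <= MAX_CHANNELS_PER_TABLE


# Share of the end-to-end P99 spent in the Snowpipe Streaming commit path.
p99_total_s = 7.49
p99_snowpipe_s = 6.44
commit_share = p99_snowpipe_s / p99_total_s

# Throughput relative to the documented per-table ceiling.
measured_gb_s = 14.5
documented_ceiling_gb_s = 10.0
over_ceiling = measured_gb_s / documented_ceiling_gb_s - 1

print(f"commit-path share of P99: {commit_share:.0%}")
print(f"throughput above documented ceiling: {over_ceiling:.0%}")
print(f"build parallelism on 48 cores with 8 reserved: {build_parallelism(48, 8)}")
```

Checking `within_channel_ceiling` before deployment is cheaper than discovering the limit from API errors under load.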

  • 2025-06-24 — Why streaming is the backbone for AI-native data platforms. Thought-leadership / vision essay originally syndicated to The New Stack, positioning streaming as the "power grid" of an AI-native data platform. Canonicalises four architectural propositions the wiki had referenced implicitly: (1) streaming-as-backbone of an agile data platform (producer / consumer decoupling + dynamic source/sink add + real-time reactivity) — new concept concepts/streaming-as-agile-data-platform-backbone; (2) CDC fan-out from a single stream to many consumers (search, analytics, vector index, reactive agent) with the user_plans downgrade-trigger worked example and explicit WAL-cleanup-strain trade-off — new pattern patterns/cdc-fanout-single-stream-to-many-consumers; (3) Replayability for iterative RAG — long-lived tiered-storage streams let you re-run historical data through different embedding models or chunking strategies without re-extracting from source — new concept concepts/stream-replayability-for-iterative-pipelines; (4) Open table format = freedom to pick the query engine — Iceberg as the escape hatch from warehouse lock-in, with Snowflake + BigQuery sharing the same dataset via Apache Polaris REST catalog without storing data twice. Also canonicalises schema registry as CI/CD / IaC artefact (PR-time validation, code-owned contracts) and discloses OpenTelemetry context propagation via Kafka record headers as the streaming-boundary analogue of HTTP-header propagation — extending systems/opentelemetry from the Fly.io application-RPC framing. Also names stateless transformation at broker ingress (compliance / masking) and the AI data flywheel (usage → insights → product → usage). 3 canonical new wiki pages: concepts/streaming-as-agile-data-platform-backbone, concepts/stream-replayability-for-iterative-pipelines, patterns/cdc-fanout-single-stream-to-many-consumers.
Extends 7 pages: concepts/change-data-capture (new Seen-in canonicalising fan-out topology + WAL-cleanup trade-off + user_plans worked example), concepts/schema-registry (new Seen-in canonicalising CI/CD-IaC-artefact framing — registry as API contract between teams, equivalent to HTTP API contract for sync services), systems/opentelemetry (new Seen-in canonicalising Kafka-record-headers carrier for context propagation at the streaming boundary), patterns/streaming-broker-as-lakehouse-bronze-sink (new Seen-in at vision altitude extending the 2025-01 pedagogy altitude and 2025-04-07 GA-release altitudes), patterns/tiered-storage-to-object-store (new Seen-in canonicalising third axis — economic precondition for replayability — beyond prior capacity + decommission-speed framings), systems/apache-iceberg (new Seen-in canonicalising open-format-escape-hatch from warehouse lock-in + Polaris REST catalog), systems/redpanda-iceberg-topics (new Seen-in at backbone altitude). Tier-3 borderline include. Redpanda vendor voice with heavy product-link density (≈30 blog cross-links to own marketing pages), but architecture content is ~50% of ~1,700-word body and the four propositions above are structurally load-bearing vocabulary the wiki did not previously canonicalise (the backbone framing, the fan-out-from-single-CDC-stream framing, the replayability-for-RAG framing, and the schema- registry-as-CI/CD-artefact framing were all gaps). Passes on vocabulary-canonicalisation grounds even with the marketing- adjacent voice. Cross-source continuity: companion to Gallego 2025-04-03 autonomy essay from the same quarter (Gallego = streaming + MCP + Python SDK as agent substrate; this post = streaming + CDC + Iceberg as AI-data-platform substrate — the agent-substrate and data-substrate halves of the same vision, framed for complementary audiences). 
Companion to the 2025-01-21 Medallion architecture post at vision altitude vs mechanism altitude. Companion to the 2025-03-18 CDC connectors post — that post canonicalises the CDC reader half of the fan-out pattern; this post canonicalises the consumer-fanout half. Caveats recorded: zero production numbers (no fleet sizes, no latency distributions, no before/after quantitative wins between batch-ETL and streaming); qualitative claims only ("much more effective", "saves you from costly reprocessing"); Iceberg vs Snowpipe-Streaming trade-off named but uncompared on cost / ecosystem / governance; CDC-WAL-cleanup nuance name-only ("delaying WAL cleanup" with no slot-management or retention mechanics); AIOps name-drop without mechanism; no cross-vendor comparison (Kafka / Pulsar / Kinesis / Pub/Sub not compared); unsigned (Redpanda default attribution); originally syndicated to The New Stack as "the power grid for AI-native data platforms" — wiki-version ingest uses the canonical redpanda.com URL.
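
The context-propagation idea in this entry (trace context carried in Kafka record headers, the way HTTP carries it in request headers) can be sketched without any client library. The sketch assumes the W3C Trace Context `traceparent` format and models record headers as a list of `(str, bytes)` pairs; the essay names neither detail, and production code would normally use an OpenTelemetry SDK propagator instead of these hypothetical helpers.

```python
import re
import secrets

# version "00", 16-byte trace id, 8-byte parent span id, 1-byte flags (all hex)
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled=True):
    trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def inject(headers, traceparent):
    """Producer side: return headers with exactly one traceparent entry."""
    kept = [(k, v) for k, v in headers if k != "traceparent"]
    return kept + [("traceparent", traceparent.encode("ascii"))]


def extract(headers):
    """Consumer side: parse the traceparent header, or return None."""
    for key, value in headers:
        if key == "traceparent":
            match = TRACEPARENT.match(value.decode("ascii"))
            if match:
                return {
                    "trace_id": match.group(1),
                    "parent_span_id": match.group(2),
                    "flags": match.group(3),
                }
    return None
```

A consumer that starts its span with the extracted `trace_id` joins the producer's trace, so one request can be followed across the asynchronous hop.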
  • 2025-06-21 — Behind the scenes: Redpanda Cloud's response to the GCP outage — Production-incident retrospective on the 2025-06-12 GCP global outage from Redpanda Cloud's perspective. ~3-hour incident window (18:41–21:38 UTC); SEV4 incident closed with no customer impact across hundreds of clusters. Load-bearing disclosures: (1) cell-based architecture as an explicit Redpanda Cloud product principle — single-binary broker + per-customer cluster, "Redpanda Cloud clusters do not externalize their metadata or any other critical services"; (2) butterfly effect named as first-class system-design primitive: "GCP's seemingly innocuous automated quota update triggered a butterfly effect that no human could have predicted"; (3) feedback-control-loop-guarded phased rollouts as the change-management discipline — "we try to close our feedback control loops by watching Redpanda metrics as the phased rollout progresses and stopping when user-facing issues are detected"; (4) hedged observability stack — self-hosted data + third-party UI was degraded-but-usable during cascading outage, saving "exponentially bigger cost ramifications" of a vendor failover; (5) SLA substrate decomposition — 99.99% SLA + ≥99.999% SLO decomposes to six concrete choices (replication ≥3, local NVMe primary + async tiered storage, redundant API/Schema Registry/HTTP Proxy, no external critical-path dependencies except PSC, continuous chaos + load testing, feedback-gated phased rollouts); (6) tiered storage as fallback, not primary — elevated GCS PUT error rates did not impact write availability because primary data is on local NVMe; (7) deliberate disk reserve (unused + used-but-reclaimable) absorbs flush backlog during object-store stress. 
Canonicalises four new patterns: patterns/cell-based-architecture-for-blast-radius-reduction, patterns/preemptive-low-sev-incident-for-potential-impact (19:08 UTC SEV4 declared before customer impact observed), patterns/proactive-customer-outreach-on-elevated-error-rate (20:56 UTC outreach to customers with highest tiered-storage error rates), and patterns/hedged-observability-stack. One affected cluster (staging, us-central-1, lost node + ~2h replacement) — out of hundreds; customer's production cluster unaffected. Closing thoughts draw a CrowdStrike parallel and argue for "increased adoption of control theory in our change management tools" as an industry-wide reliability practice. Tier-3 on-scope — production-incident retrospective (not marketing) with architecture-density ~60% across timeline + substrate decomposition + six-mitigation reliability practice list. Opens the Redpanda incident-retrospective axis on the wiki. Companion to the 2025-04-03 Gallego autonomy essay (which canonicalised the Data Plane Atomicity invariant) by instantiating the cell- based-architecture deployment shape that operationalises it. Caveats: unsigned, vendor-voice, hindsight-bias acknowledged; single-affected-cluster mechanism underspecified ("uncommon interaction between internal infrastructure components"); no quantitative tiered-storage error-rate metrics; third-party dashboarding/alerting vendor + cloud-marketplace vendor both unnamed; disk-reserve sizing policy undisclosed; PSC exception to no-critical-path-dependencies load-bearing but not walked; phased-rollout-with-feedback-control implementation details absent.
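The feedback-gated phased rollout described in this entry can be sketched as a loop that applies a change one phase at a time and stops as soon as a watched metric crosses a threshold. This is a minimal illustration only: the phase names, the `apply_change` and `error_rate` callbacks, and the 1% threshold are all hypothetical and are not taken from the post, which does not disclose implementation details.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RolloutResult:
    completed_phases: List[str]
    halted_at: Optional[str]  # phase where the rollout stopped, or None


def phased_rollout(
    phases: List[str],
    apply_change: Callable[[str], None],
    error_rate: Callable[[str], float],
    max_error_rate: float = 0.01,  # hypothetical threshold
) -> RolloutResult:
    """Apply a change phase by phase, closing the loop on a metric each time."""
    completed: List[str] = []
    for phase in phases:
        apply_change(phase)
        # Feedback step: read the user-facing metric before widening the rollout.
        if error_rate(phase) > max_error_rate:
            return RolloutResult(completed, halted_at=phase)
        completed.append(phase)
    return RolloutResult(completed, halted_at=None)


# Hypothetical observed error rates per phase.
observed = {"canary": 0.001, "10%": 0.002, "50%": 0.05, "100%": 0.0}
applied: List[str] = []
result = phased_rollout(list(observed), applied.append, observed.__getitem__)
```

In this run the rollout halts at the `50%` phase and never touches `100%`, which is the blast-radius-limiting behaviour the entry attributes to closing the control loop.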

  • 2025-06-17 — Introducing multi-language dynamic plugins for Redpanda Connect — Launch of the dynamic-plugin framework in Redpanda Connect v4.56.0 (Beta, Apache 2.0). Breaks the Go-only, compile-into-the-binary plugin constraint: plugins now run as separate OS subprocesses communicating with the host Redpanda Connect engine over gRPC on a Unix domain socket, with the cross-process protocol "closely mirroring the existing interfaces defined for plugins within Redpanda Connect's core engine, Benthos". Five new canonical wiki pages: systems/redpanda-connect-dynamic-plugins + two concepts (concepts/subprocess-plugin-isolation: "plugins run in separate processes, so crashes won't take down the main Redpanda Connect engine"; concepts/batch-only-component-for-ipc-amortization: "we use batch components exclusively to amortize the cost of cross-process communication" — only BatchInput / BatchProcessor / BatchOutput types are exposed across the gRPC boundary) + two patterns (patterns/grpc-over-unix-socket-language-agnostic-plugin as the architectural shape; patterns/compiled-vs-dynamic-plugin-tradeoff capturing the explicit "compiled plugins for performance-critical, dynamic plugins for flexibility and language choice" guidance — dynamic plugins are additive, not a replacement for compiled plugins). Language SDKs: Go (type-safe, for existing Redpanda Connect developers) and Python (headline target — opens the streaming substrate to PyTorch / TensorFlow / Hugging Face / LangChain / NumPy / SciPy for real-time ML inference inside the pipeline). Motivating use case in the post: a Python processor plugin running a pre-trained BERT model from Hugging Face for sentiment analysis on streaming customer feedback. Launch is Apache 2.0 — the plugin framework itself is open-source; connectors built on top may carry different licenses (contrast: 2025-03-18 CDC input connectors were Enterprise-gated). 
Extends systems/redpanda-connect with a new ## Dynamic plugins (2025-06, Beta, Apache 2.0) section. Tier-3 borderline include: launch / marketing voice with "We're excited..." framing, but architecture content is real — core technical disclosure is the subprocess + gRPC + Unix-socket + batch-only amortization design. Passes on vocabulary-canonicalisation grounds — four plugin-architecture primitives (subprocess isolation, batch-only IPC amortization, gRPC-over-Unix-socket language-agnostic plugin shape, compiled-vs-dynamic tradeoff) missing from prior wiki coverage. Caveats: Beta stability only (v4.56.0; protocol stability across minor versions not guaranteed); no gRPC .proto published inline (the protocol "closely mirrors" Benthos interfaces but implementors must consult the SDK source); no performance numbers (no throughput delta vs compiled plugins, no cross-process hop p99, no reference-workload benchmarks); no process-lifecycle details (crash recovery, socket cleanup, supervisor shape unspecified); no horizontal-scaling model for CPU-bound plugins (one subprocess per plugin, no pooling). Opens the Redpanda-Connect extensibility-framework axis on the wiki — prior Redpanda Connect coverage focused on the shipped connector catalog (CDC input connectors, MCP-tool surface); this is the first ingest to canonicalise the developer-surface / plugin-architecture axis.
  • 2025-05-20 — Implementing FIPS compliance in Redpanda — Configuration-walkthrough disclosure of broker-level FIPS 140 compliance in self-managed Redpanda clusters on RHEL. Opens the Redpanda security-substrate axis on the wiki. Three load-bearing canonicalisations: (1) OpenSSL 3.0.9 as the FIPS 140-2 validated cryptographic module consumed by both the redpanda broker binary and the rpk CLI, with OpenSSL 3.1.2 (FIPS 140-3 validated) on the late-2025 upgrade roadmap ahead of 140-2 sunset. (2) Three-state fips_mode config dial (disabled / enabled / permissive) distinguishing production (OS-FIPS + broker-FIPS), non-regulated (no FIPS), and development (broker-FIPS-only, non-production) deployment shapes. permissive is explicitly scoped out of compliance claims — entropy sourcing from a non-FIPS OS breaches the boundary even with broker-level controls. (3) Broker-startup fail-fast as the enforcement shape: "Redpanda will log an error and exit if the underlying operating system isn't properly configured." Structurally stronger than the logging-then-enforcement progressive-rollout shape — regulated workloads have no warn-only regime by design. Extends concepts/fips-cryptographic-boundary: the Redpanda instance surfaces at streaming-broker-startup / validated-module altitude where the boundary manifests as a two-package artefact split (redpanda-fips + redpanda-rpk-fips co-installable with base packages) + three-state config dial + startup enforcement gate — a different architectural layer from the GitHub 2025-09-15 PQ-SSH instance where the boundary manifests as a filtered primitive-advertisement list on the SSH wire. Deployment scope at publication (2025-05-20): self-managed RPM / Debian on RHEL only; Redpanda Cloud, Kubernetes deployments, and Redpanda Connect on roadmap — canonical wiki instance of the FIPS boundary being narrower than a product's full deployment surface because validated-module distribution is deployment-shape-specific. 
Redpanda Ansible Collection accepts enable_fips=true + fips_mode=enabled opt-in variables. Batch-skip override per explicit user full-ingest instruction; raw frontmatter carried ingested: true + skip_reason: batch-skip — zero architecture signals in 7896-char body (pure marketing). Post is short (~1,100 words), configuration-walkthrough voice, but canonicalises three compliance-substrate primitives missing from wiki's prior FIPS coverage (anchored only on the GitHub PQ-SSH rollout). Caveats: no wire-protocol disclosure (which ciphers/KEX/MACs filtered in FIPS mode not enumerated); FIPS 140-3 transition schedule underspecified (no formal NIST 2026-02-22 sunset date); permissive failure surface beyond entropy not enumerated; non-RHEL OS coverage elided; license-gated; no byline; no benchmarks on FIPS-mode overhead.
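The three-state dial and the startup fail-fast gate summarised in this entry can be modelled in a few lines. The sketch below is illustrative only and is not Redpanda's implementation: `check_fips_startup`, `StartupError`, and `read_os_fips_flag` are hypothetical names, and reading `/proc/sys/crypto/fips_enabled` is simply the usual way to detect kernel FIPS mode on Linux, assumed here for the example.

```python
from enum import Enum


class FipsMode(Enum):
    DISABLED = "disabled"
    ENABLED = "enabled"
    PERMISSIVE = "permissive"


class StartupError(RuntimeError):
    """Raised where a real broker would log an error and exit."""


def read_os_fips_flag(path: str = "/proc/sys/crypto/fips_enabled") -> bool:
    # Assumption: the Linux kernel exposes "1" in this file when FIPS mode is on.
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        return False


def check_fips_startup(mode: FipsMode, os_fips_enabled: bool) -> bool:
    """Return True only if the deployment may claim FIPS compliance."""
    if mode is FipsMode.ENABLED:
        if not os_fips_enabled:
            # Fail fast: there is no warn-only regime for the enabled state.
            raise StartupError("fips_mode=enabled but the OS is not in FIPS mode")
        return True
    # disabled: no FIPS at all.
    # permissive: broker-side FIPS only, explicitly outside compliance claims.
    return False
```

Taking `os_fips_enabled` as a parameter keeps the gate testable without a FIPS-enabled host; a caller would pass `read_os_fips_flag()` in practice.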
  • 2025-05-13 — Getting started with Iceberg Topics on Redpanda BYOC — BYOC-beta extension of Iceberg Topics five weeks after 25.1 GA on Dedicated, with a GCS + BigQuery worked example. Three new primitives canonicalised: (1) the per-topic mode configuration surface: value_schema_id_prefix (Schema-Registry-wire-format producers → typed Iceberg table), value_schema_latest (latest-schema projection), key_value (schema-less BYTES + Kafka metadata); (2) the file-based catalog as a first-class alternative to REST catalog sync for engines (like BigQuery) that read Iceberg via metadata-pointer DDL; (3) the BYOC-data-ownership compound property — customer-owned bucket + broker-projected Iceberg + customer-owned query engine yields "full control of your Iceberg data with zero compromises". Read-side pattern: BigQuery CREATE EXTERNAL TABLE ... format = 'ICEBERG' on a GCS-hosted vN.metadata.json. Adjacent secondary disclosure: Redpanda BYOC doubles partition density per tier in 25.1 via per-partition memory efficiency improvements (Tier 1: 1,000 → 2,000; Tier 5: 22,800 → 45,600), canonicalised as concepts/broker-partition-density. Tutorial altitude with synthetic Protobuf SensorData generator via Redpanda Connect. Tier-3 borderline-on-scope: vendor tutorial, no production numbers, architecture content ~25-30% of body concentrated on the three new primitives + partition-density datum. Passes on vocabulary-canonicalisation grounds (topic-mode configuration, file-based catalog, and BYOC-data-ownership were all gaps in the wiki). Caveats: file-based-catalog mechanism underspecified vs object-store-catalog fallback from GA; partition-density 2× improvement mechanism unexplained; value_schema_id_prefix vs value_schema_latest vs key_value trade-offs elided; DLQ + schema-evolution not re-invoked in BYOC context; Protobuf-specific guidance thin; tier dimensions opaque.
  • 2025-05-06 — A guide to Redpanda on Kubernetes — Product-altitude guide to Redpanda's Kubernetes deployment evolution. Three load-bearing architectural claims: (1) Helm vs Redpanda Operator trade-off on five axes — managed upgrades + rollback, dynamic configuration (CRDs vs Helm-values redeploy), advanced health checks + metrics, lifecycle automation, multi-tenancy. Operator is the default recommendation; Helm chart retained for simpler deployments. (2) Two-to-one operator consolidation — Redpanda previously shipped separate operators for its internal Redpanda Cloud fleet and for customer-facing Self-Managed deployments; the 2025 unification merges them into a single operator (patterns/unified-operator-for-cloud-and-self-managed). (3) FluxCD bundling reversal — the customer operator initially bundled FluxCD internally to wrap the Helm chart; canonical wiki instance of the bundled-GitOps- dependency anti-pattern. Fix across three branches: v2.3.x FluxCD optional (spec.chartRef.useFlux) → v2.4.x (Jan 2025) FluxCD disabled by default → v25.1.x FluxCD and Helm-chart wrapping removed entirely. v25.1.x adopts the version-aligned compatibility scheme — operator/chart version matches Redpanda core version with ±1 minor window, retiring the compatibility matrix document. Introduces systems/redpanda-operator as a canonical wiki system and systems/fluxcd as a minimal page. Tier-3 batch-skip override: raw frontmatter carried ingested: true + skip_reason: batch-skip — marketing/tutorial slug pattern; overridden per explicit user full-ingest instruction. Architecture density ~40% on ~1,400-word body. Caveats: product-guide altitude, no production numbers, FluxCD-removal migration path underspecified, deprecation schedule opaque, unified-operator cutover mechanism not disclosed, multi-region K8s limitation (multi-AZ-only) not revisited. 
Closes a gap in the wiki's Kubernetes-operator corpus by canonicalising two anti-patterns (bundled GitOps, compatibility matrix) that generalise beyond Redpanda.
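The version-aligned compatibility scheme in this entry (operator/chart version matches the Redpanda core version within a ±1 minor window) reduces to a small comparison. The sketch below is one plausible reading, not the operator's actual check: `parse_major_minor` and `within_window` are hypothetical helpers, and treating a major-version change as always incompatible is a simplifying assumption the post does not confirm.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings like 'v25.1.3' or '25.1.x'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def within_window(operator_version: str, redpanda_version: str, window: int = 1) -> bool:
    """True when both versions share a major and minors differ by at most `window`."""
    op_major, op_minor = parse_major_minor(operator_version)
    rp_major, rp_minor = parse_major_minor(redpanda_version)
    # Simplification: versions on different majors are treated as out of window.
    return op_major == rp_major and abs(op_minor - rp_minor) <= window
```

For example, `within_window("v25.1.3", "25.2.1")` holds while `within_window("v25.1.3", "25.3.0")` does not, which is the kind of rule that lets a compatibility matrix document be retired.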
  • 2025-04-23 — Need for speed: 9 tips to supercharge Redpanda — Omnibus performance-tuning checklist covering nine tips across three dependency layers — infrastructure (NVMe, dedicated hardware with 95% resource budget, no noisy neighbors; enable broker-side write caching when NVMe isn't available), data architecture (partition skew as Amdahl's Law with three-pronged mitigation — sticky partitioner / keyed only when required / high-cardinality keys; don't compress compacted topics; use tiered storage for fast rebalance), and application design (producer batching, consumer fetch tuning matrix with fetch.min.bytes / fetch.max.wait.ms / max.partition.fetch.bytes / max.poll.records, offset-commit cost / save-button analogy / RPO-as-commit-frequency, client-side compression with ZSTD or LZ4 codec choice). Introduces concepts/keyed-partitioner, patterns/high-cardinality-partition-key, and patterns/client-side-compression-over-broker-compression. Tier-3 borderline-on-scope: vendor-blog checklist voice but substantive gap-filling across six previously uncanonicalised primitives. No author byline, no production numbers, no customer case study.
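The partition-skew point in this entry (throughput is bounded by the hottest partition, and high-cardinality keys spread load) can be checked with a small simulation. This is a sketch under stated assumptions: `partition_for` uses an MD5-based hash as a stand-in, not the murmur2 hash that Kafka-compatible clients typically use, and the key names and counts are made up for illustration.

```python
import hashlib
from collections import Counter
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration; real clients use a different hash function.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def skew(keys: List[str], num_partitions: int) -> float:
    """Ratio of the hottest partition's record count to the ideal even share."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    ideal = len(keys) / num_partitions
    return max(counts.values()) / ideal


PARTITIONS = 12
# Low cardinality: 12,000 records spread over only three distinct keys.
low_cardinality = ["tenant-a", "tenant-b", "tenant-c"] * 4000
# High cardinality: 12,000 records, each with its own key.
high_cardinality = [f"order-{i}" for i in range(12000)]

low_skew = skew(low_cardinality, PARTITIONS)
high_skew = skew(high_cardinality, PARTITIONS)
```

With three keys at most three of the twelve partitions receive traffic, so the hottest partition carries at least four times its even share, whereas the per-record keys land close to an even spread.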

  • 2025-04-07 — Redpanda 25.1: Iceberg Topics now generally available — GA release disclosure for Iceberg Topics across AWS, Azure, and GCP (framed as "first in industry" Kafka-Iceberg streaming solution GA on multiple clouds). Elaborates the 2025-01-21 pedagogy launch with nine disclosed properties beyond the preview framing. Four table-management capabilities: custom hierarchical bucketed partitioning (operator-controllable Iceberg transforms for query-side pruning); built-in dead-letter queues for schema-invalid records (keeps data-quality invariant without dropping batches); full Iceberg-spec-compliant schema evolution (adds/renames/deletes matching the Iceberg spec); automatic snapshot expiry as a broker-owned metadata-GC loop (retires the wiki's prior externalisation-cost caveat for the snapshot-expiry half; small-file compaction ownership remains open). Five catalog-integration capabilities: secure REST catalog sync via OIDC+TLS against Snowflake Open Catalog / Databricks Unity / AWS Glue; transactional writes via Iceberg's commit-protocol serialisation for safe concurrent multi-writer access; automatic table discovery and registration so downstream engines see new Iceberg-configured topics appear without manual CREATE TABLE; built-in object-store catalog fallback for deployments without a REST catalog; tunable workload management knob for the snapshot-vs-live-topic lag ceiling (making the commit-cadence lag floor an explicit operational parameter). Adjacent 25.1 features: native consumer group lag metrics (Prometheus-exposed, replacing a PromQL compute), Protobuf schema normalization in the Schema Registry, SASL/PLAIN authentication, unified Console+cluster identity with fine-grained RBAC, and FluxCD removal for Kubernetes deployments. 
Tier-3 borderline-on-scope: vendor launch post, but GA feature disclosure is architecturally substantive — retires two prior wiki caveats (snapshot-expiry ownership, Iceberg-spec-schema-evolution path) and canonicalises three new concepts + two new patterns. Architecture density ~40% on ~1,900-word body. Caveats: vendor framing throughout; "first in industry" unqualified; DLQ operational surface under-specified; transactional-write isolation level unstated; tunable workload management knob name / default / range not disclosed.

  • 2025-04-03 — Autonomy is the future of infrastructure — Alex Gallego's (founder/CEO) vision essay marking the $100M Series D + Redpanda Agents SDK preview launch. Frames the 20-year systems trajectory (single-node DB → managed SaaS → streaming/log substrate → Iceberg continuous-computation handshake → agent orchestration). Canonicalises Redpanda's founding premise "the truth is the log" (Kleppmann 2015), the send-model-to-data enterprise-AI thesis, the batch/streaming convergence framing, and the frontier-model + local-GPU-minion hybrid. Centerpiece: canonical founder- voice retrospective statement of Data Plane Atomicity as BYOC's central design tenet — "no deployment should be able to bring down any other deployment, including a control plane outage... No externalized consensus algorithm, secret managers, no external databases, no external offset recording service, or metadata look up as you are trying to write your data durably to disk." Reframes MCP from tool-description format to centralised integration proxy, with dynamic Redpanda Connect pipeline filtering (Bloblang + Starlark) as the future fine-grain-ACL mechanism. Introduces three new Redpanda systems on the wiki: systems/redpanda-byoc, systems/redpanda-agents-sdk, plus extends systems/redpanda-connect. Operational numbers: ~300 connectors, ~10× price-performance for fine-tuned small models, single-GPU inference for Llama3/Gemma3/DeepSeekV3/Phi-4, three-cloud BYOC (AWS/GCP/Azure) preview scope. Tier-3 borderline-on-scope; founder-voice vision essay + product-launch hybrid; architecture density ~50% concentrated on Data Plane Atomicity tenet + MCP-as-proxy reframing + log-as-truth founding premise.
  • 2025-03-18 — 3 powerful connectors for real-time change data capture — Product-altitude tour of Redpanda Connect's four CDC input connectors (postgres_cdc, mysql_cdc, mongodb_cdc, gcp_spanner_cdc), each riding on the source database's native change log: Postgres logical replication + replication slot / MySQL binlog with external offset cache / MongoDB change streams + oplog / Spanner change streams with transactional offset storage and dynamic partition split/merge handling. Canonicalises parallel snapshot of a single large table or collection as the Redpanda differentiator vs stock Debezium: "Debezium (Kafka Connect) does not do this today." Ships the parallel-snapshot capability in the Postgres and MongoDB connectors; MySQL and Spanner connectors don't have it at publication. MySQL CDC topology scope explicitly limited (no GTID, no Group Replication, no multi-source). Second canonical wiki instance of CDC driver ecosystem — from the consumer-side (Redpanda Connect writes drivers against every source database's native CDC API), bracketing the Vitess-VStream emitter-side instance already canonicalised. Tier-3 on-scope on engine-mechanism canonicalisation grounds; architecture density ~60% of a feature-tour post; four new canonical concept pages + one new system page + one sub-concept pattern extension. Enterprise-license-gated in Redpanda Cloud + Self-Managed.
  • 2025-02-11 — High availability deployment: Multi-region stretch clusters — Part four of Redpanda's HA/DR series. Canonicalises the multi-region stretch cluster as the RPO=0 shape (single Redpanda cluster spans regions; per-partition Raft quorum on every write; automatic leader re-election on region loss). Positions it on the consistency-vs-availability axis against MirrorMaker2 async two-cluster replication (non-zero RPO, per-cluster availability). Canonicalises four operator knobs for cross-region cost mitigation: leader pinning (enterprise feature; bias leadership to client-proximal region), acks=1 (producer durability relaxation), follower fetching (KIP-392 closest-replica consume), remote read replica topic (object-storage-backed read-only mirror cluster). Publishes a three-broker Ansible hosts.ini template with region-as-rack (rack=us-west-2, rack=us-east-2, rack=eu-west-2) and an OMB + tc-inter-broker-latency-injection simulation technique for multi-region performance testing without paying cross-region cloud bandwidth. Current limitation: Self-Managed on K8s is multi-AZ only; multi-region stretch is available on VMs / bare metal / cloud compute / Redpanda Cloud.
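The cost of a per-partition Raft quorum on every write, and why leader placement matters, can be illustrated with a toy latency model. This is a sketch under loud assumptions: one replica per region, a write commits once a majority (including the leader) has it, and the round-trip times below are hypothetical figures, not measurements from the post.

```python
from typing import Dict, FrozenSet, List


def quorum_size(replicas: int) -> int:
    """Majority needed for a Raft commit."""
    return replicas // 2 + 1


def commit_latency_ms(
    leader: str, regions: List[str], rtt_ms: Dict[FrozenSet[str], float]
) -> float:
    """Latency until a majority holds the write, with one replica per region."""
    follower_rtts = sorted(
        rtt_ms[frozenset((leader, r))] for r in regions if r != leader
    )
    # The leader's own append counts toward the majority.
    followers_needed = quorum_size(len(regions)) - 1
    return follower_rtts[followers_needed - 1] if followers_needed > 0 else 0.0


regions = ["us-west-2", "us-east-2", "eu-west-2"]
# Hypothetical inter-region round-trip times in milliseconds.
rtt = {
    frozenset(("us-west-2", "us-east-2")): 50.0,
    frozenset(("us-east-2", "eu-west-2")): 90.0,
    frozenset(("us-west-2", "eu-west-2")): 140.0,
}

latency_by_leader = {r: commit_latency_ms(r, regions, rtt) for r in regions}
# With three regions, losing one still leaves a majority of two.
survives_one_region_loss = (len(regions) - 1) >= quorum_size(len(regions))
```

Under these made-up numbers a leader in either US region commits after the 50 ms hop to the other US region, while a leader in `eu-west-2` waits 90 ms, which is the intuition behind biasing leadership toward the client-proximal region.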
  • 2025-01-21 — Implementing the Medallion Architecture with Redpanda — pedagogy-altitude explainer on Databricks' three-tier Bronze/Silver/Gold data-lake pattern, positioning Redpanda's Iceberg topics as the mechanism that makes the streaming broker serve as the Bronze layer of a lakehouse without any external ETL (Airflow / Kafka Connect / Redpanda Connect). Canonicalises concepts/medallion-architecture, concepts/data-lakehouse, concepts/iceberg-topic, concepts/open-file-format on the wiki. Names Flink's Iceberg sink connector as the mechanism for real-time Bronze→Silver→Gold transitions (patterns/stream-processor-for-real-time-medallion-transitions). Tier-3 pedagogy altitude; no production numbers; no compaction-ownership / commit-cadence latency numbers.
  • 2024-12-03 — Redpanda 24.3 extends lakehouses with streaming data & CDC — Redpanda 24.3 release roundup (unsigned, ~1,400 words). Origin-point announcement post for six primitives the wiki has subsequently canonicalised under later-release ingests: (1) Iceberg Topics beta on self-managed Enterprise + Redpanda Cloud BYOC (non-production-only at beta; GA four months later in 25.1); BYOC-first framing and per-topic opt-in model ("The integration works on a per-topic basis") are both present from launch. (2) Mountable Topics — zero-data-loss unmount / mount of unused tiered-storage topics via rpk CLI + Redpanda Cloud API, "even in a new cluster and with a new name". Canonicalised as concepts/mountable-tiered-storage-topic + patterns/hibernate-unused-topics-on-tiered-storage; first wiki source for this primitive. Generalises tiered-storage fast-decommission from broker- to topic-lifecycle altitude. (3) Leader pinning (write-path locality) and follower fetching (read-path locality) announced together as duals — "Leader pinning complements follower fetching, which lowers costs for consumers with geographically-optimized reads." Earliest wiki-visible source for both concepts; preserves the prior 2025-02-11 HA-ingest's mechanism depth as the second citation. Enterprise-licensed from launch on both self-managed Enterprise and Redpanda Cloud BYOC + Dedicated. (4) postgres_cdc beta in Redpanda Connect: "the beginning of a larger CDC effort ... optimized for Redpanda Connect's native Go (vs. Debezium's Java)" — first engine in the family that grew through 2025-2026 to six engines; MySQL flagged as next up verbatim. Earliest wiki source for the Redpanda Connect CDC family. (5) rpk connect --secrets for runtime interpolation of credentials from AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, and Redis — canonicalised as patterns/external-secrets-manager-interpolation (new canonical pattern). 
Removes "different sets of environment variables or embedding credentials in your YAML" failure-modes. Fully-managed console variant previewed but not shipped. (6) Redpanda Migrator offset translation — per-consumer-group offset-translation map for cross-cluster failover between source and target clusters. Canonicalised as concepts/cross-cluster-offset-translation-map (new canonical concept); architectural foil for the later 25.3 Shadowing feature which eliminates the translation map via byte-for-byte offset preservation. The wiki now tracks the full two-point history of the Redpanda cross-cluster-consumer-failover axis. Supporting disclosures: Customer-Managed VNets on Azure extend BYOC customer-network control from AWS + GCP to all three hyperscalers; Azure Marketplace launch (Dedicated clusters, annual commitment, East US / North Europe / UK South regions); 99.99% uptime SLA on multi-zone BYOC + Dedicated; Redpanda Terraform provider public beta for clusters / topics / users / ACLs / networks / resource groups; scheduled maintenance windows; new-AI-processor connectors (Cohere, Amazon Bedrock + Google Cloud Vertex AI embeddings); Bloblang Playground; Ockam + Timeplus partner connectors (Community-tier). 5 new canonical wiki pages: source + 2 systems (systems/redpanda-migrator stub) + 2 concepts (concepts/mountable-tiered-storage-topic, concepts/cross-cluster-offset-translation-map) + 2 patterns (patterns/hibernate-unused-topics-on-tiered-storage, patterns/external-secrets-manager-interpolation). 
Extends 8 pages: concepts/leader-pinning + concepts/follower-fetching + patterns/client-proximal-leader-pinning + patterns/closest-replica-consume (all gain 2024-12-03 as earliest-wiki-visible origin source); concepts/iceberg-topic + systems/redpanda-iceberg-topics (gain origin-point BYOC-first-beta Seen-in entries predating the 2025-01-21 pedagogy post); concepts/change-data-capture + systems/redpanda-connect (gain origin-point first-engine + secrets-manager Seen-in entries); systems/redpanda + systems/redpanda-byoc (gain 24.3-release Seen-in entries). Tier-3 borderline include on origin-point-canonicalisation grounds — vendor-launch voice throughout, zero production numbers / benchmarks / customer case studies beyond the 99.99% SLA claim. Architecture density ~25-30% on a roundup-format body — six of eight disclosed betas are wiki-load-bearing. Caveats: every claim is capability-statement altitude; Mountable Topics unmount/mount protocol not walked; leader-pinning + follower-fetching mechanisms deferred to 2025-02-11; Postgres CDC replication-mode enumeration deferred to 2025-03-18; secrets-manager backend-auth discipline + refresh cadence undisclosed; Migrator offset-translation-map storage substrate not walked. Cross-source continuity: direct precursor to 2025-01-21 Medallion pedagogy (Iceberg Topics elaboration), 2025-02-11 stretch clusters (leader pinning + follower fetching mechanism), 2025-03-18 CDC connectors tour (the full CDC family), 2025-04-07 Iceberg Topics GA, and 2025-11-06 25.3 Shadowing launch (Migrator-as-architectural-foil). First-engine position of the CDC family later extended through 2026-04-09 to six engines. Sibling to 2025-05-13 BYOC Iceberg Topics tutorial on the BYOC deployment-shape axis.
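The cross-cluster offset-translation map named in this entry can be pictured as a sorted list of checkpoints pairing a source-cluster offset with the equivalent target-cluster offset. The sketch below is a generic illustration of that idea, not Redpanda Migrator's data structure (the post does not walk its storage substrate): `translate` and the checkpoint values are hypothetical, and rounding down to the nearest checkpoint is an assumed policy chosen so a failed-over consumer may re-read records but never skips any.

```python
import bisect
from typing import List, Optional, Tuple

# (source_offset, target_offset) pairs, sorted by source offset. Hypothetical values.
Checkpoint = Tuple[int, int]


def translate(checkpoints: List[Checkpoint], source_offset: int) -> Optional[int]:
    """Map a committed source offset to a safe target offset.

    Uses the closest checkpoint at or below the source offset; returns None
    when no checkpoint covers it yet.
    """
    sources = [s for s, _ in checkpoints]
    i = bisect.bisect_right(sources, source_offset) - 1
    if i < 0:
        return None
    return checkpoints[i][1]


group_checkpoints: List[Checkpoint] = [(100, 40), (250, 190), (400, 340)]
```

A consumer group committed at source offset 300 would resume at target offset 190 here. A byte-for-byte offset-preserving design, as the entry says of the later Shadowing feature, makes this lookup unnecessary because source and target offsets are identical.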
  • 2024-11-26 — Batch tuning in Redpanda to optimize performance (part 2) — James Kinley's operations-manual companion to part 1. Canonicalises four Prometheus private metrics (vectorized_storage_log_written_bytes, vectorized_storage_log_batches_written, vectorized_scheduler_queue_length, redpanda_cpu_busy_seconds_total) + five PromQL one-liners + the 4 KB NVMe-alignment batch-size floor + the write-caching broker feature + a real customer case study showing p99 128 ms → 17 ms and 2-cluster → 1-cluster consolidation at ~2.2× per-cluster throughput.
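One way to use the two storage counters named in this entry is to divide the growth in bytes written by the growth in batches written over the same interval, giving an average batch size that can be compared with the 4 KB NVMe-alignment floor. This is a sketch of that arithmetic, assuming the counters are monotonically increasing; it is not one of the post's PromQL one-liners, and the function names and sample values are hypothetical.

```python
from typing import Optional

NVME_ALIGNMENT_BYTES = 4096  # the 4 KB floor cited in the entry


def average_batch_bytes(
    written_bytes_start: float,
    written_bytes_end: float,
    batches_start: float,
    batches_end: float,
) -> Optional[float]:
    """Average batch size over an interval, from two counter samples each.

    Inputs correspond to samples of vectorized_storage_log_written_bytes and
    vectorized_storage_log_batches_written taken at the same two instants.
    """
    batches = batches_end - batches_start
    if batches <= 0:
        return None  # no batches written in the interval
    return (written_bytes_end - written_bytes_start) / batches


def below_alignment_floor(avg_bytes: Optional[float]) -> bool:
    return avg_bytes is not None and avg_bytes < NVME_ALIGNMENT_BYTES


small = average_batch_bytes(0, 1_024_000, 0, 1000)    # 1,024-byte batches
large = average_batch_bytes(0, 16_384_000, 0, 1000)   # 16,384-byte batches
```

Batches averaging well under 4 KB are the signal that producer-side batching settings deserve a look.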
  • 2024-11-19 — Batch tuning in Redpanda for optimized performance (part 1) — James Kinley's first-principles explainer on producer-side batching. Canonicalises the fixed-vs-variable request-cost framing, the linger.ms / batch.size / buffer.memory trigger logic, and the seven-factor effective-batch-size framework.
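The trigger logic and the fixed-vs-variable cost framing in this entry can be sketched together. The simulation below is a simplified model, not client code: a batch closes when `linger_ms` has elapsed since its first record or when adding another record would exceed `batch_size_bytes`, all records are assumed to be the same size, and `buffer.memory` back-pressure is not modelled. The cost figures are arbitrary units chosen for illustration.

```python
from typing import List


def simulate_batches(
    arrival_ms: List[int], record_bytes: int, linger_ms: int, batch_size_bytes: int
) -> List[int]:
    """Return the record count of each batch a producer would send."""
    batches: List[int] = []
    current = 0       # records in the open batch
    opened_at = 0     # arrival time of the open batch's first record
    for t in arrival_ms:
        linger_expired = current > 0 and t - opened_at >= linger_ms
        would_overflow = (current + 1) * record_bytes > batch_size_bytes
        if current > 0 and (linger_expired or would_overflow):
            batches.append(current)
            current = 0
        if current == 0:
            opened_at = t
        current += 1
    if current:
        batches.append(current)
    return batches


def per_record_cost(fixed_per_request: float, variable_per_record: float, n: int) -> float:
    """Fixed request cost amortised over n records, plus the per-record cost."""
    return fixed_per_request / n + variable_per_record


arrivals = list(range(100))  # one 1,000-byte record every millisecond
no_linger = simulate_batches(arrivals, 1000, linger_ms=0, batch_size_bytes=16384)
short_linger = simulate_batches(arrivals, 1000, linger_ms=5, batch_size_bytes=16384)
size_bound = simulate_batches(arrivals, 1000, linger_ms=1000, batch_size_bytes=16384)
```

With no linger every record ships alone; a 5 ms linger groups five records per request; with a long linger the size limit takes over at sixteen records. `per_record_cost(100.0, 2.0, 1)` versus `per_record_cost(100.0, 2.0, 5)` shows the fixed request cost shrinking from 100 to 20 units per record as the batch grows.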
Last updated · 542 distilled / 1,571 read