
Redpanda

Redpanda (originally Vectorized, rebranded 2021) is a streaming-platform company whose flagship product is a C++ rewrite of a Kafka-API-compatible broker built on the thread-per-core Seastar framework. The company blog (redpanda.com/blog) covers a mix of product announcements, benchmarks, tutorials, and occasional first-principles technical explainers on streaming-broker internals.

Tier classification

Tier 3 on the sysdesign-wiki. The blog is a mix of:

  • Product PR and launch announcements ("Redpanda 24.3 extends...", "Announcing Redpanda Cloud") — skip unless they disclose real architectural content.
  • Consultative / industry tutorials ("What is real-time data processing", "Create a real-time analytics pipeline") — skip unless they cover distributed-systems internals, scaling trade-offs, or production incidents.
  • First-principles substrate explainers (e.g. the batch-tuning series by James Kinley) — ingest; these are the Kafka-API substrate posts that the wiki's Redpanda + Kafka coverage depends on.
  • Company-culture / hackathon / sales posts — skip.

Apply the generic tier-3 filter: skip unless the post explicitly covers distributed-systems internals, scaling trade-offs, infrastructure architecture, production incidents, storage / networking / streaming design.

Key systems

Key patterns / concepts

Recent articles

  • 2026-04-14 — Openclaw is not for enterprise scale — Redpanda unsigned rhetorical-voice governance essay (~1,200 words) arguing that Claude-Code-class local coding agents ("Openclaw" category stand-in) work for personal dev laptops but fail at enterprise scale because the sandbox doesn't solve the underlying credential-holding, audit, and egress-control problems. Opens with a HackerNews comment re-framing the sandbox-for-agents problem as "giving your dog a stack of important documents, then being worried he might eat them, so you put the dog in a crate, together with the documents" — a memorable framing the post carries through as its architectural thesis. Load-bearing canonicalisation: the closing formula Gateway + Audit trail + Token vault + Sandboxed compute = Agents in production as the minimum architectural bar for enterprise agent deployment. Each component solves a failure mode the others can't: (1) Gateway (central proxy choke point) — single choke point for agent egress, observability, rate limits, kill switch — "turn it off for a single service or set of services for your entire digital workforce at once". (2) Audit log + transcripts — "why and how, not just what", with "inputs, outputs, tool calls, token usage, and the agent's reasoning chain" captured; adds agentic performance review as a new use case for the durable event log audit envelope. (3) Token vault (new canonical concept) — out-of-band credential broker that mints short-lived scoped tokens per operation. The agent never holds the credentials; "Don't give the dog your keys." Canonical OBO substrate for user-auth-only systems (Salesforce, ServiceNow) — "You can't build a real multi-tenant agent without this." (4) Sandboxed compute with gateway-only egress (new canonical pattern) — sandboxes are "right" (LLMs need Unix composability for tool-output post-processing) provided egress is choke-pointed at the gateway and auth comes from out-of-band agent-identity metadata, not files inside the sandbox. 
Redpanda-specific mechanism: agi CLI (new canonical system) — "agentic gateway interface", a dynamic self-describing CLI inside the sandbox that mediates agent→gateway calls while preserving Unix-workflow composability. "Yes, the name is a play on that AGI." First wiki mention; demonstration-altitude, not shipping product. Threat-model-at-scale argument: "If you're a developer running it on a dedicated machine with limited access and scope, the threat model is manageable [...] The problem shows up when organizations try to scale that model. When the IT team decides 'just run it in a VM' for each department. When someone decides the sandbox is sufficient governance for production use. It isn't." Canonicalises sandbox-adequate-for-personal-use-breaks-at-enterprise-scale as the structural argument for the four-component stack. 4 new canonical pages: concepts/token-vault + patterns/four-component-agent-production-stack + patterns/agent-sandbox-with-gateway-only-egress + systems/redpanda-agi-cli. Extends 6 pages: patterns/central-proxy-choke-point (kill-switch added as canonical choke-point capability; agent-workforce-scale instance added); patterns/agentic-access-control ("Don't give the dog your keys" framing + token-vault substrate reinforcement); patterns/on-behalf-of-agent-authorization (token-vault named as OBO substrate for user-auth-only systems); patterns/durable-event-log-as-agent-audit-envelope (transcripts + A/B agent evaluation as new use cases); concepts/audit-trail (transcripts + reasoning-chain as why-and-how audit shape); concepts/short-lived-credential-auth (per-operation minting canonicalised via token-vault). Tier-3 borderline include on pattern-crystallisation + new-system grounds — zero production numbers, zero mechanism depth on the four components, but crystallises prior governance patterns into a quotable architectural formula and introduces the agi CLI as a distinct system. 
Cross-source continuity: sequel to 2025-10-28 ADP launch + companion governance-framing post; safety-side companion to 2025-04-03 Gallego autonomy essay; sibling to 2026-02-10 Akidau talk-recap (four-component stack compresses six of Akidau's eight axes). Caveats: rhetorical-voice essay not architecture deep-dive; "Openclaw" is a product-family stand-in (not a real product, myclaw.ai is a rhetorical placeholder); token-vault protocol / software not named; agi CLI is a "demonstration", no repo / license / availability; kill-switch trigger UX not walked; sandbox escape + prompt injection explicitly out of scope.
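The gateway + token-vault relationship the post canonicalises can be reduced to a minimal sketch. This is an illustrative Python sketch, not Redpanda's implementation: the names (TokenVault, Gateway, mint) and the kill-switch flag are hypothetical, and the "mint" step stands in for a real IdP / on-behalf-of token exchange.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedToken:
    value: str         # opaque short-lived token handed to the downstream call
    scope: str         # the single operation this token is valid for
    expires_at: float  # hard expiry; tokens are minted per operation


class TokenVault:
    """Out-of-band credential broker: holds the long-lived secret and mints
    short-lived, narrowly scoped tokens on demand. The agent never sees the
    underlying credential ("Don't give the dog your keys")."""

    def __init__(self, long_lived_secret, ttl_seconds=60.0):
        self._secret = long_lived_secret  # never leaves the vault
        self._ttl = ttl_seconds

    def mint(self, agent_id, scope):
        # A real broker would perform an IdP / on-behalf-of exchange here;
        # an opaque random value stands in.
        return ScopedToken(
            value=secrets.token_urlsafe(16),
            scope=scope,
            expires_at=time.time() + self._ttl,
        )


class Gateway:
    """Single egress choke point: every agent call passes through here, so
    audit capture and a kill switch live in one place."""

    def __init__(self, vault):
        self._vault = vault
        self.enabled = True   # kill switch for the whole digital workforce
        self.audit_log = []   # "why and how, not just what" is captured here

    def call(self, agent_id, scope, operation):
        if not self.enabled:
            raise PermissionError("gateway disabled (kill switch)")
        token = self._vault.mint(agent_id, scope)  # agent never holds creds
        self.audit_log.append({"agent": agent_id, "scope": scope})
        return operation(token)  # token's useful lifetime is one operation
```

The point of the shape: disabling `Gateway.enabled` cuts off every agent at once, and the long-lived secret never crosses into agent-held state.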
  • 2026-04-09 — Oracle CDC now available in Redpanda Connect — Redpanda unsigned launch post (~900 words) announcing the oracledb_cdc input in Redpanda Connect v4.83.0 (enterprise-gated). Adds Oracle as the sixth source-database engine in Redpanda's per-engine CDC family (Postgres / MySQL / MongoDB / Spanner / MSSQL / Oracle). Four load-bearing architectural disclosures: (1) rides on Oracle LogMiner — the Oracle Enterprise Edition redo-log-mining utility — canonicalised as concepts/oracle-logminer-cdc, sibling to Postgres logical replication / MySQL binlog / MongoDB oplog / Spanner change streams / SQL Server change tables. No additional Oracle licensing required beyond Enterprise Edition. (2) In-source checkpointing — "Restarts resume from a checkpointed position stored in Oracle itself, with no external cache required, no re-snapshot, and no gaps." Fourth canonical offset-durability class on the wiki alongside server-owned Postgres slots, consumer-owned external stores (MySQL / MongoDB), and transactional-row storage (Spanner). Oracle and Spanner both live inside the source DB but differ on atomicity with data — Spanner's progress commits transactionally with each row, Oracle's lives in a separate checkpoint table. (3) Precision-aware NUMBER mapping via Oracle's ALL_TAB_COLUMNS data-dictionary view — integers from NUMBER(p, 0) → int64, decimals from NUMBER(p, s) with s > 0 → json.Number. Composed with schema_registry_encode for typed Avro encoding in Schema Registry. Automatic mid-stream schema-drift detection: new columns detected automatically; dropped columns reflected after connector restart. Canonical seventh schema-evolution axis on the wiki (contrast the 2026-03-05 Iceberg-output registry-less axis — this one is registry-with-data-dictionary-as-source-of-truth). (4) Oracle Wallet auth — canonical first wiki instance of file-based credential store. 
Two wallet formats: cwallet.sso (auto-login, no password) and ewallet.p12 (PKCS#12, password via wallet_password config field which is redacted from logs and config dumps). SSL enabled automatically. Second canonical instance of Bloblang-interpolated multi-table routing — now at the CDC-source-to-topic-per-table position (first instance was the 2026-03-05 Iceberg-output sink-side). topic: ${! meta("table_name").lowercase() }. Competitive framing against Debezium on Kafka Connect verbatim: "No JVM, no Kafka Connect cluster, no separate workers. Just Redpanda Connect doing what it does best." 8 canonical new pages: source + 4 systems (redpanda-connect-oracle-cdc, oracle-database, oracle-logminer, oracle-wallet) + 4 concepts (oracle-logminer-cdc, in-source-cdc-checkpointing, precision-aware-type-mapping, file-based-credential-store). Extends 7 pages: concepts/change-data-capture (sixth engine + fourth offset-durability class), concepts/external-offset-store (fourth row added in comparison table), concepts/schema-evolution (seventh axis), patterns/cdc-driver-ecosystem (ecosystem now six engines), patterns/bloblang-interpolated-multi-table-routing (second instance at CDC-source position), systems/redpanda-connect (new Oracle CDC section + Seen-in), systems/debezium (named competitive foil). Tier-3 borderline include on vocabulary-canonicalisation grounds — fills gaps the prior five-engine CDC ingests left open. Zero production numbers (no throughput / latency / snapshot-duration figures; contrast 2025-11-06 MSSQL launch which disclosed ~40 MB/s vs ~14.5 MB/s). 
Undisclosed: LogMiner operational caveats (supplemental logging, archive-log rate, continuous-mining deprecation, primary-overhead); snapshot-boundary SCN mechanism; checkpoint-table name/schema/write-cadence; Oracle topology scope (RAC, Data Guard, Multitenant, Standard Edition); parallel-snapshot-of-large-table claim absent (vs 2025-03-18 Postgres + MongoDB differentiator); LOB / LONG / XMLTYPE / JSON-column handling; UPDATE/DELETE before-after-image semantics. Cross-source continuity: sixth-engine extension of the 2025-03-18 CDC connectors post + 2025-11-06 25.3 MSSQL launch; auth/compliance-substrate companion to the 2025-05-20 FIPS post and 2026-03-05 Iceberg-output OAuth2 canonicalisation.
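The precision-aware NUMBER mapping the post describes can be illustrated with a small sketch. Python stands in for the Go connector here: int and Decimal are analogues of int64 and json.Number, the function name is hypothetical, and in the real connector precision/scale come from the ALL_TAB_COLUMNS data-dictionary view.

```python
from decimal import Decimal


def map_oracle_number(precision, scale, raw):
    """Map an Oracle NUMBER(precision, scale) value (received as a string)
    to a typed value: NUMBER(p, 0) -> 64-bit integer, NUMBER(p, s > 0) ->
    exact decimal, never a lossy float."""
    if scale == 0:
        value = int(raw)
        # json-friendly int64 range check, mirroring the int64 target type
        if not -2**63 <= value < 2**63:
            raise OverflowError("NUMBER(%d,0) value exceeds int64" % precision)
        return value
    return Decimal(raw)  # arbitrary-precision decimal for fractional scales
```

Keeping fractional values as exact decimals rather than floats is what makes the downstream typed Avro encoding (via schema_registry_encode) lossless.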

  • 2026-04-02 — Supercharging Redpanda Streaming with profile-guided optimization — Redpanda engineering deep-dive (unsigned). Mechanism-level companion to the 2026-03-31 Redpanda 26.1 launch post's one-line PGO disclosure ("Profile-Guided Optimization (PGO) delivers 10-15% efficiency improvement on small message batches"). Unpacks the clang PGO two-phase compilation and LLVM BOLT post-link alternative (systems/llvm-bolt), framed by top-down microarchitecture analysis (TMA) via Linux perf stat --topdown. Measured wins on the canonical small-batch regression benchmark: ~50% p50 latency, up to 47% p999 latency, 15% CPU reactor utilisation reduction. TMA data verbatim: baseline 51% frontend-bound ("definitely on the higher end, even for database or distributed applications") reduced to 37.9% after PGO — with 6 percentage points moving to retiring (useful work) and 7 to backend-bound ("resolving one bottleneck often reveals the next"). PGO mechanisms: hot-cold splitting + basic-block reordering + profile-driven inlining, all targeting [[concepts/instruction-cache-locality|i-cache locality]]. BOLT heatmap visualisation confirms the hot-code-packed layout ("all hot functions are packed tightly at the start of the binary"). PGO vs BOLT trade-off: Redpanda evaluated both and chose PGO citing stability — "PGO is a proven and widely deployed technology ... outstanding BOLT bugs, we decided to stick with PGO." Disclosed LLVM bug llvm-project#169899 as the decisive datum — first wiki-canonical non-Meta BOLT brittleness disclosure, contrasting Meta's fleet-scale success via Strobelight → BOLT + CSSPGO. BOLT performance "similar to PGO. Most of the time, it came in just slightly behind"; combining both adds "another small bump in performance". Substrate: feedback-directed optimisation (FDO) family — canonicalised as umbrella. Instrumented vs sampling profile trade-off canonicalised. Composes with batching-under-saturation to explain the 15%-CPU → 47%-p999 amplification. 
Tier-3 on-scope decisively — unusual for Redpanda's launch-/marketing-heavy Tier-3 corpus; a genuine engineering deep-dive with microarchitecture rigor, hardware-counter before/after data, and an explicit PGO-vs-BOLT trade-off analysis that discloses a concrete LLVM bug. Cross-source continuity: mechanism-level companion to the 2026-03-31 26.1 launch post's one-line PGO bullet; extends BOLT coverage from Meta's fleet success (2025-03-07 Strobelight post) to the non-Meta brittleness perspective; sibling to patterns/measurement-driven-micro-optimization at the C++ binary-layout altitude (JVM / JDK-Vector-API sibling at the Java-vectorisation altitude). New canonical pages (source + 8 concepts [PGO, LLVM BOLT post-link optimiser, TMA, frontend-vs-backend-bound, hot-cold splitting, instrumented vs sampling profile, i-cache locality, feedback-directed optimisation] + 3 patterns [PGO for frontend-bound application, TMA-guided target selection, feedback-directed optimisation fleet pipeline] + 2 systems [LLVM BOLT, Clang]) + 2 extensions (systems/meta-bolt-binary-optimizer with non-Meta brittleness disclosure; systems/redpanda with new 26.1 PGO section).

  • 2026-03-30 — Under the hood: Redpanda Cloud Topics architecture — architecture deep-dive on Cloud Topics following its GA in Redpanda Streaming 26.1. First detailed public description of the five primitives that make Cloud Topics work: a Cloud Topics Subsystem that batches in-memory across all partitions/topics ("e.g., 0.25 seconds or 4 MB"), an L0 file uploaded as a single PUT to S3/GCS/ADLS, a placeholder batch replicated via Raft to each involved partition's log carrying only the object-storage pointer, a background Reconciler that rewrites L0 files into L1 files (per-partition, offset-sorted, much larger), and a per-partition Last Reconciled Offset watermark routing reads between L0 and L1. Three new canonical concept pages: concepts/placeholder-batch-metadata-in-raft, concepts/l0-l1-file-compaction-for-object-store-streaming, concepts/last-reconciled-offset. Two new canonical pattern pages: patterns/object-store-batched-write-with-raft-metadata, patterns/background-reconciler-for-read-path-optimization. Architectural canonicalisation: the log-as-truth framing, previously applied at agent-interaction altitude (2025-10-28), is now instantiated inside the broker's own storage architecture — the Raft log of pointers is truth, S3 bytes are addressable cache. Caveats: no absolute latency numbers, no net-cost quantification (eliminated cross-AZ cost replaced by PUT cost + Reconciler egress), no Reconciler placement disclosure, no failure-mode discussion, no cache architecture.
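The L0 upload trigger and the LRO-routed read path described above reduce to two small predicates. A minimal sketch under the post's example thresholds (0.25 seconds or 4 MB); the names (Partition, route_read, should_upload_l0) are illustrative, not Redpanda's code:

```python
from dataclasses import dataclass


@dataclass
class Partition:
    last_reconciled_offset: int  # per-partition LRO watermark


def route_read(partition, offset):
    """Offsets at or below the LRO have been rewritten by the Reconciler
    into per-partition, offset-sorted L1 files; newer offsets still live
    only in the batched cross-partition L0 files."""
    return "L1" if offset <= partition.last_reconciled_offset else "L0"


def should_upload_l0(buffered_bytes, elapsed_seconds,
                     max_bytes=4 * 2**20, max_wait_seconds=0.25):
    """L0 upload trigger: flush the in-memory cross-partition batch when
    either bound is hit, then PUT it to object storage as one L0 file."""
    return buffered_bytes >= max_bytes or elapsed_seconds >= max_wait_seconds
```

The Raft log carries only the object-storage pointer for each uploaded batch, which is why the watermark can route reads without any per-read metadata lookup beyond the partition's own log.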

  • 2026-03-05 — Introducing Iceberg output for Redpanda Connect — Redpanda unsigned launch post (~1,000 words) announcing the iceberg output connector for Redpanda Connect shipped in v4.80.0 (enterprise-gated). A declarative sink that writes streaming data to Apache Iceberg tables from a YAML pipeline via the Iceberg REST Catalog API. Positioned as the non-Kafka-source companion to the pre-existing broker-native Iceberg Topics feature — fills the gap for HTTP webhooks, Postgres CDC, GCP Pub/Sub, and other non-Kafka sources that need in-stream transformation (PII stripping, flattening, type routing) before landing in the lakehouse. Three architectural canonicalisations: (1) concepts/registry-less-schema-evolution — infers table schema from raw JSON; no Schema Registry required; verbatim "best of both worlds" between chained SMT brittleness and all-string dirty-data tables. Adds sixth axis to concepts/schema-evolution. (2) concepts/data-driven-flushing — flush only when data is present; inverts Kafka-Connect-era timer-driven default. Mitigates the concepts/small-file-problem-on-object-storage and quiet-source compute waste. (3) patterns/bloblang-interpolated-multi-table-routing — table and namespace config fields support Bloblang interpolation ('events_${!this.event_type}'). One pipeline → N tables. Canonical inversion of "configuration hell". Plus one new architectural pattern: patterns/sink-connector-as-complement-to-broker-native-integration — explicit two-shape comparison table against Iceberg Topics ("Zero-ETL convenience vs Integration flexibility") — the two paths are complementary, not competing. REST-catalog integration matrix: Polaris, systems/aws-glue, systems/unity-catalog, systems/google-biglake, Snowflake Open Catalog. OAuth2 token exchange + per-tenant REST catalog isolation at 0.1 vCPU per-pipeline density. 
Scope limits (v4.80.0): append-only only (upserts on roadmap — material for CDC UPDATE/DELETE); schema-inference mechanism depth undisclosed; no benchmarks; enterprise-gated license. Tier-3 borderline include as lean ingest on vocabulary-canonicalisation grounds — 4 new concepts (registry-less-schema-evolution, data-driven-flushing, small-file-problem-on-object-storage, bloblang) + 2 new patterns (bloblang-interpolated-multi-table-routing, sink-connector-as-complement-to-broker-native-integration) + 2 new systems (redpanda-connect-iceberg-output, apache-polaris stub) fill definitional gaps. 8 canonical new pages: source + 2 systems + 4 concepts + 2 patterns. Extends 7 pages: systems/redpanda-connect (new Iceberg output section), systems/redpanda-iceberg-topics (new sink-connector-complement Seen-in entry), concepts/schema-evolution (sixth axis entry), concepts/iceberg-catalog-rest-sync (REST catalog as sink-connector integration surface Seen-in entry), patterns/streaming-broker-as-lakehouse-bronze-sink (sink-connector-altitude variant Seen-in entry), patterns/broker-native-iceberg-catalog-registration (sink-connector counterpart Seen-in entry), companies/redpanda (this page). No existing-claim contradictions — strictly additive.
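Data-driven flushing inverts a timer-driven default in a way a few lines make concrete. A minimal Python sketch, assuming illustrative thresholds; the class and method names are hypothetical, not the connector's API:

```python
import time


class DataDrivenFlusher:
    """Sketch of data-driven flushing: the flush clock only starts when
    data arrives, so a quiet source never produces empty flushes or idle
    wakeups (a fixed-interval timer would fire regardless)."""

    def __init__(self, max_records=1000, max_wait_seconds=30.0):
        self.buffer = []
        self._first_arrival = None  # set when the first record lands
        self._max_records = max_records
        self._max_wait_seconds = max_wait_seconds

    def add(self, record):
        if not self.buffer:
            self._first_arrival = time.monotonic()  # clock starts with data
        self.buffer.append(record)

    def should_flush(self):
        if not self.buffer:  # no data -> never flush
            return False
        return (len(self.buffer) >= self._max_records
                or time.monotonic() - self._first_arrival >= self._max_wait_seconds)

    def flush(self):
        out, self.buffer, self._first_arrival = self.buffer, [], None
        return out
```

Because empty flushes never happen, the sink cannot generate the tiny objects that cause the small-file problem on object storage during quiet periods.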

  • 2026-02-10 — How to safely deploy agentic AI in the enterprise — Tyler Akidau talk-recap (Redpanda CTO, originator of Google Dataflow / Apache Beam) from Dragonfly's Modern Data Infrastructure Summit. Marketing-adjacent reprise of the 2025-10-28 ADP launch framing, ~3.5 months later, aimed at a lay enterprise-architect audience. Two load-bearing canonicalisations: (1) D&D alignment framing — human workers hired into the lawful-good quadrant; AI agents default to the chaotic column ("at best 'chaotic good' — because you don't know what you don't know"); governance + auditing infrastructure is the mechanism that moves agents leftward toward lawful. (2) Eight-axis enterprise-agent-infrastructure checklist — context building / maintenance / context querying / authentication / governance / auditing / replay and validation / routing / multi-agent coordination. Akidau's load-bearing claim: six of eight are streaming problems (context querying + authentication stay outside streaming's remit). Two new canonical patterns: patterns/dynamic-routing-llm-selective-use (use AI where it wins, route to cheaper ML/heuristics otherwise — fraud-detection worked example: ML scans ~99% normal traffic, LLM investigates the ~1% flagged cases) + patterns/multi-agent-streaming-coordination (streaming broker as decoupled coordination substrate for multi-agent systems; inherits decoupled-services + durability + fan-in + fan-out from the microservices-over-Kafka lineage). Agent-anatomy-=-streaming-platform-anatomy framing extends concepts/streaming-as-agile-data-platform-backbone to the agent-substrate altitude. Metadata-only-audit-insufficient framing extends patterns/durable-event-log-as-agent-audit-envelope — classical systems audit logs byte-count + timestamp metadata, but agents require full-input + full-output capture to make inferences. Closing honest caveat: "streaming can help solve a lot of agentic AI challenges, it's not your answer for everything. You still need authN/authZ, a multi-modal catalog of contextual data (not just streaming data), querying, and a durable execution for workflows". Tier-3 borderline include on rhetorical-framing + eight-axis-enumeration + two-new-patterns grounds. 5 canonical new pages: source + 2 concepts + 2 patterns. Extends 8 pages: concepts/autonomy-enterprise-agents + concepts/streaming-as-agile-data-platform-backbone + concepts/governed-agent-data-access + patterns/durable-event-log-as-agent-audit-envelope + patterns/cdc-fanout-single-stream-to-many-consumers + patterns/snapshot-replay-agent-evaluation + concepts/audit-trail + systems/redpanda-agentic-data-plane. 
Cross-source continuity: talk-recap companion to the 2025-10-28 ADP launch pair (Gallego productization + governance-pattern naming); sibling to the 2025-06-24 streaming-backbone essay (data-substrate half; this Akidau post extends to the agent-substrate half); risk-side dual of the Gallego autonomy essay (capability side). First wiki footprint for Akidau as a Redpanda-era talk speaker (prior Akidau work on the wiki is via Dataflow / Beam / MillWheel streaming-model primitives).
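The dynamic-routing pattern from the fraud-detection worked example fits in a few lines. A hedged sketch: ml_score and llm_investigate are hypothetical callables standing in for a cheap ML model and an expensive LLM investigator, and the threshold is illustrative:

```python
def route_event(event, ml_score, llm_investigate, flag_threshold=0.99):
    """Cheap ML scorer screens every event; only events whose score crosses
    the flag threshold (the ~1% in the fraud example) escalate to the
    expensive LLM investigator."""
    score = ml_score(event)  # cheap path: runs on ~100% of traffic
    if score < flag_threshold:
        return {"path": "ml", "verdict": "normal", "score": score}
    return {"path": "llm", "verdict": llm_investigate(event), "score": score}
```

The economic argument is in the call counts: the LLM is invoked only on flagged events, so its per-call cost is amortised over the whole stream.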
  • 2026-01-27 — Engineering Den: Query manager implementation demo — First post in Redpanda's new Engineering Den series; ~600-word post-acquisition disclosure from the Oxla team on their rewrite of the query manager — "the component responsible for the lifecycle of currently-running queries". The old manager suffered from ambiguous state (queries stuck in finished or executing while still holding resources; different parts of the system disagreed about what was happening) and a pathological cancellation path — canonicalised verbatim as the async-cancellation-thread-spawn anti-pattern: "To avoid deadlocks, the old code gathered running queries, spawned async work per thread, and sometimes had to retry cancellation from a different thread entirely." Rebuilt as a deterministic state machine with every transition logged and explicit teardown at terminal states. Verbatim core claim: "The new scheduler is built as a deterministic state machine. At any point, it's in a known state, handling a specific event, and transitioning predictably. Every transition is logged." Composed pattern canonicalised as patterns/state-machine-as-query-lifecycle-manager. Tested on ~25,000 queries across 1- and 3-node clusters without reproducing the prior pathologies; no throughput / latency numbers — reliability-first validation frame. Debuggability payoff verbatim: "Bugs still happened ... but they were much easier to track down. Being able to trace state transitions made fixes straightforward instead of exploratory." — issues "fixed in days instead of weeks". Production rollout "within days" of the post. 6 new canonical wiki pages: source + 5 concepts (concepts/deterministic-state-machine-for-lifecycle, concepts/state-transition-logging, concepts/query-lifecycle-manager, concepts/async-cancellation-thread-spawn-antipattern, concepts/explicit-teardown-on-completion) + 1 pattern (patterns/state-machine-as-query-lifecycle-manager). 
Extends systems/oxla with first post-acquisition mechanism disclosure (prior Oxla canonicalisation was acquisition-framing from 2025-10-28 ADP launch). Tier-3 borderline include on first-post-acquisition-Oxla-internals-disclosure grounds + reliability-doctrine-canonicalisation grounds — short engineering-diary voice with no state diagram, no code snippets, no benchmark depth. Caveats: "deterministic" claimed not shown (no TLA+ / model-check); cancellation protocol not fully detailed; 25K-query sample is modest (1- and 3-node clusters only); failure-modes of the new manager not enumerated; series kickoff promises more depth in future Den posts. No existing-claim contradictions — strictly additive on Oxla's wiki page. First canonical wiki use of the "state machine as lifecycle manager" pattern at query-engine altitude; related-but-distinct instances already exist at consensus-request altitude (concepts/two-phase-completion-protocol) and workflow altitude (concepts/fault-tolerant-long-running-workflow).
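The post shows no code or state diagram, so the shape it describes can only be sketched under assumptions: the states and events below are hypothetical, chosen to illustrate known-state + logged-transition + explicit-teardown, not Oxla's actual lifecycle.

```python
# Legal transitions for a hypothetical query lifecycle: the machine is
# always in a known state, handles one event at a time, and every
# transition is logged; teardown runs exactly once, at terminal states.
TRANSITIONS = {
    ("pending",   "start"):    "executing",
    ("pending",   "cancel"):   "cancelled",
    ("executing", "complete"): "finished",
    ("executing", "cancel"):   "cancelled",
}
TERMINAL = {"finished", "cancelled"}


class QueryLifecycle:
    def __init__(self, query_id):
        self.query_id = query_id
        self.state = "pending"
        self.log = []  # traceable transition history for debugging

    def handle(self, event):
        nxt = TRANSITIONS.get((self.state, event))
        if nxt is None:
            # ambiguous states are impossible: illegal events fail loudly
            raise ValueError("illegal event %r in state %r" % (event, self.state))
        self.log.append("%s: %s --%s--> %s" % (self.query_id, self.state, event, nxt))
        self.state = nxt
        if nxt in TERMINAL:
            self._teardown()  # resources released exactly once, here
        return nxt

    def _teardown(self):
        self.log.append("%s: teardown" % self.query_id)
```

Contrast with the anti-pattern: cancellation here is just another event handled in the machine's single-threaded step, not async work spawned per thread and retried from elsewhere.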
  • 2026-01-13 — The convergence of AI and data streaming, Part 1: The coming brick walls — Peter Corless industry-commentary post (~2,100 words) adapted from the author's AI-by-the-Bay talk. Part 1 of a four-part series; promises Parts 2-4 on adaptive LLM strategies, AI observability/evaluation, and real-time streaming + AI respectively. Names three "brick walls" for frontier AI: (1) ethically-sourced public training data exhaustion (Epoch AI 2024 S-curve thesis; petabyte ceiling vs zettabyte-scale global data production; 180 ZB generated / 200 ZB stored in 2025, CAGR 78%; Yottabyte Era projected 2028-2030); (2) training-cost growth (~260% annually, projected >$1B per frontier model by 2027 per Epoch AI, data-centre energy 2× by 2030 per Nature April 2025); (3) batch-training boundary ("regardless of their dense or MoE architectures, they're still all batch trained"). MoE vs Dense frontier-LLM landscape with concrete parameter-count disclosures: GPT-4 = 8 × 220B (George Hotz 2023 leak), Gemini MoE since 1.5 (Feb 2024), Grok MoE since Grok-1, Anthropic Claude = Dense Transformer holdout. GPT-1 → GPT-5 scaling curve: 117M → ~50T parameters = 5 orders of magnitude in 8 years; GPT-5 400K-token context window. Brick-wall-companion observations: embedding-dimension diminishing returns past 1,536 dims (cites Supabase pgvector); model drift over time verbatim "each answer is a special snowflake, and those snowflakes can melt over time" — cites arXiv 2307.09009 + GPT-5.1 < GPT-5.0 regression on some evals; RLHF as offline batch fine-tuning pipeline (cites arXiv 2307.15217). Names RAG + MCP as the two inference-time real-time-data access mechanisms that do not cross the batch-training boundary. Frames the data scientists vs data engineers organisational silo (cites Jesse Anderson's Data Teams) as the socio-technical pre-requisite to architectural convergence. Running gag: the "d20 test" image-generation prompt as a hallucination-failure-mode evaluation opener — only Gemini 3.0 Thinking passed (inconsistently); ChatGPT 5.x, Midjourney, Meta AI, Grok, Claude, Google Veo all fail. Also cites $1.5T global AI spend in 2025. New canonical wiki pages: source + 8 concepts (concepts/frontier-model-batch-training-boundary, concepts/llm-training-data-exhaustion, concepts/llm-model-drift, concepts/dense-transformer, concepts/rlhf-offline-batch, concepts/s-curve-limits, concepts/embedding-dimension-diminishing-returns, concepts/retrieval-augmented-generation). Extends concepts/mixture-of-experts (new Frontier-LLM MoE landscape section with GPT-4 8×220B + Gemini + Grok + Claude disclosures), concepts/llm-hallucination (new Seen-in for d20-test framing + hallucination-orthogonal-to-scaling claim), systems/transformer (new Seen-in with 117M→50T scaling curve + 400K-token context + MoE/Dense variant landscape), plus this page. Tier-3 borderline include on vocabulary-canonicalisation grounds — industry-commentary voice, no production numbers from a shipping Redpanda system, streaming-specific payoff explicitly deferred to Parts 2-4. Passes on the frontier-LLM vocabulary (batch-training boundary + data exhaustion + MoE landscape + model drift + RLHF-as-batch) being genuinely missing from prior wiki coverage; canonicalises framing the wiki will compose subsequent ingests (Parts 2-4) against. Companion to the 2025-06-24 streaming-backbone essay — the data-substrate framing; this post is the why-frontier-models-need-it framing at industry altitude. Companion to the Gallego 2025-04-03 autonomy essay and the 2025-10-28 ADP launch as the agent-substrate framing. 
Caveats: Hearsay primary sources (Hotz-leak GPT-4 numbers, "estimated" GPT-5 parameter counts); Epoch AI projections are interpretive; embedding-dimension ceiling single-sourced to a Supabase post; arXiv 2307.09009 drift magnitude is contested in the literature; private-data ethics transition narrated not structurally analysed; MoE landscape omits Mixtral / DeepSeek / Qwen / Llama-MoE; RLHF mechanism not walked; d20 test is a conversation-opener gag not a rigorous eval. Series Parts 2-4 deferred.

  • 2025-12-09 — Streaming IoT and event data into Snowflake and ClickHouse — Unsigned vendor-tutorial post (~2,400 words) framing a reference IoT pipeline: Redpanda + Redpanda Connect feeding both Snowflake (short-term real-time) and ClickHouse (long-term columnar archive) simultaneously. Marketing voice with heavy Redpanda product-promotion + how-to config examples, but a substantial canonical architectural core on the ClickHouse MergeTree + Snowflake Snowpipe Streaming substrate. Canonical new wiki pages (9): source + 7 concepts (time-partitioned MergeTree + native TTL policies + DETACH PARTITION archival + granule-level min-value skip + append-only tamper resistance + Snowflake MATCH_RECOGNIZE sessionization + hot-cold per-column codec split) + 2 patterns (patterns/time-partitioned-mergetree-for-time-series + patterns/clickhouse-plus-snowflake-dual-storage-tier). Inverted storage-tier framing for compliance-sensitive workloads: Snowflake for streaming access logs + financial triggers (governance matters), ClickHouse for long-term compressed retention (compression wins). Canonical MergeTree schema example (telemetry_events with PARTITION BY toYYYYMM(timestamp) + TTL INTERVAL 12 MONTH DELETE + CODEC(ZSTD) on the value column). Specific Snowpipe Streaming batching recommendations (500–1,000 records low-latency, 10,000+ bulk, 1,000-at-most for time-series; byte_size: 0; period 10–30 s for real-time dashboards vs 1–5 min for less frequent). schema_evolution off-as-performance-optimisation framing inverts the default "always turn on" recommendation — canonicalised on concepts/schema-evolution as the fifth evolution axis. MATCH_RECOGNIZE worked example for ≤ 10-second same-IP click sessionization. Redpanda Connect gap disclosure: no dedicated ClickHouse output connector; use generic sql_raw / sql_insert processors — contrasts with first-class snowflake_streaming. Broker vs multiplexing named as the two fan-out primitives for the dual-tier pattern. 9 new canonical pages + 8 extensions. Tier-3 borderline include on architectural-density grounds (MergeTree internals + codec tiering + MATCH_RECOGNIZE are load-bearing despite marketing voice). Companion to the 2025-10-02 Snowpipe Streaming benchmark — that post canonicalised the benchmark; this post canonicalises the batch-tuning guidance and the dual-tier architecture that composes it with ClickHouse.
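The dual-storage-tier shape with per-tier batch sizes can be sketched in a few lines. This is a hedged application-level illustration: in the post the fan-out happens via broker consumer groups or Redpanda Connect multiplexing, not application code, and the sink callables plus batch thresholds here are hypothetical (the thresholds echo the post's Snowpipe batching guidance).

```python
class DualTierWriter:
    """Fan every event out to two tiers, each with its own batch size:
    a hot, governed tier (Snowflake, small low-latency batches) and a
    cold, compressed tier (ClickHouse, large bulk batches)."""

    def __init__(self, snowflake_sink, clickhouse_sink,
                 snowflake_batch=1000, clickhouse_batch=10000):
        self._sinks = [
            (snowflake_sink, snowflake_batch, []),   # hot tier buffer
            (clickhouse_sink, clickhouse_batch, []), # cold tier buffer
        ]

    def write(self, event):
        for sink, batch_size, buf in self._sinks:
            buf.append(event)
            if len(buf) >= batch_size:
                sink(list(buf))  # flush a copy of the full batch
                buf.clear()
```

The point of separate thresholds: the real-time dashboard tier flushes often for freshness while the archive tier flushes rarely for compression-friendly large batches, from one input stream.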

  • 2025-12-02 — Operationalize Redpanda Connect with GitOps — Tutorial-voice unsigned post (~2,000 words) canonicalising the end-to-end Argo CD + Helm + Kustomize deployment shape for Redpanda Connect on Kubernetes. Walks through both deployment modes side by side: Standalone (single pipeline, config baked into Helm values, deployed via Argo CD multi-source Application with chart from charts.redpanda.com pinned at targetRevision: 3.1.0 + values from customer's repo) + Streams (multiple pipelines from Kubernetes ConfigMaps, deployed via Kustomize wrapping the Helm chart — configMapGenerator for hashed ConfigMap names + helmCharts for chart inflation; kustomize.buildOptions: --enable-helm --load-restrictor LoadRestrictionsNone as Argo CD precondition). Streams-mode REST API (/version, /ready, /streams, /metrics) canonicalises the runtime-API vs GitOps source-of-truth anti-pattern — GitOps-compatible "as long as it's used by automation that derives its desired state from Git", anti-pattern "only when humans or external systems modify pipelines through the API without updating Git." Every production operation expressed as a Git commit: scaling (replicaCount: 1 → 3), adding pipelines (new files in config/), updating pipelines (edit YAML → Kustomize produces new hash → rolling restart via ConfigMap hash rollout), decommissioning (scale to zero or argocd app delete). Observability deployed as parallel Argo CD Application — kube-prometheus-stack (Prometheus + Alertmanager + Grafana + K8s dashboards) + Prometheus service monitor + Redpanda Connect Grafana dashboard — Redpanda Connect exposes Prometheus-compatible metrics natively "without custom exporters or sidecars". Closing product-roadmap laundry list (automatic linting + policy / compliance checks + developer portal + external secrets + template catalog + resource limits + multi-cluster) signals what Redpanda believes a mature Redpanda-Connect GitOps platform needs. 
Companion GitHub repo: redpanda-data-blog/redpanda-connect-the-gitops-way. 4 canonical new wiki pages: 3 concepts (concepts/standalone-vs-streams-mode, concepts/configmap-hash-rollout, concepts/runtime-api-vs-gitops-source-of-truth) + 2 patterns (patterns/argocd-multi-source-helm-plus-values, patterns/kustomize-wraps-helm-chart) + 1 system (systems/kustomize). Extends 7 pages: systems/redpanda-connect (new GitOps deployment section + frontmatter + Seen-in + Related), systems/argocd (multi-source + Helm+Kustomize + runtime-API-tension sections), concepts/gitops (canonical application-tier Seen-in), systems/helm (Kustomize-composition canonical Seen-in), systems/kubernetes + systems/prometheus + systems/grafana (frontmatter sources). Tier-3 borderline include on vocabulary-canonicalisation grounds — tutorial-voice pedagogy with architecture density ~30-40% concentrated in the standalone/streams comparison table + Argo CD multi-source Application spec + Kustomize-wraps-Helm with --enable-helm precondition + Streams-mode REST API anti-pattern framing. Zero production numbers (no fleet sizes, no latencies, no customer references), no operator-path comparison (the 2025-05-06 K8s guide covers that), no mention of Topic / User CRDs for GitOps-compatible topic provisioning beyond name-check, no external-secrets-manager integration demonstrated. Canonical wiki counterpart to the 2025-05-06 A guide to Redpanda on Kubernetes (Operator path) — this post is the Helm + Argo CD path.
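The ConfigMap-hash-rollout mechanic above can be sketched in a few lines. This is not Kustomize's actual hashing algorithm, only the shape of it: a content-derived suffix on the generated ConfigMap name, so an edited pipeline YAML yields a new name, a changed Deployment reference, and therefore a rolling restart.

```python
import hashlib

def hashed_configmap_name(base: str, data: dict) -> str:
    # Kustomize's configMapGenerator appends a content hash to the name;
    # this sketch uses a truncated sha256 rather than Kustomize's algorithm.
    blob = "".join(f"{k}={v};" for k, v in sorted(data.items()))
    digest = hashlib.sha256(blob.encode()).hexdigest()[:10]
    return f"{base}-{digest}"

def needs_rolling_restart(old_name: str, new_name: str) -> bool:
    # A new hash means a new ConfigMap name referenced by the workload,
    # which is what triggers the rolling restart on a pipeline edit.
    return old_name != new_name
```

Editing a file under config/ changes the hash, so "updating pipelines" really is just a Git commit plus the resulting name change.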

  • 2026-01-06 — Build a real-time lakehouse architecture with Redpanda and Databricks — Tech-talk recap post (unsigned, ~1,100 words) summarising the joint Redpanda + Databricks tech talk "From Stream to Table" with speakers Matt Schumpert (Redpanda) + Jason Reed (Databricks, formerly on Netflix's data team). Walks the historical arc Hadoop-era data lakes → governance sprawl → Iceberg (Netflix-originated) → file-based-catalog era → REST catalog standardisation → Redpanda Iceberg Topics → Unity Catalog governance hub. Two load-bearing slogans canonicalise wiki-already-covered primitives at joint-vendor altitude: Schumpert — "The goal of this partnership is to remove the artificial line between real-time data and analytical data."; Redpanda unsigned — "the stream is the table" / "Streaming data is analytics-ready by default." Jason Reed supplies the Netflix-origin disclosure + consumer-side corroboration "The data shows up already structured, already governed, and already queryable." Three-system labour division verbatim: "Redpanda delivers real-time performance and reliability at scale. Iceberg provides an open, transactional table format optimized for analytics. Unity Catalog adds governance, optimization, federation, and lifecycle management across the entire system." Unity-Catalog-specific integration disclosure verbatim (Redpanda registers tables, manages schema updates, deletes tables, handles full lifecycle). Zero net-new concepts / patterns / systems — every primitive named is already canonicalised on the wiki (Iceberg + REST catalog + Iceberg topic + Unity Catalog + Bronze-sink pattern + broker-native-catalog-registration pattern all pre-exist). Value is at the joint-vendor-framing + historical-arc + Netflix-origin-disclosure altitudes. 0 new pages, 10 extensions (6 Seen-in additions + 4 frontmatter sources). Tier-3 borderline include on historical-framing + Netflix-origin + joint-vendor grounds; architecture content ~50% of body; zero production numbers.

  • 2025-11-06 — Redpanda 25.3 delivers near-instant disaster recovery and more — Redpanda 25.3 release preview post covering four headline features across three architectural axes. Four load-bearing canonicalisations the wiki had previously gapped: (1) Shadowing — "a fully functional, hot-standby clone of your entire Redpanda cluster — topics, configs, consumer group offsets, ACLs, schemas — the works!" — architecturally distinct from both MirrorMaker2 and the prior Redpanda Migrator ("No MirrorMaker 2 or Redpanda Migrator connectors are used under the hood"). Three structural properties: broker-internal (not Kafka Connect-based); offset-preserving (byte-for-byte, with source-identical offsets — removes MM2's offset-translation-map client-failover cost); asynchronous. RPO/RTO in seconds ("limited only by timeout settings for producers and consumers"). Canonical pattern: patterns/offset-preserving-async-cross-region-replication (composed with hot-standby cluster for DR). (2) Cloud Topics (beta) — per-topic storage-substrate choice within a single cluster: record data goes "straight through and written to cost-effective object storage (S3/ADLS/GCS) while topic metadata is managed in-broker — replicated via Raft for high availability". "Virtually eliminates the cross-AZ network traffic associated with data replication" — the feature's load-bearing cost claim, canonicalised as concepts/cross-az-replication-bandwidth-cost. Motivated by the latency-critical vs latency-tolerant workload distinction (payments/trading/cybersecurity vs observability/compliance/model-training). Positioned against Confluent's "Kora-powered … standard/dedicated … Freight … plus separate Confluent WarpStream engine (BYOC)" multi-cluster shape. Canonical pattern: patterns/per-topic-storage-tier-within-one-cluster.
(3) Iceberg Topics + Google BigLake metastore — Redpanda 25.3 adds GCP's managed lakehouse catalog to the REST catalog sync axis, completing the set with Unity Catalog / Snowflake Open Catalog (Polaris) / AWS Glue / BigLake. BigQuery now discovers streaming-produced Iceberg tables without CREATE EXTERNAL TABLE DDL; Dataplex provides governance. Complements the prior file-based-catalog shape from the 2025-05-13 BYOC beta post. (4) MSSQL CDC for Redpanda Connect — microsoft_sql_server_cdc extends the Redpanda Connect CDC family to five source-database engines (Postgres / MySQL / MongoDB / Spanner / SQL Server). Rides on MSSQL's native change tables. Available in Redpanda Connect 4.67.5 (enterprise). Vendor benchmark: ~40 MB/s ingest + 3:15 initial snapshot on a 5M-row table vs ~14.5 MB/s / 8:04 for an unnamed alternative. Fits CDC driver ecosystem framing. 11 new canonical wiki pages: source + 5 systems (systems/redpanda-shadowing, systems/redpanda-cloud-topics, systems/redpanda-connect-mssql-cdc, systems/microsoft-sql-server, systems/google-biglake) + 4 concepts (concepts/offset-preserving-replication, concepts/broker-internal-cross-cluster-replication, concepts/cross-az-replication-bandwidth-cost, concepts/latency-critical-vs-latency-tolerant-workload) + 3 patterns (patterns/offset-preserving-async-cross-region-replication, patterns/hot-standby-cluster-for-dr, patterns/per-topic-storage-tier-within-one-cluster). Extends 9 pages: concepts/mirrormaker2-async-replication (new Shadowing-displacement section + Seen-in), concepts/rpo-rto (new seconds-RPO streaming-shape section + Seen-in), concepts/change-data-capture (MSSQL fifth-engine Seen-in), concepts/iceberg-catalog-rest-sync (BigLake as fourth managed REST catalog), patterns/cdc-driver-ecosystem (MSSQL extension), patterns/tiered-storage-to-object-store (Cloud Topics as per-topic-granularity variant), systems/redpanda (new 25.3 section), systems/redpanda-connect (MSSQL CDC section), systems/redpanda-iceberg-topics (BigLake section), systems/google-bigquery (BigLake-as-REST-catalog-alternative section). Tier-3 borderline include on vocabulary-canonicalisation grounds — launch/announcement voice, zero production numbers beyond the vendor MSSQL benchmark, ambiguous GA/beta status for Shadowing, but four vocabulary primitives genuinely missing from prior wiki coverage (offset-preserving replication, broker-internal cross-cluster replication, cross-AZ replication bandwidth cost, latency-critical vs latency-tolerant workload classification) plus two net-new features-as-systems (Shadowing, Cloud Topics) plus one net-new CDC engine (SQL Server). Architecture content ~50-60% of body. Cross-source continuity: companion to 2025-02-11 HA stretch-clusters (Shadowing extends the Redpanda DR axis from the two-point stretch/MM2 dichotomy to a three-point stretch/Shadowing/MM2 axis); companion to 2025-03-18 CDC connectors (MSSQL extends the Redpanda Connect CDC engine family from four to five); companion to 2025-04-07 Iceberg Topics GA (BigLake extends the REST-catalog axis from three managed catalogs to four).
Caveats: launch-voice; Shadowing mechanism under-specified (wire protocol, conflict resolution, DR-drill mechanics, reverse-replication for failback — all elided); Cloud Topics latency profile undisclosed; cross-AZ-cost claim unquantified; MSSQL CDC benchmark alternative unnamed ("alternative hosted Kafka + CDC service"); MSSQL CDC topology scope not enumerated (Always On AG, mirroring, log shipping unstated); BigLake integration mechanism unwalked; Confluent foil comparison doesn't disclose Kora's own tiered storage capabilities; 25.3 release date not given ("coming soon"); unsigned (Redpanda default attribution).
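The offset-preserving property that distinguishes Shadowing from MM2-style replication can be illustrated with a toy model. All names here are invented; the post does not disclose Shadowing's wire mechanism, only that offsets are source-identical.

```python
# Toy contrast: the shadow keeps source offsets byte-for-byte, so a
# failed-over consumer resumes at its committed offset with no
# offset-translation map (MM2's client-failover cost).

class ShadowCluster:
    def __init__(self):
        self.log = {}                     # offset -> record

    def replicate(self, offset, record):
        self.log[offset] = record         # source-identical offset

    def read(self, offset):
        return self.log[offset]

source = [(0, b"a"), (1, b"b"), (2, b"c")]
shadow = ShadowCluster()
for off, rec in source:
    shadow.replicate(off, rec)

committed_offset = 1                      # consumer's last commit on the source
# Failover: resume directly at committed_offset + 1, no translation step.
resume = shadow.read(committed_offset + 1)
```

Under MM2 the same failover would first consult an offset-translation map, because target offsets need not match source offsets.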

  • 2025-10-28 — Governed autonomy: The path to enterprise Agentic AI — Companion governance-framing post published the same day as Gallego's Introducing the Agentic Data Plane launch; unsigned, shorter (~850 words), marketing-voice restatement of the ADP vision focused on the governance substrate. Two canonical new wiki patterns filling governance-pattern-name gaps the 2025-10-28 launch-post sibling left implicit: (1) Agentic Access Control (AAC) — verbatim: "ADP embeds Agentic Access Control (AAC), an evolution of modern access control concepts tailored to the needs of an agentic workforce. Agents never hold long-lived credentials. Every prompt, action, and output is auditable, replayable, and policy-checked before and after I/O, empowering enterprises to grant AI agents fine-grained, temporary access to sensitive data without losing oversight." Three load-bearing properties: no-long-lived-credentials + per-call-policy-before-and-after-I/O + fine-grained-temporary-access. Composition of three pre-canonicalised substrates (concepts/short-lived-credential-auth, concepts/audit-trail, per-call policy enforcement) specialised for the agent audience. Complements the pre-wired OBO authorization pattern — OBO is the who-is-the-caller mechanism; AAC is the what-policy-applies-to-the-call mechanism. (2) Durable event log as agent audit envelope — verbatim: "The ADP treats every agent interaction as a first-class durable event: prompts, inputs, context retrieval, tool calls, outputs, and actions are captured for analysis, compliance, and replay." Six event classes named (prompt + input + context retrieval + tool call + output + action); one log with N views (audit + lineage + replay + SLO + tracing). Applies log-as-truth at the agent-interaction altitude. A2A protocol first-named alongside MCP as open standards (not unpacked). 3 new canonical wiki pages: source + 2 patterns (AAC + durable-event-log-as-envelope).
Extends 10 pages: systems/redpanda-agentic-data-plane (re-sourced as dual-sourced from both 10-28 posts with companion-pair framing), systems/oxla (dual-sourced), systems/redpanda + systems/redpanda-connect + systems/redpanda-byoc + systems/redpanda-agents-sdk + systems/model-context-protocol (frontmatter sources), concepts/autonomy-enterprise-agents + concepts/governed-agent-data-access + concepts/data-plane-atomicity + concepts/digital-sovereignty + concepts/short-lived-credential-auth + concepts/audit-trail + concepts/data-lineage + concepts/log-as-truth-database-as-cache + patterns/mcp-as-centralized-integration-proxy (all with new Seen-in entries canonicalising the governance-altitude framing). Tier-3 borderline include on vocabulary-canonicalisation grounds — architecture density ~30% on short body; passes because AAC + event-log-as-audit-envelope + ADP + Oxla are vocabulary gaps the pre-wired sibling post didn't fully close. Caveats: zero AAC mechanism depth (no IdP / token-exchange / policy-engine); audit + lineage conflated as "unified audit and lineage envelope" at vision altitude; exactly-once-across-tool-chains asserted without mechanism; replay-for-compliance silent on LLM non-determinism; no byline. Cross-source continuity: dual-post launch pair with Introducing the Agentic Data Plane (Gallego-signed founder-voice productization + Oxla acquisition + four-layer composition + three-shift narrative + OBO-IdP) — together the two posts bracket ADP's canonical wiki definition from architecture + acquisition disclosure (Gallego post) to governance-pattern-naming + audit-envelope architectural claim (this post).
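A minimal sketch of the AAC shape as quoted above: every call is policy-checked before and after I/O and appended to an audit log. The policy rule, event fields, and function names are invented for illustration; the post gives no mechanism depth.

```python
# Illustrative AAC wrapper: pre-I/O and post-I/O policy checks, with
# every verdict appended to a durable-log stand-in (a Python list).

audit_log = []

def policy_allows(agent, action, payload):
    # Illustrative rule: this agent may never delete.
    return action != "delete"

def governed_call(agent, action, payload, tool):
    if not policy_allows(agent, action, payload):          # pre-I/O check
        audit_log.append({"agent": agent, "action": action, "verdict": "denied"})
        return None
    output = tool(payload)
    if not policy_allows(agent, action, output):           # post-I/O check
        audit_log.append({"agent": agent, "action": action, "verdict": "output_blocked"})
        return None
    audit_log.append({"agent": agent, "action": action, "verdict": "allowed",
                      "input": payload, "output": output})
    return output

result = governed_call("billing-agent", "read", "invoice-42", lambda p: p.upper())
denied = governed_call("billing-agent", "delete", "invoice-42", lambda p: p)
```

The list stands in for the durable event log; in the ADP framing each entry would be one first-class event with N downstream views.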

  • 2025-10-28 — Introducing the Agentic Data Plane — Founder-voice productization follow-up to Gallego's 2025-04-03 autonomy essay. Names the commercial shape of enterprise autonomy as the Agentic Data Plane (ADP) — "a unified runtime and control plane that safely exposes enterprise data to AI agents" — composing four layers: (A) streaming (existing Redpanda broker for HITL + durable model replay + observability); (B) query engine — newly-acquired Oxla, a C++ distributed query engine with PostgreSQL wire protocol + separated compute-storage + Iceberg-native (early preview mid-December 2025); (C) systems/redpanda-connect (300+ connectors) rebadged as ADP integration layer; (D) net-new global policy + observability layer. Governance-first framing inverts typical agent marketing verbatim: "The fear from CIOs is not the code of the agent itself, it is governance. In simple terms, it is access controls: can I trust that data is accessed by the right things? And observability: when things go wrong, can I understand what happened?" — canonicalised as concepts/governed-agent-data-access (two-axis design surface). First shipped governance feature: Remote MCP + authentication + authorization for OBO (on-behalf-of) workloads with IdP integration — canonicalised as patterns/on-behalf-of-agent-authorization. Structural foil verbatim: "the new digital workforce often interacts with systems created in the API era of root-token permissions, with all-or-nothing as the norm." Three-shift architectural narrative: compute-storage separation → lakehouse → agentic data plane. Open-protocols commitment: MCP, A2A, PostgreSQL wire, durable log (Kafka), Iceberg. Things shipped: Remote MCP + OBO, knowledge-based agent templates (Git/Jira/GDrive), declarative Agent Runtime, Redpanda Streaming for HITL. Things acquired (rolling integration): Oxla. Things doubled down on: governance (access controls + observability).
5 new canonical wiki pages: source + 2 systems (systems/redpanda-agentic-data-plane, systems/oxla) + 1 concept (concepts/governed-agent-data-access) + 1 pattern (patterns/on-behalf-of-agent-authorization). Extends 6 pages: systems/redpanda (new ## Agentic Data Plane (2025-10-28 productization) section), systems/redpanda-agents-sdk (productization-into-ADP section + ADP as product-tier-above-SDK framing), systems/model-context-protocol (frontmatter + related extended for ADP-era MCP usage with OBO), patterns/mcp-as-centralized-integration-proxy (frontmatter extended), concepts/autonomy-enterprise-agents (new productization section + ADP-as-commercial-packaging framing), companies/redpanda (this entry). Tier-3 borderline include on vocabulary-canonicalisation grounds — marketing-heavy launch post, zero production numbers, but three wiki-load-bearing canonicalisations (ADP-as-product-shape, Oxla-as-system, governed-agent-data-access concept + OBO pattern). Caveats: launch-marketing voice; Oxla mechanism-depth thin (planner/executor/catalog model undisclosed); OBO disclosed as product-line-item not mechanism (token flow, consent vocabulary, downstream-system integration surface not disclosed); A2A protocol named but not described; no competitive comparison with Databricks Unity AI Gateway / AWS Bedrock Agents / Snowflake Cortex. Gallego-signed ("handcrafted by a hooman. .alex").
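The OBO pattern named above can be sketched as a token-exchange shape. The post discloses OBO only as a product line item, not a mechanism, so the token fields, scope strings, and TTL below are illustrative assumptions, not Redpanda's implementation.

```python
import time

def mint_obo_token(user, agent, scope, ttl_s=300):
    # A broker (IdP-integrated in the ADP framing) mints a short-lived,
    # downscoped token carrying both the end user (sub) and the acting
    # agent (act); the agent never holds a long-lived root credential.
    return {"sub": user, "act": agent, "scope": scope,
            "exp": time.time() + ttl_s}

def token_valid_for(token, scope):
    # Downstream check: exact scope match and not yet expired.
    return token["scope"] == scope and token["exp"] > time.time()

tok = mint_obo_token(user="alice", agent="jira-agent", scope="tickets:read")
```

The sub/act split mirrors the who-is-the-caller question the OBO pattern answers: the downstream system can attribute the call to alice while knowing jira-agent performed it.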

  • 2025-10-02 — Real-time analytics at scale: Redpanda and Snowflake Streaming — Vendor benchmark of a 9-node Redpanda + 12-node Redpanda Connect → single Snowflake table pipeline via the snowflake_streaming output connector. Headline: 3.8 billion 1 KB AVRO messages at 14.5 GB/s, P50 ≈ 2.18 s / P99 ≈ 7.49 s end-to-end — exceeds Snowflake's documented 10 GB/s per-table ceiling by 45%. Disaggregated latency attribution: 86% of the P99 budget (~6.44 s) is in the Snowpipe-Streaming upload / register / commit path, not in Redpanda read or transport. Four canonical tuning insights: (1) AVRO over JSON = ~20% throughput uplift (patterns/binary-format-for-broker-throughput); (2) count-based batch triggers beat byte-size triggers on the hot path because byte-size requires per-message size computation (patterns/count-over-bytesize-batch-trigger); (3) build_parallelism tuned to (cores − small reserve) — 40 on 48-core nodes — as the Snowpipe-Streaming commit-path latency knob (concepts/build-parallelism-for-ingest-serialization); (4) Snowpipe-Streaming channels are the per-table parallelism unit controlled by channel_prefix × max_in_flight with a hard ceiling of 10,000 channels per table — exceeding surfaces as "the Snowpipe API screaming at us" (concepts/snowpipe-streaming-channel). Decisive scaling dimension: intra-node input/output parallelism via the broker primitive — running many parallel pipelines within one Connect process to saturate per-node resources, canonicalised as patterns/intra-node-parallelism-via-input-output-scaling. Control-group (Redpanda → drop sink) ceiling 15.1 GB/s at 8.38 ms P99; Snowflake commit added ~1 min wall-clock and ~7.5 s P99. Public-internet transport; PrivateLink would reduce further. Borderline-case include on architectural-disclosure grounds: real operational numbers (cluster topology, P50/P99 latencies, per-step attribution) and four first-party tuning findings at mechanism depth.
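The count-vs-byte-size trigger finding (insight 2) can be illustrated directly: a count trigger does constant work per message, while a byte-size trigger must compute each message's size on the hot path. The message sizes and thresholds below are illustrative, not the benchmark's.

```python
def batch_by_count(messages, max_count=1_000):
    # Count trigger: O(1) per message, just a length check.
    batch, out = [], []
    for m in messages:
        batch.append(m)
        if len(batch) >= max_count:
            out.append(batch)
            batch = []
    if batch:
        out.append(batch)
    return out

def batch_by_bytes(messages, max_bytes=1_000_000):
    # Byte-size trigger: a per-message size computation on the hot path.
    batch, size, out = [], 0, []
    for m in messages:
        size += len(m)            # the extra work the post calls out
        batch.append(m)
        if size >= max_bytes:
            out.append(batch)
            batch, size = [], 0
    if batch:
        out.append(batch)
    return out

msgs = [b"x" * 1024] * 2500
```

Both produce similar batch shapes on uniform messages; the difference is purely the per-message cost of the trigger condition.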

  • 2025-06-24 — Why streaming is the backbone for AI-native data platforms — Thought-leadership / vision essay originally syndicated to The New Stack, positioning streaming as the "power grid" of an AI-native data platform. Canonicalises four architectural propositions the wiki had referenced implicitly: (1) streaming-as-backbone of an agile data platform (producer / consumer decoupling + dynamic source/sink add + real-time reactivity) — new concept concepts/streaming-as-agile-data-platform-backbone; (2) CDC fan-out from a single stream to many consumers (search, analytics, vector index, reactive agent) with the user_plans downgrade-trigger worked example and explicit WAL-cleanup-strain trade-off — new pattern patterns/cdc-fanout-single-stream-to-many-consumers; (3) Replayability for iterative RAG — long-lived tiered-storage streams let you re-run historical data through different embedding models or chunking strategies without re-extracting from source — new concept concepts/stream-replayability-for-iterative-pipelines; (4) Open table format = freedom to pick the query engine — Iceberg as the escape hatch from warehouse lock-in, with Snowflake + BigQuery sharing the same dataset via Apache Polaris REST catalog without storing data twice. Also canonicalises schema registry as CI/CD / IaC artefact (PR-time validation, code-owned contracts) and discloses OpenTelemetry context propagation via Kafka record headers as the streaming-boundary analogue of HTTP-header propagation — extending systems/opentelemetry from the Fly.io application-RPC framing. Also names stateless transformation at broker ingress (compliance / masking) and the AI data flywheel (usage → insights → product → usage). 3 canonical new wiki pages: concepts/streaming-as-agile-data-platform-backbone, concepts/stream-replayability-for-iterative-pipelines, patterns/cdc-fanout-single-stream-to-many-consumers.
Extends 7 pages: concepts/change-data-capture (new Seen-in canonicalising fan-out topology + WAL-cleanup trade-off + user_plans worked example), concepts/schema-registry (new Seen-in canonicalising CI/CD-IaC-artefact framing — registry as API contract between teams, equivalent to HTTP API contract for sync services), systems/opentelemetry (new Seen-in canonicalising Kafka-record-headers carrier for context propagation at the streaming boundary), patterns/streaming-broker-as-lakehouse-bronze-sink (new Seen-in at vision altitude extending the 2025-01 pedagogy altitude and 2025-04-07 GA-release altitudes), patterns/tiered-storage-to-object-store (new Seen-in canonicalising third axis — economic precondition for replayability — beyond prior capacity + decommission-speed framings), systems/apache-iceberg (new Seen-in canonicalising open-format-escape-hatch from warehouse lock-in + Polaris REST catalog), systems/redpanda-iceberg-topics (new Seen-in at backbone altitude). Tier-3 borderline include. Redpanda vendor voice with heavy product-link density (≈30 blog cross-links to own marketing pages), but architecture content is ~50% of ~1,700-word body and the four propositions above are structurally load-bearing vocabulary the wiki did not previously canonicalise (the backbone framing, the fan-out-from-single-CDC-stream framing, the replayability-for-RAG framing, and the schema-registry-as-CI/CD-artefact framing were all gaps). Passes on vocabulary-canonicalisation grounds even with the marketing-adjacent voice. Cross-source continuity: companion to Gallego 2025-04-03 autonomy essay from the same quarter (Gallego = streaming + MCP + Python SDK as agent substrate; this post = streaming + CDC + Iceberg as AI-data-platform substrate — the agent-substrate and data-substrate halves of the same vision, framed for complementary audiences).
Companion to the 2025-01-21 Medallion architecture post (sources/2025-01-21-redpanda-implementing-the-medallion-architecture-with-redpanda) at vision altitude vs mechanism altitude. Companion to the 2025-03-18 CDC connectors post (sources/2025-03-18-redpanda-3-powerful-connectors-for-real-time-change-data-capture) — that post canonicalises the CDC reader half of the fan-out pattern; this post canonicalises the consumer-fanout half. Caveats recorded: zero production numbers (no fleet sizes, no latency distributions, no before/after quantitative wins between batch-ETL and streaming); qualitative claims only ("much more effective", "saves you from costly reprocessing"); Iceberg vs Snowpipe-Streaming trade-off named but uncompared on cost / ecosystem / governance; CDC-WAL-cleanup nuance name-only ("delaying WAL cleanup" with no slot-management or retention mechanics); AIOps name-drop without mechanism; no cross-vendor comparison (Kafka / Pulsar / Kinesis / Pub/Sub not compared); unsigned (Redpanda default attribution); originally syndicated to The New Stack as "the power grid for AI-native data platforms" — wiki-version ingest uses the canonical redpanda.com URL.
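The fan-out and replayability propositions above reduce to a simple shape: one retained stream, per-consumer offsets, and replay of the whole stream through a new transform. The consumer names and the user_plans events below are illustrative, not from the post.

```python
# One CDC stream, many independent consumers, each with its own offset.
cdc_stream = [{"table": "user_plans", "op": "update", "plan": p}
              for p in ("pro", "free", "pro")]

offsets = {"search_index": 0, "analytics": 0, "vector_index": 0}

def poll(consumer, n=10):
    # Each consumer advances only its own position in the shared stream.
    start = offsets[consumer]
    events = cdc_stream[start:start + n]
    offsets[consumer] = start + len(events)
    return events

def replay_with(embed):
    # Replayability: re-run the retained stream through a different
    # "embedding model" without re-extracting from the source database.
    return [embed(e) for e in cdc_stream]

downgrades = [e for e in poll("analytics") if e["plan"] == "free"]
vectors_v2 = replay_with(lambda e: hash(e["plan"]) % 97)
```

The downgrade filter is the user_plans trigger example; replay_with is the tiered-storage replay claim, with the embedding function standing in for a new model or chunking strategy.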
  • 2025-06-21 — Behind the scenes: Redpanda Cloud's response to the GCP outage — Production-incident retrospective on the 2025-06-12 GCP global outage from Redpanda Cloud's perspective. ~3-hour incident window (18:41–21:38 UTC); SEV4 incident closed with no customer impact across hundreds of clusters. Load-bearing disclosures: (1) cell-based architecture as an explicit Redpanda Cloud product principle — single-binary broker + per-customer cluster, "Redpanda Cloud clusters do not externalize their metadata or any other critical services"; (2) butterfly effect named as first-class system-design primitive — "GCP's seemingly innocuous automated quota update triggered a butterfly effect that no human could have predicted"; (3) feedback-control-loop-guarded phased rollouts as the change-management discipline — "we try to close our feedback control loops by watching Redpanda metrics as the phased rollout progresses and stopping when user-facing issues are detected"; (4) hedged observability stack — self-hosted data + third-party UI was degraded-but-usable during cascading outage, saving "exponentially bigger cost ramifications" of a vendor failover; (5) SLA substrate decomposition — 99.99% SLA + ≥99.999% SLO decomposes to six concrete choices (replication ≥3, local NVMe primary + async tiered storage, redundant API/Schema Registry/HTTP Proxy, no external critical-path dependencies except PSC, continuous chaos + load testing, feedback-gated phased rollouts); (6) tiered storage as fallback, not primary — elevated GCS PUT error rates did not impact write availability because primary data is on local NVMe; (7) deliberate disk reserve (unused + used-but-reclaimable) absorbs flush backlog during object-store stress.
Canonicalises four new patterns: patterns/cell-based-architecture-for-blast-radius-reduction, patterns/preemptive-low-sev-incident-for-potential-impact (19:08 UTC SEV4 declared before customer impact observed), patterns/proactive-customer-outreach-on-elevated-error-rate (20:56 UTC outreach to customers with highest tiered-storage error rates), and patterns/hedged-observability-stack. One affected cluster (staging, us-central-1, lost node + ~2h replacement) — out of hundreds; customer's production cluster unaffected. Closing thoughts draw a CrowdStrike parallel and argue for "increased adoption of control theory in our change management tools" as an industry-wide reliability practice. Tier-3 on-scope — production-incident retrospective (not marketing) with architecture-density ~60% across timeline + substrate decomposition + six-mitigation reliability practice list. Opens the Redpanda incident-retrospective axis on the wiki. Companion to the 2025-04-03 Gallego autonomy essay (which canonicalised the Data Plane Atomicity invariant) by instantiating the cell-based-architecture deployment shape that operationalises it. Caveats: unsigned, vendor-voice, hindsight-bias acknowledged; single-affected-cluster mechanism underspecified ("uncommon interaction between internal infrastructure components"); no quantitative tiered-storage error-rate metrics; third-party dashboarding/alerting vendor + cloud-marketplace vendor both unnamed; disk-reserve sizing policy undisclosed; PSC exception to no-critical-path-dependencies load-bearing but not walked; phased-rollout-with-feedback-control implementation details absent.
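The feedback-control-loop-guarded phased rollout described in (3) can be sketched as a loop that advances one phase at a time and halts on a metric breach. The phase names, threshold, and metric source below are illustrative; the post does not disclose the implementation.

```python
def phased_rollout(phases, error_rate_for, threshold=0.01):
    # Advance phase by phase; after each phase, close the feedback loop
    # by checking an observed error metric and halting on a breach.
    completed = []
    for phase in phases:
        completed.append(phase)                # roll the change to this cell set
        if error_rate_for(phase) > threshold:  # watch metrics, stop on impact
            return completed, "halted"
    return completed, "done"

# Illustrative per-phase error rates: the 25% phase trips the gate.
rates = {"canary": 0.001, "5%": 0.002, "25%": 0.05, "100%": 0.0}
done, status = phased_rollout(["canary", "5%", "25%", "100%"], rates.get)
```

The key property is that the 100% phase never runs: the blast radius of a bad change is bounded by the phase at which the loop closes.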

  • 2025-06-17 — Introducing multi-language dynamic plugins for Redpanda Connect — Launch of the dynamic-plugin framework in Redpanda Connect v4.56.0 (Beta, Apache 2.0). Breaks the Go-only, compile-into-the-binary plugin constraint: plugins now run as separate OS subprocesses communicating with the host Redpanda Connect engine over gRPC on a Unix domain socket, with the cross-process protocol "closely mirroring the existing interfaces defined for plugins within Redpanda Connect's core engine, Benthos". Four canonical new wiki pages: systems/redpanda-connect-dynamic-plugins + two concepts (concepts/subprocess-plugin-isolation — "plugins run in separate processes, so crashes won't take down the main Redpanda Connect engine"; concepts/batch-only-component-for-ipc-amortization — "we use batch components exclusively to amortize the cost of cross-process communication" — only BatchInput / BatchProcessor / BatchOutput types are exposed across the gRPC boundary) + two patterns (patterns/grpc-over-unix-socket-language-agnostic-plugin as the architectural shape; patterns/compiled-vs-dynamic-plugin-tradeoff capturing the explicit "compiled plugins for performance-critical, dynamic plugins for flexibility and language choice" guidance — dynamic plugins are additive, not a replacement for compiled plugins). Language SDKs: Go (type-safe, for existing Redpanda Connect developers) and Python (headline target — opens the streaming substrate to PyTorch / TensorFlow / Hugging Face / LangChain / NumPy / SciPy for real-time ML inference inside the pipeline). Motivating use case in the post: a Python processor plugin running a pre-trained BERT model from Hugging Face for sentiment analysis on streaming customer feedback. Launch is Apache 2.0 — the plugin framework itself is open-source; connectors built on top may carry different licenses (contrast: 2025-03-18 CDC input connectors were Enterprise-gated).
Extends systems/redpanda-connect with a new ## Dynamic plugins (2025-06, Beta, Apache 2.0) section. Tier-3 borderline include: launch / marketing voice with "We're excited..." framing, but architecture content is real — core technical disclosure is the subprocess + gRPC + Unix-socket + batch-only amortization design. Passes on vocabulary-canonicalisation grounds — four plugin-architecture primitives (subprocess isolation, batch-only IPC amortization, gRPC-over-Unix-socket language-agnostic plugin shape, compiled-vs-dynamic tradeoff) missing from prior wiki coverage. Caveats: Beta stability only (v4.56.0; protocol stability across minor versions not guaranteed); no gRPC .proto published inline (the protocol "closely mirrors" Benthos interfaces but implementors must consult the SDK source); no performance numbers (no throughput delta vs compiled plugins, no cross-process hop p99, no reference-workload benchmarks); no process-lifecycle details (crash recovery, socket cleanup, supervisor shape unspecified); no horizontal-scaling model for CPU-bound plugins (one subprocess per plugin, no pooling). Opens the Redpanda-Connect extensibility-framework axis on the wiki — prior Redpanda Connect coverage focused on the shipped connector catalog (CDC input connectors, MCP-tool surface); this is the first post to canonicalise the developer-surface / plugin-architecture axis.
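The batch-only IPC amortization rationale has a one-line cost model: with a fixed per-hop cost C for crossing the process boundary and a per-message cost m, a batch of B messages costs C + B*m per hop instead of B*(C + m). The numbers below are illustrative, not measured plugin overheads.

```python
def per_message_ipc_cost(n_msgs, hop_cost=100.0, per_msg=1.0):
    # One cross-process hop per message: n * (C + m).
    return n_msgs * (hop_cost + per_msg)

def batched_ipc_cost(n_msgs, batch_size, hop_cost=100.0, per_msg=1.0):
    # One hop per batch: ceil(n / B) * C + n * m.
    hops = -(-n_msgs // batch_size)   # ceiling division
    return hops * hop_cost + n_msgs * per_msg

naive = per_message_ipc_cost(10_000)
batched = batched_ipc_cost(10_000, batch_size=500)
```

This is why only BatchInput / BatchProcessor / BatchOutput are exposed across the gRPC boundary: the hop cost C dominates unless it is shared across a batch.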
  • 2025-05-20 — Implementing FIPS compliance in Redpanda — Configuration-walkthrough disclosure of broker-level FIPS 140 compliance in self-managed Redpanda clusters on RHEL. Opens the Redpanda security-substrate axis on the wiki. Three load-bearing canonicalisations: (1) OpenSSL 3.0.9 as the FIPS 140-2 validated cryptographic module consumed by both the redpanda broker binary and the rpk CLI, with OpenSSL 3.1.2 (FIPS 140-3 validated) on the late-2025 upgrade roadmap ahead of 140-2 sunset. (2) Three-state fips_mode config dial (disabled / enabled / permissive) distinguishing production (OS-FIPS + broker-FIPS), non-regulated (no FIPS), and development (broker-FIPS-only, non-production) deployment shapes. permissive is explicitly scoped out of compliance claims — entropy sourcing from a non-FIPS OS breaches the boundary even with broker-level controls. (3) Broker-startup fail-fast as the enforcement shape: "Redpanda will log an error and exit if the underlying operating system isn't properly configured." Structurally stronger than the logging-then-enforcement progressive-rollout shape — regulated workloads have no warn-only regime by design. Extends concepts/fips-cryptographic-boundary: the Redpanda instance surfaces at streaming-broker-startup / validated-module altitude where the boundary manifests as a two-package artefact split (redpanda-fips + redpanda-rpk-fips co-installable with base packages) + three-state config dial + startup enforcement gate — a different architectural layer from the GitHub 2025-09-15 PQ-SSH instance where the boundary manifests as a filtered primitive-advertisement list on the SSH wire. Deployment scope at publication (2025-05-20): self-managed RPM / Debian on RHEL only; Redpanda Cloud, Kubernetes deployments, and Redpanda Connect on roadmap — canonical wiki instance of the FIPS boundary being narrower than a product's full deployment surface because validated-module distribution is deployment-shape-specific.
Redpanda Ansible Collection accepts enable_fips=true + fips_mode=enabled opt-in variables. Batch-skip override per explicit user full-ingest instruction; raw frontmatter carried ingested: true + skip_reason: batch-skip (the 7,896-char body was initially classed as zero-architecture-signal marketing). Post is short (~1,100 words), configuration-walkthrough voice, but canonicalises three compliance-substrate primitives missing from the wiki's prior FIPS coverage (anchored only on the GitHub PQ-SSH rollout). Caveats: no wire-protocol disclosure (which ciphers/KEX/MACs are filtered in FIPS mode is not enumerated); FIPS 140-3 transition schedule underspecified (no formal NIST 2026-02-22 sunset date); permissive failure surface beyond entropy not enumerated; non-RHEL OS coverage elided; license-gated; no byline; no benchmarks on FIPS-mode overhead.
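The three-state fips_mode dial and the fail-fast startup gate can be sketched as a small decision function. The return values are illustrative; only the behaviour in the enabled-on-a-non-FIPS-OS case (log an error and exit, no warn-only regime) is from the post.

```python
def broker_startup(fips_mode, os_fips_enabled):
    # Illustrative model of the three-state dial described above.
    if fips_mode == "disabled":
        return "started"                       # non-regulated: no FIPS
    if fips_mode == "permissive":
        # Broker-level FIPS without the OS guarantee: non-production,
        # explicitly scoped out of compliance claims (entropy boundary).
        return "started-fips-broker-only"
    if fips_mode == "enabled":
        if not os_fips_enabled:
            # Fail fast: refuse to run outside the validated boundary.
            return "exit"
        return "started-fips"
    raise ValueError(f"unknown fips_mode: {fips_mode}")
```

The point of the sketch is the asymmetry: enabled has no degraded mode at all, while permissive exists precisely so development machines can exercise the broker-side code path.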
  • 2025-05-13 — Getting started with Iceberg Topics on Redpanda BYOC — BYOC-beta extension of Iceberg Topics five weeks after 25.1 GA on Dedicated, with a GCS + BigQuery worked example. Three new primitives canonicalised: (1) the per-topic mode configuration surface — value_schema_id_prefix (Schema-Registry-wire-format producers → typed Iceberg table), value_schema_latest (latest-schema projection), key_value (schema-less BYTES + Kafka metadata); (2) the file-based catalog as a first-class alternative to REST catalog sync for engines (like BigQuery) that read Iceberg via metadata-pointer DDL; (3) the BYOC-data-ownership compound property — customer-owned bucket + broker-projected Iceberg + customer-owned query engine yields "full control of your Iceberg data with zero compromises". Read-side pattern: BigQuery CREATE EXTERNAL TABLE ... format = 'ICEBERG' on a GCS-hosted vN.metadata.json. Adjacent secondary disclosure: Redpanda BYOC doubles partition density per tier in 25.1 via per-partition memory efficiency improvements (Tier 1: 1,000 → 2,000; Tier 5: 22,800 → 45,600), canonicalised as concepts/broker-partition-density. Tutorial altitude with synthetic Protobuf SensorData generator via Redpanda Connect. Tier-3 borderline-on-scope: vendor tutorial, no production numbers, architecture content ~25-30% of body concentrated on the three new primitives + partition-density datum. Passes on vocabulary-canonicalisation grounds (topic-mode configuration, file-based catalog, and BYOC-data-ownership were all gaps in the wiki). Caveats: file-based-catalog mechanism underspecified vs object-store-catalog fallback from GA; partition-density 2× improvement mechanism unexplained; value_schema_id_prefix vs value_schema_latest vs key_value trade-offs elided; DLQ and schema-evolution not re-invoked in BYOC context; Protobuf-specific guidance thin; tier dimensions opaque.
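The read-side pattern quoted above expands to roughly the following BigQuery DDL. All identifiers and the GCS path are placeholders, the metadata-file version is illustrative, and depending on the BigQuery setup a BigLake `WITH CONNECTION` clause may also be required — this is a sketch of the pattern, not the post's exact statement:

```sql
-- Hypothetical expansion of the post's CREATE EXTERNAL TABLE pattern;
-- dataset, table, and bucket path are placeholders.
CREATE EXTERNAL TABLE my_dataset.sensor_data
OPTIONS (
  format = 'ICEBERG',
  uris = ['gs://my-byoc-bucket/iceberg/sensor_data/metadata/v3.metadata.json']
);
```

This is the file-based-catalog shape: the engine is handed a metadata-pointer DDL rather than a REST catalog endpoint.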
  • 2025-05-06 — A guide to Redpanda on Kubernetes — Product-altitude guide to Redpanda's Kubernetes deployment evolution. Three load-bearing architectural claims: (1) Helm vs Redpanda Operator trade-off on five axes — managed upgrades + rollback, dynamic configuration (CRDs vs Helm-values redeploy), advanced health checks + metrics, lifecycle automation, multi-tenancy. Operator is the default recommendation; Helm chart retained for simpler deployments. (2) Two-to-one operator consolidation — Redpanda previously shipped separate operators for its internal Redpanda Cloud fleet and for customer-facing Self-Managed deployments; the 2025 unification merges them into a single operator (patterns/unified-operator-for-cloud-and-self-managed). (3) FluxCD bundling reversal — the customer operator initially bundled FluxCD internally to wrap the Helm chart; canonical wiki instance of the bundled-GitOps-dependency anti-pattern. Fix across three branches: v2.3.x FluxCD optional (spec.chartRef.useFlux) → v2.4.x (Jan 2025) FluxCD disabled by default → v25.1.x FluxCD and Helm-chart wrapping removed entirely. v25.1.x adopts the version-aligned compatibility scheme — operator/chart version matches Redpanda core version with ±1 minor window, retiring the compatibility matrix document. Introduces systems/redpanda-operator as a canonical wiki system and systems/fluxcd as a minimal page. Tier-3 batch-skip override: raw frontmatter carried ingested: true + skip_reason: batch-skip — marketing/tutorial slug pattern; overridden per explicit user full-ingest instruction. Architecture density ~40% on ~1,400-word body. Caveats: product-guide altitude, no production numbers, FluxCD-removal migration path underspecified, deprecation schedule opaque, unified-operator cutover mechanism not disclosed, multi-region K8s limitation (multi-AZ-only) not revisited.
Closes a gap in the wiki's Kubernetes-operator corpus by canonicalising two anti-patterns (bundled GitOps, compatibility matrix) that generalise beyond Redpanda.
  • 2025-04-23 — Need for speed: 9 tips to supercharge Redpanda — Omnibus performance-tuning checklist covering nine tips across three dependency layers — infrastructure (NVMe, dedicated hardware with 95% resource budget, no noisy neighbors; enable broker-side write caching when NVMe isn't available), data architecture (partition skew as Amdahl's Law with three-pronged mitigation — sticky partitioner / keyed only when required / high-cardinality keys; don't compress compacted topics; use tiered storage for fast rebalance), and application design (producer batching, consumer fetch tuning matrix with fetch.min.bytes / fetch.max.wait.ms / max.partition.fetch.bytes / max.poll.records, offset-commit cost / save-button analogy / RPO-as-commit-frequency, client-side compression with ZSTD or LZ4 codec choice). Introduces concepts/keyed-partitioner, patterns/high-cardinality-partition-key, and patterns/client-side-compression-over-broker-compression. Tier-3 borderline-on-scope: vendor-blog checklist voice but substantive gap-filling across six previously uncanonicalised primitives. No author byline, no production numbers, no customer case study.
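The consumer fetch-tuning matrix above reduces to two coupled decisions that can be modeled directly. A minimal sketch, assuming standard Kafka fetch semantics (the broker parks a fetch until fetch.min.bytes accumulates or fetch.max.wait.ms elapses; max.poll.records caps the per-poll handoff) — defaults here are the Kafka client defaults, not values the post prescribes:

```python
# Model of the fetch-tuning trade-off: larger fetch.min.bytes buys
# throughput (fewer, fuller responses) at the cost of up to
# fetch.max.wait.ms of added latency on a quiet partition.

def fetch_responds(available_bytes: int, waited_ms: int,
                   fetch_min_bytes: int = 1,
                   fetch_max_wait_ms: int = 500) -> bool:
    """True when the broker would answer the pending fetch request."""
    return available_bytes >= fetch_min_bytes or waited_ms >= fetch_max_wait_ms

def poll_size(records_ready: int, max_poll_records: int = 500) -> int:
    """max.poll.records caps how many records one poll() hands the app."""
    return min(records_ready, max_poll_records)
```

With fetch_min_bytes raised to, say, 1 MiB, a trickle producer forces every fetch to ride out the full wait — which is exactly the latency/throughput dial the checklist is pointing at.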

  • 2025-04-07 — Redpanda 25.1: Iceberg Topics now generally available — GA release disclosure for Iceberg Topics across AWS, Azure, and GCP (framed as "first in industry" Kafka-Iceberg streaming solution GA on multiple clouds). Elaborates the 2025-01-21 pedagogy launch with nine disclosed properties beyond the preview framing. Four table-management capabilities: custom hierarchical bucketed partitioning (operator-controllable Iceberg transforms for query-side pruning); built-in dead-letter queues for schema-invalid records (keeps data-quality invariant without dropping batches); full Iceberg-spec-compliant schema evolution (adds/renames/deletes matching the Iceberg spec); automatic snapshot expiry as a broker-owned metadata-GC loop (retires the wiki's prior externalisation-cost caveat for the snapshot-expiry half; small-file compaction ownership remains open). Five catalog-integration capabilities: secure REST catalog sync via OIDC+TLS against Snowflake Open Catalog / Databricks Unity / AWS Glue; transactional writes via Iceberg's commit-protocol serialisation for safe concurrent multi-writer access; automatic table discovery and registration so downstream engines see new Iceberg-configured topics appear without manual CREATE TABLE; built-in object-store catalog fallback for deployments without a REST catalog; tunable workload management knob for the snapshot-vs-live-topic lag ceiling (making the commit-cadence lag floor an explicit operational parameter). Adjacent 25.1 features: native consumer group lag metrics (Prometheus-exposed, replacing a PromQL compute), Protobuf schema normalization in the Schema Registry, SASL/PLAIN authentication, unified Console+cluster identity with fine-grained RBAC, and FluxCD removal for Kubernetes deployments.
Tier-3 borderline-on-scope: vendor launch post, but GA feature disclosure is architecturally substantive — retires two prior wiki caveats (snapshot-expiry ownership, Iceberg-spec-schema-evolution path) and canonicalises three new concepts + two new patterns. Architecture density ~40% on ~1,900-word body. Caveats: vendor framing throughout; "first in industry" unqualified; DLQ operational surface under-specified; transactional-write isolation level unstated; tunable workload management knob name / default / range not disclosed.
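The broker-owned snapshot-expiry loop can be sketched as a retention filter over Iceberg table snapshots. Field names and the retention-window policy shape are assumptions (the post does not disclose the knob's name, default, or range); the one invariant modeled here — never expire the current snapshot — is standard Iceberg snapshot-expiry behaviour:

```python
# Sketch of a broker-owned metadata-GC pass: drop snapshots older
# than a retention window, always keeping the latest so readers
# never lose the current table state. Illustrative only.
from dataclasses import dataclass

@dataclass
class Snapshot:
    snapshot_id: int
    timestamp_ms: int

def expire_snapshots(snapshots: list[Snapshot], now_ms: int,
                     retention_ms: int) -> list[Snapshot]:
    """Return the snapshots to retain. (In a real implementation,
    deleting data files unreferenced by any kept snapshot follows.)"""
    if not snapshots:
        return []
    latest = max(snapshots, key=lambda s: s.timestamp_ms)
    cutoff = now_ms - retention_ms
    return [s for s in snapshots
            if s.timestamp_ms >= cutoff or s is latest]
```

Note what this retires and what it doesn't: metadata GC of expired snapshots is covered, but compaction of the small data files those snapshots reference is a separate job — matching the entry's "small-file compaction ownership remains open" caveat.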

  • 2025-04-03 — Autonomy is the future of infrastructure — Alex Gallego's (founder/CEO) vision essay marking the $100M Series D + Redpanda Agents SDK preview launch. Frames the 20-year systems trajectory (single-node DB → managed SaaS → streaming/log substrate → Iceberg continuous-computation handshake → agent orchestration). Canonicalises Redpanda's founding premise "the truth is the log" (Kleppmann 2015), the send-model-to-data enterprise-AI thesis, the batch/streaming convergence framing, and the frontier-model + local-GPU-minion hybrid. Centerpiece: canonical founder-voice retrospective statement of Data Plane Atomicity as BYOC's central design tenet — "no deployment should be able to bring down any other deployment, including a control plane outage... No externalized consensus algorithm, secret managers, no external databases, no external offset recording service, or metadata look up as you are trying to write your data durably to disk." Reframes MCP from tool-description format to centralised integration proxy, with dynamic Redpanda Connect pipeline filtering (Bloblang + Starlark) as the future fine-grain-ACL mechanism. Introduces three new Redpanda systems on the wiki: systems/redpanda-byoc, systems/redpanda-agents-sdk, plus extends systems/redpanda-connect. Operational numbers: ~300 connectors, ~10× price-performance for fine-tuned small models, single-GPU inference for Llama3/Gemma3/DeepSeekV3/Phi-4, three-cloud BYOC (AWS/GCP/Azure) preview scope. Tier-3 borderline-on-scope; founder-voice vision essay + product-launch hybrid; architecture density ~50% concentrated on Data Plane Atomicity tenet + MCP-as-proxy reframing + log-as-truth founding premise.
  • 2025-03-18 — 3 powerful connectors for real-time change data capture — Product-altitude tour of Redpanda Connect's four CDC input connectors (postgres_cdc, mysql_cdc, mongodb_cdc, gcp_spanner_cdc), each riding on the source database's native change log: Postgres logical replication + replication slot / MySQL binlog with external offset cache / MongoDB change streams + oplog / Spanner change streams with transactional offset storage and dynamic partition split/merge handling. Canonicalises parallel snapshot of a single large table or collection as the Redpanda differentiator vs stock Debezium: "Debezium (Kafka Connect) does not do this today." Ships the parallel-snapshot capability in the Postgres and MongoDB connectors; MySQL and Spanner connectors don't have it at publication. MySQL CDC topology scope explicitly limited (no GTID, no Group Replication, no multi-source). Second canonical wiki instance of CDC driver ecosystem — from the consumer-side (Redpanda Connect writes drivers against every source database's native CDC API), bracketing the Vitess-VStream emitter-side instance already canonicalised. Tier-3 on-scope on engine-mechanism canonicalisation grounds; architecture density ~60% of a feature-tour post; four new canonical concept pages + one new system page + one sub-concept pattern extension. Enterprise-license-gated in Redpanda Cloud + Self-Managed.
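The parallel-snapshot differentiator reduces to carving one large table's key space into ranges that multiple workers scan concurrently. A minimal sketch assuming a contiguous numeric primary key — real connectors additionally fence each chunk against the live change stream so the snapshot and CDC tail stitch together consistently, which this omits:

```python
# Split a table's primary-key range into contiguous [lo, hi) chunks,
# one per snapshot worker, covering every key exactly once.

def split_key_range(min_key: int, max_key: int,
                    workers: int) -> list[tuple[int, int]]:
    """Return half-open [lo, hi) ranges covering [min_key, max_key]."""
    total = max_key - min_key + 1
    base, extra = divmod(total, workers)
    ranges, lo = [], min_key
    for i in range(workers):
        hi = lo + base + (1 if i < extra else 0)  # spread the remainder
        ranges.append((lo, hi))
        lo = hi
    return ranges
```

Each range then becomes an independent `SELECT ... WHERE pk >= lo AND pk < hi` scan, which is the parallelism stock Debezium's single-threaded initial snapshot lacks per the post's claim.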
  • 2025-02-11 — High availability deployment: Multi-region stretch clusters — Part four of Redpanda's HA/DR series. Canonicalises the multi-region stretch cluster as the RPO=0 shape (single Redpanda cluster spans regions; per-partition Raft quorum on every write; automatic leader re-election on region loss). Positions it on the consistency-vs-availability axis against MirrorMaker2 async two-cluster replication (non-zero RPO, per-cluster availability). Canonicalises four operator knobs for cross-region cost mitigation: leader pinning (enterprise feature; bias leadership to client-proximal region), acks=1 (producer durability relaxation), follower fetching (KIP-392 closest-replica consume), remote read replica topic (object-storage-backed read-only mirror cluster). Publishes a three-broker Ansible hosts.ini template with region-as-rack (rack=us-west-2, rack=us-east-2, rack=eu-west-2) and an OMB + tc-inter-broker-latency-injection simulation technique for multi-region performance testing without paying cross-region cloud bandwidth. Current limitation: Self-Managed on K8s is multi-AZ only; multi-region stretch is available on VMs / bare metal / cloud compute / Redpanda Cloud.
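The region-as-rack inventory shape can be sketched as follows. Only the three rack values are taken from the post; the IPs, SSH user, and group name are placeholders, not the published template:

```ini
# Sketch of a three-broker region-as-rack Ansible inventory.
# rack values are the post's; everything else is a placeholder.
[redpanda]
10.0.1.10 ansible_user=ubuntu rack=us-west-2
10.0.2.10 ansible_user=ubuntu rack=us-east-2
10.0.3.10 ansible_user=ubuntu rack=eu-west-2
```

The trick is that Redpanda's rack-awareness machinery needs no multi-region-specific feature: declaring each region as a rack makes the existing rack-spread replica placement span regions, which is what turns a single cluster into the RPO=0 stretch shape.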
  • 2025-01-21 — Implementing the Medallion Architecture with Redpanda — Pedagogy-altitude explainer on Databricks' three-tier Bronze/Silver/Gold data-lake pattern, positioning Redpanda's Iceberg topics as the mechanism that makes the streaming broker serve as the Bronze layer of a lakehouse without any external ETL (Airflow / Kafka Connect / Redpanda Connect). Canonicalises concepts/medallion-architecture, concepts/data-lakehouse, concepts/iceberg-topic, concepts/open-file-format on the wiki. Names Flink's Iceberg sink connector as the mechanism for real-time Bronze→Silver→Gold transitions (patterns/stream-processor-for-real-time-medallion-transitions). Tier-3 pedagogy altitude; no production numbers; no compaction-ownership / commit-cadence latency numbers.
  • 2024-11-26 — Batch tuning in Redpanda to optimize performance (part 2) — James Kinley's operations-manual companion to part 1. Canonicalises four Prometheus private metrics (vectorized_storage_log_written_bytes, vectorized_storage_log_batches_written, vectorized_scheduler_queue_length, redpanda_cpu_busy_seconds_total) + five PromQL one-liners + the 4 KB NVMe-alignment batch-size floor + the write-caching broker feature + a real customer case study showing p99 128 ms → 17 ms and 2-cluster → 1-cluster consolidation at ~2.2× per-cluster throughput.
  • 2024-11-19 — Batch tuning in Redpanda for optimized performance (part 1) — James Kinley's first-principles explainer on producer-side batching. Canonicalises the fixed-vs-variable request-cost framing, the linger.ms / batch.size / buffer.memory trigger logic, and the seven-factor effective-batch-size framework.
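The two batch-tuning posts share one core mechanism that a short sketch makes concrete: a producer batch flushes when it reaches batch.size bytes or when linger.ms elapses after the first append, whichever comes first, and part 2's 4 KB NVMe-alignment floor means a sub-4 KB batch pays for a full aligned write anyway. Default values below are the standard Kafka client defaults, used as assumptions:

```python
# Producer-side batch flush trigger (part 1) plus the 4 KB
# NVMe-alignment floor (part 2). All numbers are illustrative.

NVME_ALIGN = 4096  # batches below ~4 KB waste part of an aligned write

def should_flush(batch_bytes: int, elapsed_ms: float,
                 batch_size: int = 16384, linger_ms: float = 5.0) -> bool:
    """Kafka-client-style trigger: size-full OR linger expired."""
    return batch_bytes >= batch_size or elapsed_ms >= linger_ms

def wasted_alignment_bytes(batch_bytes: int) -> int:
    """Bytes of the 4 KB-aligned write not carrying payload."""
    return (-batch_bytes) % NVME_ALIGN
```

This is why the series treats linger.ms and batch.size as a joint dial rather than independent knobs: raising linger.ms only helps until batches routinely hit batch.size first, and shrinking batches below the 4 KB floor buys no write-path savings at all.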
Last updated · 470 distilled / 1,213 read