Redpanda¶
Redpanda (originally Vectorized, rebranded 2021) is a streaming-platform company whose flagship product is a C++ rewrite of a Kafka-API-compatible broker built on the thread-per-core Seastar framework. The company blog (redpanda.com/blog) covers a mix of product announcements, benchmarks, tutorials, and occasional first-principles technical explainers on streaming-broker internals.
Tier classification¶
Tier 3 on the sysdesign-wiki. The blog is a mix of:
- Product PR and launch announcements ("Redpanda 24.3 extends...", "Announcing Redpanda Cloud") — skip unless they disclose real architectural content.
- Consultative / industry tutorials ("What is real-time data processing", "Create a real-time analytics pipeline") — skip unless they cover distributed-systems internals, scaling trade-offs, or production incidents.
- First-principles substrate explainers (e.g. the batch-tuning series by James Kinley) — ingest; these are the Kafka-API substrate posts that the wiki's Redpanda + Kafka coverage depends on.
- Company-culture / hackathon / sales posts — skip.
Apply the generic tier-3 filter: skip unless the post explicitly covers distributed-systems internals, scaling trade-offs, infrastructure architecture, production incidents, or storage / networking / streaming design.
Key systems¶
- systems/redpanda — the streaming broker itself. Kafka-API-compatible, C++ / Seastar / Raft.
- systems/redpanda-shadowing — 25.3 broker-native cross-region DR feature: byte-for-byte, offset-preserving hot-standby clone of a source cluster in a second region. Replaces MirrorMaker2 and the prior Redpanda Migrator for Redpanda-to-Redpanda DR. RPO/RTO in seconds, bounded by client-timeout settings.
- systems/redpanda-cloud-topics — GA in Redpanda Streaming 26.1 (2026-03-30 deep-dive; beta in 25.3 preview): a per-topic, object-storage-native topic class within a single cluster: metadata via Raft in-broker (through a placeholder batch), data straight to S3 / ADLS / GCS. Write path uses a Cloud Topics Subsystem that batches records across all partitions and topics for a short window ("e.g., 0.25s or 4 MB") into a single L0 file, then a background Reconciler rewrites L0 into per-partition, offset-sorted L1 files optimised for historical reads. Read path branches on a per-partition Last Reconciled Offset. Eliminates cross-AZ replication bandwidth cost for latency-tolerant workloads (observability, compliance, model training). Positioned against Confluent's Kora + WarpStream multi-cluster shape. Canonicalises patterns/object-store-batched-write-with-raft-metadata + patterns/background-reconciler-for-read-path-optimization.
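The write path in this bullet — aggregate across partitions until a size/time threshold, single-PUT an L0 object, then emit a metadata-only placeholder per record — can be sketched as a toy model (class, field, and threshold names are illustrative, not Redpanda's interfaces; the Raft-replication and ack steps are elided):

```python
import time

L0_MAX_BYTES = 4 * 1024 * 1024   # the "e.g., 0.25s or 4 MB" window from the post
L0_MAX_AGE_S = 0.25

class L0Aggregator:
    """Toy cross-partition aggregator: one L0 object per flush window."""

    def __init__(self, put_object):
        self.put_object = put_object   # single PUT to object storage (a callable here)
        self.buffer, self.size, self.opened = [], 0, time.monotonic()

    def append(self, topic, partition, payload):
        """Buffer a record; returns placeholder batches on flush, else None."""
        self.buffer.append((topic, partition, payload))
        self.size += len(payload)
        if self.size >= L0_MAX_BYTES or time.monotonic() - self.opened >= L0_MAX_AGE_S:
            return self.flush()
        return None

    def flush(self):
        if not self.buffer:
            return None
        object_key = self.put_object(b"".join(p for _, _, p in self.buffer))
        # One metadata-only placeholder per buffered record: only these pointers
        # would travel through each partition's Raft log; the payload bytes
        # live solely in the uploaded object.
        placeholders, byte_off = [], 0
        for topic, partition, payload in self.buffer:
            placeholders.append({"topic": topic, "partition": partition,
                                 "l0_object": object_key,
                                 "byte_offset": byte_off, "length": len(payload)})
            byte_off += len(payload)
        self.buffer, self.size, self.opened = [], 0, time.monotonic()
        return placeholders
```

The point of the shape: the cross-AZ write amplification is paid only on the small placeholder, not the payload.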
- systems/redpanda-operator — the Kubernetes Operator for Redpanda cluster lifecycle management. As of v25.1.x (May 2025) a single unified operator serving both Redpanda Cloud (internal fleet) and customer Self-Managed deployments (patterns/unified-operator-for-cloud-and-self-managed); canonical wiki instance of a vendor retreating from a bundled-GitOps-dependency (FluxCD) and adopting a version-aligned compatibility scheme (operator version = Redpanda core version).
- systems/redpanda-connect — the ~300-connector Kafka-Connect-alternative integration layer, open-sourced from Benthos. Canonical MCP-tool surface as of 2025-04-03 via rpk connect mcp-server.
- systems/redpanda-connect-dynamic-plugins — Beta (2025-06-17, v4.56.0, Apache 2.0) dynamic-plugin framework in Redpanda Connect: plugins run as separate OS subprocesses communicating with the host over gRPC on a Unix domain socket, breaking the previous Go-only, compile-into-the-binary constraint. Go and Python SDKs at launch; canonical wiki instance of patterns/grpc-over-unix-socket-language-agnostic-plugin and the patterns/compiled-vs-dynamic-plugin-tradeoff.
- systems/redpanda-connect-oracle-cdc — sixth-engine Oracle CDC input (oracledb_cdc) in Redpanda Connect v4.83.0 (2026-04-09, enterprise-gated). Rides on Oracle LogMiner; canonical wiki instance of in-source checkpointing (fourth offset-durability class), precision-aware NUMBER mapping via ALL_TAB_COLUMNS + Schema Registry, and Oracle Wallet auth (canonical first wiki instance of a file-based credential store). Completes the Redpanda Connect CDC family to six source-database engines (Postgres / MySQL / MongoDB / Spanner / MSSQL / Oracle). Competitive foil: single Go binary vs JVM + Kafka Connect cluster + Debezium.
- systems/redpanda-byoc — Bring Your Own Cloud deployment model. Data plane runs inside the customer's VPC; Redpanda operates the control plane. Canonical tenet: Data Plane Atomicity — no runtime dependency on externalised services in the write path.
- systems/redpanda-cloud — Dedicated managed-cluster peer to BYOC: Redpanda runs both control plane and data plane inside Redpanda's infrastructure on the customer's chosen hyperscaler (AWS/GCP/Azure). Canonical wiki instance of cell-based architecture at the streaming-broker altitude — each customer gets an isolated cluster with no external-metadata critical-path dependencies. 99.99% availability SLA / ≥99.999% measured SLO on multi-AZ; replication-factor ≥3 enforced; NVMe-primary + object-storage-tiered secondary; redundant Kafka API / Schema Registry / HTTP Proxy; feedback-control-loop-monitored phased rollouts; continuous chaos + load testing; one customer-elected critical-path exception — GCP Private Service Connect (or AWS PrivateLink equivalent). Canonical production-incident retrospective: sources/2025-06-20-redpanda-behind-the-scenes-redpanda-clouds-response-to-the-gcp-outage|2025-06-12 GCP-outage retrospective — fleet of "hundreds of clusters" survived the global GCP outage with a single materially-affected cluster (staging in us-central-1, ~2h replacement-node latency).
- systems/redpanda-agents-sdk — 2025-04-03 preview toolkit for enterprise AI agents: Python SDK (durable execution, OpenTelemetry, Pydantic/OpenAI-agents ergonomics) + rpk connect mcp-server + rpk connect agent. "Ruby-on-Rails experience for agents."
- systems/redpanda-agentic-data-plane — 2025-10-28 productization (Gallego founder-voice announcement) packaging Redpanda streaming + Oxla query engine + systems/redpanda-connect 300+ connectors + a new global governance/observability layer as "a unified runtime and control plane that safely exposes enterprise data to AI agents". First shipped governance feature: Remote MCP + authentication + authorization for OBO workloads with IdP integration (patterns/on-behalf-of-agent-authorization).
- systems/oxla — acquired 2025-10-28: C++ distributed query engine with PostgreSQL wire protocol, separated compute-storage, and Iceberg-native workload targeting. rpk oxla CLI integration. Early preview mid-December 2025. Positioned as the SQL-filter-then-model-summarize substrate for agent context management.
- systems/redpanda-iceberg-topics — topic-level integration with Apache Iceberg: a single logical entity that is both a Kafka-protocol topic and an Iceberg table. GA in Redpanda 25.1 (2025-04-07) across AWS, Azure, and GCP with nine disclosed GA-grade properties (custom hierarchical bucketed partitioning, built-in DLQ, Iceberg-spec schema evolution, automatic snapshot expiry, REST catalog sync via OIDC+TLS, transactional writes, automatic table discovery, object-store catalog fallback, tunable workload management). Canonical wiki instance of the streaming-broker-as-lakehouse-Bronze-sink + broker-native catalog registration patterns; Bronze tier of a Medallion-architected lakehouse without external ETL.
- systems/openmessaging-benchmark — the open-source benchmark framework Redpanda uses (with tc-injected inter-broker latency) for multi-region stretch-cluster performance testing.
- systems/redpanda-connect — Redpanda's Kafka-Connect alternative integration layer, shipping a family of per-engine CDC input connectors (postgres_cdc, mysql_cdc, mongodb_cdc, gcp_spanner_cdc) as the flagship source class. Canonical wiki differentiator vs Debezium: parallel snapshot of a single large table.
- systems/openssl — the validated cryptographic module substrate Redpanda embeds for broker-level FIPS 140 compliance via the redpanda-fips + redpanda-rpk-fips packages (self-managed RPM / Debian only at publication).
Key patterns / concepts¶
- concepts/offset-preserving-replication — new canonical (2025-11-06, Shadowing): cross-cluster replication where the destination holds source-identical per-partition offsets. Removes MM2's per-consumer-group offset-translation map as a DR critical-path dependency.
- concepts/broker-internal-cross-cluster-replication — new canonical (2025-11-06, Shadowing): replication implemented inside the broker rather than via a separate Kafka Connect cluster. The structural property that enables offset preservation.
- concepts/cross-az-replication-bandwidth-cost — new canonical (2025-11-06, Cloud Topics): the per-byte cloud cross-AZ egress cost that multi-AZ Raft replication pays on every acknowledged write; the cost axis Cloud Topics eliminates for latency-tolerant topics by routing data directly to object storage.
- concepts/latency-critical-vs-latency-tolerant-workload — new canonical (2025-11-06, Cloud Topics): workload-class distinction motivating per-topic storage tiering — latency-critical (payments / trading / cybersecurity) vs latency-tolerant (observability / compliance / model training).
- patterns/offset-preserving-async-cross-region-replication — new canonical pattern (2025-11-06, Shadowing): three-property composition (async + offset-preserving + broker-internal) that Shadowing instantiates.
- patterns/hot-standby-cluster-for-dr — new canonical pattern (2025-11-06, Shadowing): the continuously-up, fully-replicated DR secondary pattern Shadowing realises for the streaming-broker substrate.
- patterns/per-topic-storage-tier-within-one-cluster — new canonical pattern (2025-11-06, Cloud Topics): per-topic selectable storage substrate in a single cluster, replacing the prior per-cluster Kafka-ecosystem shape.
- concepts/placeholder-batch-metadata-in-raft — new canonical (2026-03-30, Cloud Topics architecture deep-dive): the metadata-only record replicated through the per-partition Raft log carrying only an object-storage pointer. What preserves Kafka transaction/idempotency semantics while payload bytes live in S3. "The data payload lives in the cloud, but the guarantees live in Redpanda."
- concepts/l0-l1-file-compaction-for-object-store-streaming — new canonical (2026-03-30): the two-tier object-storage file layout — L0 (ingest-optimised, cross-partition, ≤4 MB / 0.25 s) → Reconciler → L1 (read-optimised, per-partition-colocated, offset-sorted, much larger).
- concepts/last-reconciled-offset — new canonical (2026-03-30): per-partition watermark routing reads between L0 and L1. Single-integer comparison collapses the read-routing decision.
- patterns/object-store-batched-write-with-raft-metadata — new canonical pattern (2026-03-30): the write-path shape Cloud Topics instantiates — aggregate in-memory across partitions, single-PUT to object storage, then Raft-replicate a placeholder batch per involved partition, then ack.
- patterns/background-reconciler-for-read-path-optimization — new canonical pattern (2026-03-30): background process continuously rewriting ingest-optimal L0 files into read-optimal L1 files. Read path uses a single per-partition watermark to pick between them. Analogous to LSM-tree compaction at object-storage-file granularity.
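The Last Reconciled Offset routing above is literally one integer comparison per fetch; a minimal sketch (whether the watermark offset itself is served from L1 is this sketch's assumption):

```python
def route_fetch(fetch_offset: int, last_reconciled_offset: int) -> str:
    """Offsets at or below the per-partition Last Reconciled Offset are
    served from read-optimised L1 files; anything newer is still in
    ingest-optimised L0 files."""
    return "L1" if fetch_offset <= last_reconciled_offset else "L0"
```

Historical replay therefore hits L1's colocated, offset-sorted files; tailing consumers fall through to L0.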
- concepts/autonomy-enterprise-agents — Gallego's 2025-04 founder-voice framing of enterprise autonomy: code-in-control of end-to-end flow vs explicit code paths. Productized as ADP 2025-10-28.
- concepts/governed-agent-data-access — new canonical (2025-10-28): two-axis design surface (access controls + observability) as the primary CIO-facing design concern for enterprise agent deployment. Gallego's inversion of typical agent-product marketing: "The fear from CIOs is not the code of the agent itself, it is governance."
- patterns/mcp-as-centralized-integration-proxy — canonical wiki pattern for MCP as intent-based integration proxy in front of enterprise data systems. Canonicalised 2025-04; extended 2025-10-28 with governance enforcement at the proxy tier.
- patterns/on-behalf-of-agent-authorization — new canonical (2025-10-28): MCP server proxies tool calls carrying the caller's identity + scoped consent, not a shared agent-service token. Fixes "the API era of root-token permissions, with all-or-nothing as the norm" failure mode.
- concepts/iceberg-topic — the broker-native streaming-to-Iceberg primitive underlying Redpanda Iceberg Topics. GA in Redpanda 25.1 across AWS/Azure/GCP.
- concepts/iceberg-catalog-rest-sync — OIDC+TLS sync to Iceberg REST catalogs (Snowflake Open Catalog, Databricks Unity, AWS Glue) canonicalised by the Redpanda 25.1 GA release.
- concepts/iceberg-snapshot-expiry — automatic metadata-GC loop internalised as a broker-owned feature at GA.
- concepts/kafka-consumer-lag-metric — native Prometheus-exposed consumer group lag metric in 25.1, replacing a previously documented PromQL compute.
- patterns/streaming-broker-as-lakehouse-bronze-sink — the canonical architectural pattern Iceberg Topics instantiate.
- patterns/broker-native-iceberg-catalog-registration — the catalog-registration pattern canonicalised at GA; the mechanism behind the "zero-ETL" streaming-to-lakehouse integration framing.
- patterns/dead-letter-queue-for-invalid-records — broker-level validation + DLQ redirect for schema-invalid records; built-in on Iceberg Topics at GA.
- patterns/batch-over-network-to-broker — the canonical Kafka-API producer batching pattern. Redpanda shares the implementation via Kafka-client libraries.
- concepts/effective-batch-size — the seven-factor framework for effective-batch-size in production, canonicalised on the wiki from Redpanda's first-principles batch-tuning explainer.
- concepts/sticky-partitioner — Kafka-client partitioner behaviour.
- concepts/fixed-vs-variable-request-cost — the substrate economics behind batching.
- concepts/batching-latency-tradeoff — the normal-vs-saturated regime batch-latency-throughput trade-off.
- concepts/producer-backpressure-batch-growth — why saturated brokers cause bigger producer batches.
- concepts/broker-effective-batch-size-observability — the four Prometheus private metrics + PromQL cookbook for broker-side effective-batch-size measurement (canonicalised from the part-2 operations-manual explainer).
- concepts/small-batch-nvme-write-amplification — why the target effective-batch-size floor is 4 KB (NVMe page alignment).
- concepts/broker-write-caching — Redpanda's broker-side ack-on-memory + background-flush feature, durability-equivalent to legacy Kafka. The escape hatch when client-side tuning is unavailable.
- concepts/per-topic-batch-diagnosis — aggregate cluster metrics hide per-topic tiny-batch offenders.
- patterns/prometheus-effective-batch-size-dashboard — the five-PromQL-query Grafana dashboard for effective-batch-size operations.
- patterns/iterative-linger-tuning-production-case — the three-round linger-tuning playbook, canonicalised from a real Redpanda Cloud BYOC customer case study.
- patterns/broker-write-caching-as-client-tuning-substitute — when to enable write caching instead of tuning producers.
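A back-of-envelope model of the effective-batch-size concepts above (toy arithmetic, not Redpanda tooling; the 4 KB floor is the NVMe-page figure cited above, and the unsaturated-regime formula is this sketch's simplification):

```python
NVME_FLOOR_BYTES = 4 * 1024   # target effective-batch-size floor (NVMe page alignment)

def effective_batch_bytes(produce_rate_bps: float, linger_s: float,
                          batch_size_limit: int) -> float:
    """Normal (unsaturated) regime: a batch closes when linger expires or
    when it reaches the configured size limit, whichever comes first."""
    return min(batch_size_limit, produce_rate_bps * linger_s)

def meets_nvme_floor(produce_rate_bps: float, linger_s: float,
                     batch_size_limit: int) -> bool:
    """Check the resulting batch against the small-batch write-amplification floor."""
    return effective_batch_bytes(produce_rate_bps, linger_s,
                                 batch_size_limit) >= NVME_FLOOR_BYTES
```

At 100 KB/s a 5 ms linger closes ~500-byte batches — below the floor — while 50 ms clears it; this is the shape of the iterative linger-tuning rounds.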
- concepts/multi-region-stretch-cluster — the single-cluster-across-regions HA shape Redpanda canonicalises via per-partition Raft quorum. RPO=0 against region loss.
- concepts/leader-pinning — enterprise-feature write-path locality dial on stretch clusters; pin leadership to client-proximal region.
- concepts/follower-fetching — consumer-side read-path locality via KIP-392 (rack-aware consumer).
- concepts/remote-read-replica-topic — object-storage-backed read-only mirror on a separate cluster, scaling read fan-out without loading origin brokers.
- concepts/mirrormaker2-async-replication — the async two-cluster alternative to stretch-cluster replication; non-zero RPO.
- concepts/cross-region-bandwidth-cost — the per-byte cross-region cloud egress hazard on stretch-cluster replication.
- patterns/multi-region-raft-quorum — the canonical pattern name for Raft-across-regions synchronous replication.
- patterns/client-proximal-leader-pinning — the pattern leader pinning realises.
- patterns/closest-replica-consume — the pattern follower fetching realises.
- patterns/tc-latency-injection-for-geo-simulation — the simulation technique Redpanda uses for stretch-cluster benchmarks.
- concepts/partition-skew-data-skew — Amdahl's Law for streaming; the canonical framing for why more partitions don't help when a single key dominates.
- concepts/keyed-partitioner — the hash-based partitioner that preserves per-key ordering; use only when strictly required (CDC / per-entity ordering).
- patterns/high-cardinality-partition-key — the operational pattern for keyed partitioning that avoids skew.
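The skew argument above can be made concrete: under a hash-based keyed partitioner, a dominant key pins its entire volume to one partition no matter how many partitions exist. A minimal sketch (zlib.crc32 stands in for the client's actual keyed hash, e.g. murmur2 in the Java client; the traffic mix is invented):

```python
import zlib
from collections import Counter

def keyed_partition(key: bytes, num_partitions: int) -> int:
    # stand-in hash; the real client hash doesn't change the argument
    return zlib.crc32(key) % num_partitions

def partition_load(keys, num_partitions):
    """Records per partition for a stream of record keys."""
    return Counter(keyed_partition(k, num_partitions) for k in keys)

# invented mix: 90% of records carry one hot key, 10 cold keys share the rest
keys = [b"hot-tenant"] * 90 + [b"key-%d" % i for i in range(10)]
for n in (8, 64):
    hottest = max(partition_load(keys, n).values())
    assert hottest >= 90   # adding partitions never dilutes the hot key
```

This is the Amdahl framing from the skew concept above: the serial fraction (the hot key) bounds speedup regardless of partition count; the fix is a higher-cardinality key, not more partitions.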
- concepts/consumer-fetch-tuning — the four-parameter consumer-side throughput-vs-latency tuning axis (fetch.min.bytes, fetch.max.wait.ms, max.partition.fetch.bytes, max.poll.records).
- concepts/offset-commit-cost — save-button analogy; commits transform consume workload into consume + produce; auto.commit.interval.ms ≥ 1 s rule; commit-frequency-as-RPO link.
- concepts/compression-codec-tradeoff — ZSTD / LZ4 as the canonical sweet-spot codecs; compression ratio scales with batch size.
- patterns/client-side-compression-over-broker-compression — compress on the client, not the broker; the compression.type=producer topic config preserves the opaque-byte invariant.
- concepts/compression-compaction-cpu-cost — the broker must decompress + recompress on every compaction pass when compacted topics are compressed; prefer LZ4 over ZSTD in that case.
- concepts/tiered-storage-fast-decommission — orders-of-magnitude faster decommission / recommission because the cold bulk of a broker's partitions already lives in object storage.
- concepts/fips-cryptographic-boundary — the compliance primitive Redpanda instantiates at the broker-startup / validated-module altitude as of 2025-05-20.
- concepts/fips-140-validated-cryptographic-module — the substrate-altitude primitive Redpanda embodies via OpenSSL 3.0.9 (FIPS 140-2) / 3.1.2 (FIPS 140-3) validated modules shipped in the redpanda-fips + redpanda-rpk-fips packages.
- concepts/fips-mode-tri-state — the disabled / enabled / permissive fips_mode config dial that gates entry to the boundary.
- patterns/startup-enforcement-fail-fast-on-config-noncompliance — "Redpanda will log an error and exit if the underlying operating system isn't properly configured." The FIPS enforcement shape; no silent downgrade.
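A sketch of the fail-fast startup-enforcement shape quoted above (all names are illustrative, and the permissive-mode warn-and-continue behaviour is this sketch's assumption):

```python
import sys

VALID_MODES = ("disabled", "enabled", "permissive")

def enforce_fips_at_startup(fips_mode: str, os_fips_enabled: bool) -> None:
    """Fail-fast shape: in enabled mode a non-compliant host is a startup
    error — log and exit — never a silent downgrade."""
    if fips_mode not in VALID_MODES:
        raise ValueError(f"unknown fips_mode: {fips_mode!r}")
    if fips_mode == "enabled" and not os_fips_enabled:
        print("ERROR: fips_mode=enabled but the OS is not FIPS-configured",
              file=sys.stderr)
        sys.exit(1)   # log an error and exit
    if fips_mode == "permissive" and not os_fips_enabled:
        # assumption: permissive warns and continues rather than exiting
        print("WARNING: permissive fips_mode on a non-FIPS-configured host",
              file=sys.stderr)
```

The structural point is the gate's placement: compliance is checked once, at process start, before any cryptographic boundary is entered.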
- concepts/cell-based-architecture — canonical streaming-broker instance of the cells-with-no-external-metadata critical-path reliability pattern; Redpanda's 2025-06-21 retrospective cites AWS Well-Architected as the named pattern and positions the absent-externalisation property against "other products boasting centralized metadata and a diskless architecture [that] likely experienced the full weight of this global outage."
- concepts/butterfly-effect-in-complex-systems — the non-linearity premise the 2025-06-21 retrospective uses to frame the 2025-06-12 GCP outage: "complex systems are characterized by their non-linear nature, which means that observed changes in an output are not proportional to the change in the input."
- concepts/systems-thinking-for-reliability — the closing-prescription framing: control theory + six mitigation primitives (phased rollouts, feedback control loops, load shedding, backpressure, randomised retries, incident response) as the engineering substrate the butterfly-effect property demands.
- concepts/static-stability — applied at streaming-broker disk-reserve altitude: "we leave disk space unused and used-but-reclaimable (for caching), which we can reclaim if the situation warrants it" absorbs a tiered-storage-tier outage without spill-over into primary-path degradation.
- concepts/blast-radius — canonical framing for Redpanda Cloud's per-cluster cell-boundary containment.
- concepts/chaos-engineering — named as one of the seven SLA-supporting disciplines in the 2025-06-21 retrospective: "We continuously chaos-test and load-test Redpanda Cloud tiers' configurations."
- concepts/feedback-control-load-balancing — named as the mechanism Redpanda uses during phased rollouts: "As operations are issued … we try to close our feedback control loops by watching Redpanda metrics as the phased rollout progresses and stopping when user-facing issues are detected."
- patterns/preemptive-low-sev-incident-for-potential-impact — Redpanda's 19:08 UTC SEV4 decision on 2025-06-12 under degraded third-party alerting; canonical wiki instance.
- patterns/cell-based-architecture-for-blast-radius-reduction — the runtime-behaviour pattern that cell-based architecture + RF≥3 + AZ-spread produces when a provider-scale outage removes replacement VM capacity for hours.
- patterns/staged-rollout — Redpanda's named phased-rollout discipline with feedback-loop-monitored halt.
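The staged-rollout-with-feedback-halt discipline above can be sketched as a small control loop (a toy: a single error-rate metric stands in for "watching Redpanda metrics", and every name is hypothetical):

```python
def staged_rollout(stages, error_rate_after, slo=0.001):
    """Toy feedback-loop-monitored phased rollout: apply stage by stage,
    read a user-facing metric after each stage, halt on degradation."""
    completed = []
    for stage in stages:
        completed.append(stage)               # issue operations for this cell/wave
        if error_rate_after(stage) > slo:     # close the feedback loop
            return completed, "halted"        # stop before touching more cells
    return completed, "complete"
```

Combined with cell boundaries, a halt after wave N caps the blast radius at the cells already touched.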
Recent articles¶
- 2026-04-14 — Openclaw
is not for enterprise scale — Redpanda unsigned
rhetorical-voice governance essay (~1,200 words) arguing
that Claude-Code-class local coding agents ("Openclaw"
category stand-in) work for personal dev laptops but fail
at enterprise scale because the sandbox doesn't solve the
underlying credential-holding, audit, and egress-control
problems. Opens with a HackerNews comment re-framing the
sandbox-for-agents problem as "giving your dog a stack of
important documents, then being worried he might eat them,
so you put the dog in a crate, together with the
documents" — a memorable framing the post carries through
as its architectural thesis. Load-bearing canonicalisation:
the closing formula
Gateway + Audit trail + Token vault + Sandboxed compute
= Agents in production as the minimum architectural
bar for enterprise agent deployment. Each component solves a
failure mode the others can't:
(1) Gateway (
central proxy choke point) — single choke point for agent
egress, observability, rate limits, kill switch — "turn it
off for a single service or set of services for your entire
digital workforce at once".
(2) Audit log + transcripts — "why and how, not just
what", with "inputs, outputs, tool calls, token usage, and
the agent's reasoning chain" captured; adds agentic
performance review as a new use case for
the
durable event log audit envelope.
(3) Token vault (new canonical
concept) — out-of-band credential broker that mints
short-lived scoped tokens per operation. The agent never
holds the credentials; "Don't give the dog your keys."
Canonical OBO substrate for user-auth-only systems
(Salesforce, ServiceNow) — "You can't build a real
multi-tenant agent without this."
(4)
Sandboxed compute with gateway-only egress (new canonical
pattern) — sandboxes are "right" (LLMs need Unix
composability for tool-output post-processing) provided
egress is choke-pointed at the gateway and auth comes from
out-of-band agent-identity metadata, not files inside the
sandbox. Redpanda-specific mechanism:
agi CLI (new canonical
system) — "agentic gateway interface", a dynamic
self-describing CLI inside the sandbox that mediates
agent→gateway calls while preserving Unix-workflow
composability. "Yes, the name is a play on that AGI."
First wiki mention; demonstration-altitude, not shipping
product.
Threat-model-at-scale argument: "If you're a developer
running it on a dedicated machine with limited access and
scope, the threat model is manageable [...] The problem
shows up when organizations try to scale that model. When
the IT team decides 'just run it in a VM' for each
department. When someone decides the sandbox is sufficient
governance for production use. It isn't." Canonicalises
sandbox-adequate-for-personal-use-breaks-at-enterprise-scale as the structural argument for the four-component
stack.
4 new canonical pages: concepts/token-vault +
patterns/four-component-agent-production-stack +
patterns/agent-sandbox-with-gateway-only-egress +
systems/redpanda-agi-cli. Extends 6 pages:
patterns/central-proxy-choke-point (kill-switch added as
canonical choke-point capability; agent-workforce-scale
instance added);
patterns/agentic-access-control ("Don't give the dog
your keys" framing + token-vault substrate reinforcement);
patterns/on-behalf-of-agent-authorization (token-vault
named as OBO substrate for user-auth-only systems);
patterns/durable-event-log-as-agent-audit-envelope
(transcripts + A/B agent evaluation as new use cases);
concepts/audit-trail (transcripts + reasoning-chain as
why-and-how audit shape);
concepts/short-lived-credential-auth (per-operation
minting canonicalised via token-vault).
Tier-3 borderline include on pattern-crystallisation +
new-system grounds — zero production numbers, zero
mechanism depth on the four components, but crystallises
prior governance patterns into a quotable architectural
formula and introduces the agi CLI as a distinct system.
Cross-source continuity: sequel to
2025-10-28 ADP launch +
companion governance-framing post; safety-side companion
to
2025-04-03 Gallego autonomy essay; sibling to
2026-02-10 Akidau talk-recap (four-component stack
compresses six of Akidau's eight axes).
Caveats: rhetorical-voice essay not architecture
deep-dive; "Openclaw" is a product-family stand-in (not a
real product,
myclaw.ai is a rhetorical placeholder); token-vault protocol / software not named; agi CLI is a "demonstration", no repo / license / availability; kill-switch trigger UX not walked; sandbox escape + prompt injection explicitly out of scope.
- 2026-04-09 — Oracle CDC now available in Redpanda Connect — Redpanda unsigned launch post (~900 words) announcing the oracledb_cdc input in Redpanda Connect v4.83.0 (enterprise-gated). Adds Oracle as the sixth source-database engine in Redpanda's per-engine CDC family (Postgres / MySQL / MongoDB / Spanner / MSSQL / Oracle). Four load-bearing architectural disclosures: (1) rides on Oracle LogMiner — the Oracle Enterprise Edition redo-log-mining utility — canonicalised as concepts/oracle-logminer-cdc, sibling to Postgres logical replication / MySQL binlog / MongoDB oplog / Spanner change streams / SQL Server change tables. No additional Oracle licensing required beyond Enterprise Edition. (2) In-source checkpointing — "Restarts resume from a checkpointed position stored in Oracle itself, with no external cache required, no re-snapshot, and no gaps." Fourth canonical offset-durability class on the wiki alongside server-owned Postgres slots, consumer-owned external stores (MySQL / MongoDB), and transactional-row storage (Spanner). Oracle and Spanner both live inside the source DB but differ on atomicity with data — Spanner's progress commits transactionally with each row, Oracle's lives in a separate checkpoint table. (3) Precision-aware NUMBER mapping via Oracle's ALL_TAB_COLUMNS data-dictionary view — integers from NUMBER(p, 0) → int64, decimals from NUMBER(p, s) with s > 0 → json.Number. Composed with schema_registry_encode for typed Avro encoding in Schema Registry. Automatic mid-stream schema-drift detection: new columns detected automatically; dropped columns reflected after connector restart. Canonical seventh schema-evolution axis on the wiki (contrast the 2026-03-05 Iceberg-output registry-less axis — this one is registry-with-data-dictionary-as-source-of-truth). (4) Oracle Wallet auth — canonical first wiki instance of a file-based credential store. Two wallet formats: cwallet.sso (auto-login, no password) and ewallet.p12 (PKCS#12, password via the wallet_password config field, which is redacted from logs and config dumps). SSL enabled automatically.
Second canonical instance of Bloblang-interpolated multi-table routing — now at the CDC-source-to-topic-per-table position (first instance was the 2026-03-05 Iceberg-output sink-side): topic: ${! meta("table_name").lowercase() }. Competitive framing against Debezium on Kafka Connect verbatim: "No JVM, no Kafka Connect cluster, no separate workers. Just Redpanda Connect doing what it does best." 9 canonical new pages: source + 4 systems (redpanda-connect-oracle-cdc, oracle-database, oracle-logminer, oracle-wallet) + 4 concepts (oracle-logminer-cdc, in-source-cdc-checkpointing, precision-aware-type-mapping, file-based-credential-store). Extends 7 pages: concepts/change-data-capture (sixth engine + fourth offset-durability class), concepts/external-offset-store (fourth row added in comparison table), concepts/schema-evolution (seventh axis), patterns/cdc-driver-ecosystem (ecosystem now six engines), patterns/bloblang-interpolated-multi-table-routing (second instance at CDC-source position), systems/redpanda-connect (new Oracle CDC section + Seen-in), systems/debezium (named competitive foil). Tier-3 borderline include on vocabulary-canonicalisation grounds — fills gaps the prior five-engine CDC ingests left open. Zero production numbers (no throughput / latency / snapshot-duration figures; contrast 2025-11-06 MSSQL launch which disclosed ~40 MB/s vs ~14.5 MB/s). Undisclosed: LogMiner operational caveats (supplemental logging, archive-log rate, continuous-mining deprecation, primary-overhead); snapshot-boundary SCN mechanism; checkpoint-table name/schema/write-cadence; Oracle topology scope (RAC, Data Guard, Multitenant, Standard Edition); parallel-snapshot-of-large-table claim absent (vs 2025-03-18 Postgres + MongoDB differentiator); LOB / LONG / XMLTYPE / JSON-column handling; UPDATE/DELETE before-after-image semantics.
Cross-source continuity: sixth-engine extension of the 2025-03-18 CDC connectors post + 2025-11-06 25.3 MSSQL launch; auth/compliance-substrate companion to the 2025-05-20 FIPS post and 2026-03-05 Iceberg-output OAuth2 canonicalisation.
- 2026-04-02 — Supercharging Redpanda Streaming with profile-guided optimization — Redpanda engineering deep-dive (unsigned). Mechanism-level companion to the 2026-03-31 Redpanda 26.1 launch post's one-line PGO disclosure ("Profile-Guided Optimization (PGO) delivers 10-15% efficiency improvement on small message batches"). Unpacks the clang PGO two-phase compilation and LLVM BOLT post-link alternative (systems/llvm-bolt), framed by top-down microarchitecture analysis (TMA) via Linux perf stat --topdown. Measured wins on the canonical small-batch regression benchmark: ~50% p50 latency, up to 47% p999 latency, 15% CPU reactor utilisation reduction. TMA data verbatim: baseline 51% frontend-bound ("definitely on the higher end, even for database or distributed applications") reduced to 37.9% after PGO — with 6 percentage points moving to retiring (useful work) and 7 to backend-bound ("resolving one bottleneck often reveals the next"). PGO mechanisms: hot-cold splitting + basic-block reordering + profile-driven inlining, all targeting [[concepts/instruction-cache-locality|i-cache locality]]. BOLT heatmap visualisation confirms the hot-code-packed layout ("all hot functions are packed tightly at the start of the binary"). PGO vs BOLT trade-off: Redpanda evaluated both and chose PGO citing stability — "PGO is a proven and widely deployed technology ... outstanding BOLT bugs, we decided to stick with PGO." Disclosed LLVM bug llvm-project#169899 as the decisive datum — first wiki-canonical non-Meta BOLT brittleness disclosure, contrasting Meta's fleet-scale success via Strobelight → BOLT + CSSPGO. BOLT performance "similar to PGO. Most of the time, it came in just slightly behind"; combining both adds "another small bump in performance". Substrate: feedback-directed optimisation (FDO) family — canonicalised as umbrella. Instrumented vs sampling profile trade-off canonicalised. Composes with batching-under-saturation to explain the 15%-CPU → 47%-p999 amplification. Tier-3 on-scope decisively — unusual for Redpanda's launch-/marketing-heavy Tier-3 corpus; genuine engineering deep-dive with microarchitecture rigor, hardware-counter before/after data, and an explicit PGO-vs-BOLT trade-off analysis that discloses a concrete LLVM bug.
Cross-source continuity: mechanism-level companion to the 2026-03-31 26.1 launch post's one-line PGO bullet; extends BOLT coverage from Meta's fleet success (2025-03-07 Strobelight post) to the non-Meta brittleness perspective; sibling to patterns/measurement-driven-micro-optimization at the C++ binary-layout altitude (JVM / JDK-Vector-API sibling at the Java-vectorisation altitude). 9 new canonical pages (source + 6 concepts [PGO, LLVM BOLT post-link optimiser, TMA, frontend-vs-backend-bound, hot-cold splitting, instrumented vs sampling profile, i-cache locality, feedback-directed optimisation] + 3 patterns [PGO for frontend-bound application, TMA-guided target selection, feedback-directed optimisation fleet pipeline] + 2 systems [LLVM BOLT, Clang]) + 2 extensions (systems/meta-bolt-binary-optimizer with non-Meta brittleness disclosure; systems/redpanda with new 26.1 PGO section).
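The hot-cold splitting and basic-block reordering mechanisms named above can be illustrated with a toy layout model (invented function names, sizes, and call counts — not Redpanda's code): packing hot code contiguously shrinks the address span the instruction cache must cover.

```python
# Toy illustration of profile-guided layout for i-cache locality:
# ordering functions hottest-first packs hot code at the start of the
# "binary", shrinking the address span covering most executed bytes.

def layout(functions, order):
    """Assign start addresses to functions laid out in the given order."""
    addr, placed = 0, {}
    for name in order:
        placed[name] = addr
        addr += functions[name]["size"]
    return placed

def hot_span(functions, placed, fraction=0.99):
    """Smallest address range covering `fraction` of executed bytes,
    counting functions from hottest down (greedy approximation)."""
    total = sum(f["size"] * f["calls"] for f in functions.values())
    covered, lo, hi = 0, float("inf"), 0
    for name, f in sorted(functions.items(), key=lambda kv: -kv[1]["calls"]):
        covered += f["size"] * f["calls"]
        lo = min(lo, placed[name])
        hi = max(hi, placed[name] + f["size"])
        if covered >= fraction * total:
            break
    return hi - lo

funcs = {  # all numbers invented
    "parse_batch":  {"size": 4096, "calls": 900_000},  # hot
    "append_log":   {"size": 2048, "calls": 800_000},  # hot
    "init_config":  {"size": 8192, "calls": 1},        # cold
    "report_error": {"size": 8192, "calls": 10},       # cold
}
linker_order = ["init_config", "parse_batch", "report_error", "append_log"]
pgo_order = sorted(funcs, key=lambda n: -funcs[n]["calls"])  # hottest first

before = hot_span(funcs, layout(funcs, linker_order))
after = hot_span(funcs, layout(funcs, pgo_order))
assert after < before  # hot functions packed tightly at the start
```

Real PGO works on basic blocks inside functions as well, but the span arithmetic is the same intuition behind the BOLT heatmap quoted above.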
2026-03-30 — Under the hood: Redpanda Cloud Topics architecture — architecture deep-dive on Cloud Topics following its GA in Redpanda Streaming 26.1. First detailed public description of the five primitives that make Cloud Topics work: a Cloud Topics Subsystem that batches in-memory across all partitions/topics ("e.g., 0.25 seconds or 4 MB"), an L0 file uploaded as a single PUT to S3/GCS/ADLS, a placeholder batch replicated via Raft to each involved partition's log carrying only the object-storage pointer, a background Reconciler that rewrites L0 files into L1 files (per-partition, offset-sorted, much larger), and a per-partition Last Reconciled Offset watermark routing reads between L0 and L1. Three new canonical concept pages: concepts/placeholder-batch-metadata-in-raft, concepts/l0-l1-file-compaction-for-object-store-streaming, concepts/last-reconciled-offset. Two new canonical pattern pages: patterns/object-store-batched-write-with-raft-metadata, patterns/background-reconciler-for-read-path-optimization. Architectural canonicalisation: the log-as-truth framing, previously applied at agent-interaction altitude (2025-10-28), is now instantiated inside the broker's own storage architecture — the Raft log of pointers is truth, S3 bytes are addressable cache. Caveats: no absolute latency numbers, no net-cost quantification (eliminated cross-AZ cost replaced by PUT cost + Reconciler egress), no Reconciler placement disclosure, no failure-mode discussion, no cache architecture.
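The per-partition read-path branch described above can be sketched as follows (all types, keys, and offsets are invented; this is not Redpanda's API):

```python
from dataclasses import dataclass

# Sketch of the Cloud Topics read-path branch: offsets at or below the
# Last Reconciled Offset are served from offset-sorted, per-partition L1
# files; newer offsets resolve Raft-replicated placeholder batches that
# point into shared, multi-partition L0 objects.

@dataclass
class Placeholder:            # replicated via Raft in the partition's log
    base_offset: int
    l0_object: str            # object-store key of the shared L0 file
    byte_range: tuple         # (start, end) within the L0 object

class PartitionReader:
    def __init__(self, last_reconciled_offset, l1_files, placeholders):
        self.lro = last_reconciled_offset
        self.l1_files = l1_files          # {(lo, hi): l1_key}
        self.placeholders = placeholders  # newest data, L0-backed

    def plan_read(self, offset):
        if offset <= self.lro:
            # historical read: offset-sorted, per-partition L1 file
            for (lo, hi), key in self.l1_files.items():
                if lo <= offset <= hi:
                    return ("L1", key)
            raise KeyError("offset below LRO but not in any L1 file")
        # tail read: follow the Raft-replicated pointer into L0
        candidate = None
        for ph in self.placeholders:
            if ph.base_offset <= offset:
                candidate = ph
        return ("L0", candidate.l0_object, candidate.byte_range)

reader = PartitionReader(
    last_reconciled_offset=1000,
    l1_files={(0, 1000): "l1/p7/0-1000"},
    placeholders=[Placeholder(1001, "l0/batch-42", (0, 4096))],
)
assert reader.plan_read(500) == ("L1", "l1/p7/0-1000")
assert reader.plan_read(1005)[0] == "L0"
```

The Reconciler's job, in these terms, is to advance `lro` by rewriting L0-backed ranges into new L1 files.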
2026-03-05 — Introducing Iceberg output for Redpanda Connect — Redpanda unsigned launch post (~1,000 words) announcing the `iceberg` output connector for Redpanda Connect, shipped in v4.80.0 (enterprise-gated). A declarative sink that writes streaming data to Apache Iceberg tables from a YAML pipeline via the Iceberg REST Catalog API. Positioned as the non-Kafka-source companion to the pre-existing broker-native Iceberg Topics feature — fills the gap for HTTP webhooks, Postgres CDC, GCP Pub/Sub, and other non-Kafka sources that need in-stream transformation (PII stripping, flattening, type routing) before landing in the lakehouse. Three architectural canonicalisations: (1) concepts/registry-less-schema-evolution — infers table schema from raw JSON; no Schema Registry required; verbatim "best of both worlds" between chained SMT brittleness and all-string dirty-data tables. Adds sixth axis to concepts/schema-evolution. (2) concepts/data-driven-flushing — flush only when data is present; inverts Kafka-Connect-era timer-driven default. Mitigates the concepts/small-file-problem-on-object-storage and quiet-source compute waste. (3) patterns/bloblang-interpolated-multi-table-routing — `table` and `namespace` config fields support Bloblang interpolation (`'events_${!this.event_type}'`). One pipeline → N tables. Canonical inversion of "configuration hell". Plus one new architectural pattern: patterns/sink-connector-as-complement-to-broker-native-integration — explicit two-shape comparison table against Iceberg Topics ("Zero-ETL convenience vs Integration flexibility") — the two paths are complementary, not competing. REST-catalog integration matrix: Polaris, systems/aws-glue, systems/unity-catalog, systems/google-biglake, Snowflake Open Catalog. OAuth2 token exchange + per-tenant REST catalog isolation at 0.1 vCPU per-pipeline density. Scope limits (v4.80.0): append-only only (upserts on roadmap — material for CDC UPDATE/DELETE); schema-inference mechanism depth undisclosed; no benchmarks; enterprise-gated license.
Tier-3 borderline include as lean ingest on vocabulary-canonicalisation grounds — 4 new concepts (registry-less-schema-evolution, data-driven-flushing, small-file-problem-on-object-storage, bloblang) + 2 new patterns (bloblang-interpolated-multi-table-routing, sink-connector-as-complement-to-broker-native-integration) + 2 new systems (redpanda-connect-iceberg-output, apache-polaris stub) fill definitional gaps. 8 canonical new pages: source + 2 systems + 4 concepts + 2 patterns. Extends 7 pages: systems/redpanda-connect (new Iceberg output section), systems/redpanda-iceberg-topics (new sink-connector-complement Seen-in entry), concepts/schema-evolution (sixth axis entry), concepts/iceberg-catalog-rest-sync (REST catalog as sink-connector integration surface Seen-in entry), patterns/streaming-broker-as-lakehouse-bronze-sink (sink-connector-altitude variant Seen-in entry), patterns/broker-native-iceberg-catalog-registration (sink-connector counterpart Seen-in entry), companies/redpanda (this page). No existing-claim contradictions — strictly additive.
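The data-driven-flushing inversion can be sketched with a hypothetical sink (names and thresholds invented; not the Redpanda Connect implementation):

```python
# Sketch of data-driven flushing: a timer tick with an empty buffer is a
# no-op, so a quiet source produces no empty commits and no small files on
# object storage — inverting the timer-driven default that flushes on every
# tick regardless of whether data arrived.

class DataDrivenSink:
    def __init__(self, max_records=1000, max_age_s=5.0):
        self.buf, self.first_at = [], None
        self.max_records, self.max_age_s = max_records, max_age_s
        self.flushes = 0

    def write(self, record, now):
        if not self.buf:
            self.first_at = now       # age measured from first buffered record
        self.buf.append(record)
        self._maybe_flush(now)

    def on_timer(self, now):
        # a timer-driven default would flush unconditionally here;
        # data-driven flushing flushes only if data is actually present
        if self.buf:
            self._maybe_flush(now, force_if_aged=True)

    def _maybe_flush(self, now, force_if_aged=False):
        aged = self.first_at is not None and now - self.first_at >= self.max_age_s
        if len(self.buf) >= self.max_records or (force_if_aged and aged):
            self.flushes += 1         # stand-in for an Iceberg commit
            self.buf, self.first_at = [], None

sink = DataDrivenSink(max_records=2)
sink.on_timer(now=0.0)                # quiet source: no flush, no tiny file
assert sink.flushes == 0
sink.write("a", now=1.0)
sink.write("b", now=1.1)              # size threshold reached
assert sink.flushes == 1
```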
2026-02-10 — How to safely deploy agentic AI in the enterprise — Tyler Akidau talk-recap (Redpanda CTO, originator of Google Dataflow / Apache Beam) from Dragonfly's Modern Data Infrastructure Summit. Marketing-adjacent reprise of the 2025-10-28 ADP launch framing, ~3.5 months later, aimed at lay enterprise-architect audience. Two load-bearing canonicalisations: (1) D&D alignment framing — human workers hired into lawful-good quadrant; AI agents default to the chaotic column ("at best 'chaotic good' — because you don't know what you don't know"); governance + auditing infrastructure is the mechanism that moves agents leftward toward lawful. (2) Eight-axis enterprise-agent-infrastructure checklist — context building
+ maintenance / context querying / authentication / governance / auditing / replay and validation / routing / multi-agent coordination. Akidau's load-bearing claim: six of eight are streaming problems (context querying + authentication stay outside streaming's remit). Two new canonical patterns: patterns/dynamic-routing-llm-selective-use (use AI where it wins, route to cheaper ML/heuristics otherwise — fraud-detection worked example: ML scans ~99% normal traffic, LLM investigates the ~1% flagged cases) + patterns/multi-agent-streaming-coordination (streaming broker as decoupled coordination substrate for multi-agent systems; inherits decoupled-services + durability + fan-in + fan-out from microservices-over-Kafka lineage). Agent-anatomy-=-streaming-platform-anatomy framing extends concepts/streaming-as-agile-data-platform-backbone to the agent-substrate altitude. Metadata-only-audit-insufficient framing extends patterns/durable-event-log-as-agent-audit-envelope — classical systems audit logs byte-count + timestamp metadata, but agents require full-input + full-output capture to make inferences. Closing honest caveat: "streaming can help solve a lot of agentic AI challenges, it's not your answer for everything. You still need authN/authZ, a multi-modal catalog of contextual data (not just streaming data), querying, and a durable execution for workflows". Tier-3 borderline include on rhetorical-framing + eight-axis-enumeration + two-new-patterns grounds. 5 canonical new pages: source + 2 concepts + 2 patterns. Extends 8 pages: concepts/autonomy-enterprise-agents + concepts/streaming-as-agile-data-platform-backbone + concepts/governed-agent-data-access + patterns/durable-event-log-as-agent-audit-envelope + patterns/cdc-fanout-single-stream-to-many-consumers + patterns/snapshot-replay-agent-evaluation + concepts/audit-trail + systems/redpanda-agentic-data-plane.
Cross-source continuity: talk-recap companion to the 2025-10-28 ADP launch pair (Gallego productization + governance-pattern naming); sibling to 2025-06-24 streaming-backbone essay (data-substrate half; this Akidau post extends to agent-substrate half); risk-side dual of Gallego autonomy essay (capability side). First wiki footprint for Akidau as a Redpanda-era talk speaker (prior Akidau work on the wiki is via Dataflow / Beam / MillWheel streaming-model primitives).
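The dynamic-routing pattern's fraud-detection worked example can be sketched with stub models (every name and threshold here is invented for illustration):

```python
# Sketch of patterns/dynamic-routing-llm-selective-use: a cheap ML scorer
# screens 100% of traffic; only the small flagged fraction is escalated to
# an expensive LLM investigator. Both "models" are stubs.

def ml_score(txn):                 # stand-in for a fast anomaly model
    return 0.99 if txn["amount"] > 9_000 else 0.01

def llm_investigate(txn):          # stand-in for a costly LLM call
    return {"verdict": "review", "txn": txn["id"]}

def route(txn, flag_threshold=0.5):
    score = ml_score(txn)          # runs on every event
    if score < flag_threshold:
        return {"verdict": "ok", "txn": txn["id"]}   # cheap path, ~99%
    return llm_investigate(txn)                      # expensive path, ~1%

events = [{"id": i, "amount": 100} for i in range(99)] + \
         [{"id": 99, "amount": 10_000}]
results = [route(e) for e in events]
escalated = [r for r in results if r["verdict"] == "review"]
assert len(escalated) == 1         # only the flagged tail hits the LLM
```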
2026-01-27 — Engineering Den: Query manager implementation demo — First post in Redpanda's new Engineering Den series; ~600-word post-acquisition disclosure from the Oxla team on their rewrite of the query manager — "the component responsible for the lifecycle of currently-running queries". Old manager suffered from ambiguous state (queries stuck in `finished` or `executing` while still holding resources; different parts of the system disagreed about what was happening) and a pathological cancellation path — canonicalised verbatim as async-cancellation-thread-spawn anti-pattern: "To avoid deadlocks, the old code gathered running queries, spawned async work per thread, and sometimes had to retry cancellation from a different thread entirely." Rebuilt as a deterministic state machine with every transition logged and explicit teardown at terminal states. Verbatim core claim: "The new scheduler is built as a deterministic state machine. At any point, it's in a known state, handling a specific event, and transitioning predictably. Every transition is logged." Composed pattern canonicalised as patterns/state-machine-as-query-lifecycle-manager. Tested on ~25,000 queries across 1- and 3-node clusters without reproducing the prior pathologies; no throughput / latency numbers — reliability-first validation frame. Debuggability payoff verbatim: "Bugs still happened ... but they were much easier to track down. Being able to trace state transitions made fixes straightforward instead of exploratory." — issues "fixed in days instead of weeks". Production rollout "within days" of post. 6 new canonical wiki pages: source + 5 concepts (concepts/deterministic-state-machine-for-lifecycle, concepts/state-transition-logging, concepts/query-lifecycle-manager, concepts/async-cancellation-thread-spawn-antipattern, concepts/explicit-teardown-on-completion) + 1 pattern (patterns/state-machine-as-query-lifecycle-manager). Extends systems/oxla with first post-acquisition mechanism disclosure (prior Oxla canonicalisation was acquisition-framing from 2025-10-28 ADP launch). Tier-3 borderline include on first-post-acquisition-Oxla-internals-disclosure grounds + reliability-doctrine-canonicalisation grounds — short engineering-diary voice with no state diagram, no code snippets, no benchmark depth.
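The post shows no code; a minimal sketch of the lifecycle shape it describes — explicit transition table, every transition logged, teardown exactly once at terminal states — might look like this (states and events invented beyond the quoted `finished`/`executing`):

```python
# Minimal sketch of a deterministic query-lifecycle state machine: illegal
# events are rejected instead of leaving queries stuck, every transition is
# logged, and resource teardown runs exactly once at terminal states.

ALLOWED = {  # explicit transition table (invented event names)
    ("pending", "start"): "executing",
    ("executing", "complete"): "finished",
    ("executing", "cancel"): "cancelled",
    ("pending", "cancel"): "cancelled",
}
TERMINAL = {"finished", "cancelled"}

class QueryLifecycle:
    def __init__(self, query_id):
        self.query_id, self.state = query_id, "pending"
        self.log, self.resources_released = [], False

    def handle(self, event):
        key = (self.state, event)
        if key not in ALLOWED:                        # no ambiguous states
            raise ValueError(f"illegal {event!r} in state {self.state!r}")
        new = ALLOWED[key]
        self.log.append((self.state, event, new))     # every transition logged
        self.state = new
        if self.state in TERMINAL and not self.resources_released:
            self.resources_released = True            # explicit teardown, once

q = QueryLifecycle("q1")
q.handle("start")
q.handle("complete")
assert q.state == "finished" and q.resources_released
assert q.log == [("pending", "start", "executing"),
                 ("executing", "complete", "finished")]
try:
    q.handle("cancel")          # a finished query cannot be cancelled
except ValueError:
    pass
```

The logged transition tuples are what makes post-hoc debugging "straightforward instead of exploratory" in the post's framing.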
Caveats: "deterministic" claimed not shown (no TLA+ / model-check); cancellation protocol not fully detailed; 25K-query sample is modest (1- and 3-node clusters only); failure-modes of the new manager not enumerated; series kickoff promises more depth in future Den posts. No existing-claim contradictions — strictly additive on Oxla's wiki page. First canonical wiki use of the "state machine as lifecycle manager" pattern at query-engine altitude; related-but-distinct instances already exist at consensus-request altitude (concepts/two-phase-completion-protocol) and workflow altitude (concepts/fault-tolerant-long-running-workflow).
2026-01-13 — The convergence of AI and data streaming, Part 1: The coming brick walls — Peter Corless industry-commentary post (~2,100 words) adapted from the author's AI-by-the-Bay talk. Part 1 of a four-part series; promises Parts 2-4 on adaptive LLM strategies, AI observability/evaluation, and real-time streaming + AI respectively. Names three "brick walls" for frontier AI: (1) ethically-sourced public training data exhaustion (Epoch AI 2024 S-curve thesis; petabyte ceiling vs zettabyte-scale global data production; 180 ZB generated / 200 ZB stored in 2025, CAGR 78%; Yottabyte Era projected 2028-2030); (2) training-cost growth (~260% annually, projected >$1B per frontier model by 2027 per Epoch AI, data-centre energy 2× by 2030 per Nature April 2025); (3) batch-training boundary ("regardless of their dense or MoE architectures, they're still all batch trained"). MoE vs Dense frontier-LLM landscape with concrete parameter-count disclosures: GPT-4 = 8 × 220B (George Hotz 2023 leak), Gemini MoE since 1.5 (Feb 2024), Grok MoE since Grok-1, Anthropic Claude = Dense Transformer holdout. GPT-1 → GPT-5 scaling curve: 117M → ~50T parameters = 5 orders of magnitude in 8 years; GPT-5 400K-token context window.
Brick-wall-companion observations: embedding-dimension diminishing returns past 1,536 dims (cites Supabase pgvector); model drift over time verbatim "each answer is a special snowflake, and those snowflakes can melt over time" — cites arXiv 2307.09009 + GPT-5.1 < GPT-5.0 regression on some evals; RLHF as offline batch fine-tuning pipeline (cites arXiv 2307.15217). Names RAG
+
MCP as the two inference-time real-time-data access mechanisms that do not cross the batch-training boundary. Frames the data scientists vs data engineers organisational silo (cites Jesse Anderson's Data Teams) as the socio-technical pre-requisite to architectural convergence. Running gag: the "d20 test" image-generation prompt as a hallucination-failure-mode evaluation opener — only Gemini 3.0 Thinking passed (inconsistently); ChatGPT 5.x, Midjourney, Meta AI, Grok, Claude, Google Veo all fail. Also cites $1.5T global AI spend in 2025. 7 new canonical wiki pages (source + 6 concepts: concepts/frontier-model-batch-training-boundary, concepts/llm-training-data-exhaustion, concepts/llm-model-drift, concepts/dense-transformer, concepts/rlhf-offline-batch, concepts/s-curve-limits, concepts/embedding-dimension-diminishing-returns, concepts/retrieval-augmented-generation). Extends concepts/mixture-of-experts (new Frontier-LLM MoE landscape section with GPT-4 8×220B + Gemini + Grok + Claude disclosures), concepts/llm-hallucination (new Seen-in for d20-test framing + hallucination-orthogonal-to-scaling claim), systems/transformer (new Seen-in with 117M→50T scaling curve + 400K-token context + MoE/Dense variant landscape), plus this page. Tier-3 borderline include on vocabulary-canonicalisation grounds — industry-commentary voice, no production numbers from shipping Redpanda system, streaming-specific payoff explicitly deferred to Parts 2-4. Passes on the frontier-LLM vocabulary (batch-training boundary + data exhaustion + MoE landscape + model drift + RLHF-as-batch) being genuinely missing from prior wiki coverage; canonicalises framing the wiki will compose subsequent ingests (Parts 2-4) against. Companion to 2025-06-24 streaming-backbone essay — the data-substrate framing; this post is the why frontier models need it framing at industry-altitude. Companion to Gallego 2025-04-03 autonomy essay and the 2025-10-28 ADP launch as the agent-substrate framing. 
Caveats: Hearsay primary sources (Hotz-leak GPT-4 numbers, "estimated" GPT-5 parameter counts); Epoch AI projections are interpretive; embedding-dimension ceiling single-sourced to a Supabase post; arXiv 2307.09009 drift magnitude is contested in the literature; private-data ethics transition narrated not structurally analysed; MoE landscape omits Mixtral / DeepSeek / Qwen / Llama-MoE; RLHF mechanism not walked; d20 test is a conversation-opener gag not a rigorous eval. Series Parts 2-4 deferred.
2025-12-09 — Streaming IoT and event data into Snowflake and ClickHouse — Unsigned vendor-tutorial post (~2,400 words) framing a reference IoT pipeline: Redpanda → Redpanda Connect → both Snowflake (short-term real-time) and ClickHouse (long-term columnar archive) simultaneously. Marketing voice with heavy Redpanda product-promotion + how-to config examples, but substantial canonical architectural core on the ClickHouse MergeTree + Snowflake Snowpipe Streaming substrate. Canonical new wiki pages (9): source + 7 concepts (time-partitioned MergeTree + native TTL policies + `DETACH PARTITION` archival + granule-level min-value skip + append-only tamper resistance + Snowflake `MATCH_RECOGNIZE` sessionization + hot-cold per-column codec split) + 2 patterns (patterns/time-partitioned-mergetree-for-time-series + patterns/clickhouse-plus-snowflake-dual-storage-tier). Inverted storage-tier framing for compliance-sensitive workloads: Snowflake for streaming access logs + financial triggers (governance matters), ClickHouse for long-term compressed retention (compression wins). Canonical MergeTree schema example (`telemetry_events` with `PARTITION BY toYYYYMM(timestamp)` + `TTL INTERVAL 12 MONTH DELETE` + `CODEC(ZSTD)` on `value` column). Specific Snowpipe Streaming batching recommendations (500–1,000 records low-latency, 10,000+ bulk, 1,000 at most for time-series; `byte_size: 0`; `period` 10–30 s for real-time dashboards vs 1–5 min for less frequent). `schema_evolution`-off-as-performance-optimisation framing inverts the default "always turn on" recommendation — canonicalised on concepts/schema-evolution as the fifth evolution axis. `MATCH_RECOGNIZE` worked example for ≤ 10-second same-IP click sessionization. Redpanda Connect gap disclosure: no dedicated ClickHouse output connector; use generic `sql_raw`/`sql_insert` processors — contrasts with first-class `snowflake_streaming`. Broker vs multiplexing named as the two fan-out primitives for the dual-tier pattern. 9 new canonical pages + 8 extensions. Tier-3 borderline include on architectural-density grounds (MergeTree internals + codec tiering + `MATCH_RECOGNIZE` are load-bearing despite marketing voice). Companion to 2025-10-02 Snowpipe Streaming benchmark — that post canonicalised the benchmark; this post canonicalises the batch-tuning guidance and the dual-tier architecture that composes it with ClickHouse.
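The ≤ 10-second same-IP sessionization the post expresses with `MATCH_RECOGNIZE` can be sketched procedurally (event data invented):

```python
# Procedural sketch of the post's MATCH_RECOGNIZE sessionization logic:
# consecutive clicks from the same IP belong to one session while gaps
# stay within 10 seconds; a larger gap closes the session and opens a new one.

def sessionize(events, gap_s=10):
    """events: (ip, ts) pairs sorted by ts; returns (ip, [ts, ...]) sessions."""
    sessions, open_by_ip = [], {}
    for ip, ts in events:
        sess = open_by_ip.get(ip)
        if sess and ts - sess[-1] <= gap_s:
            sess.append(ts)                   # continue the session
        else:
            if sess:
                sessions.append((ip, sess))   # close the stale session
            open_by_ip[ip] = [ts]             # open a new one
    sessions.extend((ip, s) for ip, s in open_by_ip.items())
    return sessions

clicks = [("10.0.0.1", 0), ("10.0.0.1", 4), ("10.0.0.2", 5),
          ("10.0.0.1", 30)]                   # 26 s gap → new session
done = sessionize(clicks)
assert len(done) == 3                         # two ip1 sessions + one ip2
```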
2025-12-02 — Operationalize Redpanda Connect with GitOps — Tutorial-voice unsigned post (~2,000 words) canonicalising the end-to-end Argo CD + Helm + Kustomize deployment shape for Redpanda Connect on Kubernetes. Walks through both deployment modes side by side: Standalone (single pipeline, config baked into Helm values, deployed via Argo CD multi-source Application with chart from charts.redpanda.com pinned at `targetRevision: 3.1.0` + values from customer's repo) + Streams (multiple pipelines from Kubernetes ConfigMaps, deployed via Kustomize wrapping the Helm chart — `configMapGenerator` for hashed ConfigMap names + `helmCharts` for chart inflation; `kustomize.buildOptions: --enable-helm --load-restrictor LoadRestrictionsNone` as Argo CD precondition). Streams-mode REST API (`/version`, `/ready`, `/streams`, `/metrics`) canonicalises the runtime-API vs GitOps source-of-truth anti-pattern — GitOps-compatible "as long as it's used by automation that derives its desired state from Git", anti-pattern "only when humans or external systems modify pipelines through the API without updating Git." Every production operation expressed as a Git commit: scaling (`replicaCount: 1 → 3`), adding pipelines (new files in `config/`), updating pipelines (edit YAML → Kustomize produces new hash → rolling restart via ConfigMap hash rollout), decommissioning (scale to zero or `argocd app delete`). Observability deployed as parallel Argo CD Application — `kube-prometheus-stack` (Prometheus + Alertmanager + Grafana + K8s dashboards) + Prometheus service monitor + Redpanda Connect Grafana dashboard — Redpanda Connect exposes Prometheus-compatible metrics natively "without custom exporters or sidecars". Closing product-roadmap laundry list (automatic linting + policy / compliance checks + developer portal + external secrets + template catalog + resource limits + multi-cluster) signals what Redpanda believes a mature Redpanda-Connect GitOps platform needs. Companion GitHub repo: redpanda-data-blog/redpanda-connect-the-gitops-way. 4 canonical new wiki pages: 3 concepts (concepts/standalone-vs-streams-mode, concepts/configmap-hash-rollout, concepts/runtime-api-vs-gitops-source-of-truth) + 2 patterns (patterns/argocd-multi-source-helm-plus-values, patterns/kustomize-wraps-helm-chart) + 1 system (systems/kustomize).
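The ConfigMap-hash-rollout mechanism can be sketched as follows (mimicking, not reproducing, Kustomize's `configMapGenerator` hashing — the real suffix algorithm differs):

```python
import hashlib

# Sketch of configmap-hash-rollout: the ConfigMap name embeds a hash of its
# content, so editing pipeline YAML yields a new name, which changes the pod
# template reference and triggers a rolling restart — no manual kubectl step.

def hashed_name(base, data):
    """Deterministic content-addressed name for a ConfigMap-like dict."""
    content = "\n".join(f"{k}={v}" for k, v in sorted(data.items()))
    digest = hashlib.sha256(content.encode()).hexdigest()[:10]
    return f"{base}-{digest}"

v1 = hashed_name("pipelines", {"orders.yaml": "input: kafka"})
v2 = hashed_name("pipelines", {"orders.yaml": "input: kafka\nbatching: {}"})
assert v1 != v2    # edited YAML → new ConfigMap name → rollout
assert v1 == hashed_name("pipelines", {"orders.yaml": "input: kafka"})
```

Because the name is pure function of content, an unchanged commit produces an identical name and no restart — the property that makes the rollout Git-driven.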
Extends 7 pages: systems/redpanda-connect (new GitOps deployment section + frontmatter + Seen-in + Related), systems/argocd (multi-source + Helm+Kustomize + runtime-API-tension sections), concepts/gitops (canonical application-tier Seen-in), systems/helm (Kustomize-composition canonical Seen-in), systems/kubernetes + systems/prometheus + systems/grafana (frontmatter sources). Tier-3 borderline include on vocabulary-canonicalisation grounds — tutorial-voice pedagogy with architecture density ~30-40% concentrated in the standalone/streams comparison table + Argo CD multi-source Application spec + Kustomize-wraps-Helm with `--enable-helm` precondition + Streams-mode REST API anti-pattern framing. Zero production numbers (no fleet sizes, no latencies, no customer references), no operator-path comparison (the 2025-05-06 K8s guide covers that), no mention of `Topic`/`User` CRDs for GitOps-compatible topic provisioning beyond name-check, no external-secrets-manager integration demonstrated. Canonical wiki counterpart to the 2025-05-06 A guide to Redpanda on Kubernetes (Operator path) — this post is the Helm + Argo CD path.
2026-01-06 — Build a real-time lakehouse architecture with Redpanda and Databricks — Tech-talk recap post (unsigned, ~1,100 words) summarising the joint Redpanda + Databricks tech talk "From Stream to Table" with speakers Matt Schumpert (Redpanda) + Jason Reed (Databricks, formerly on Netflix's data team). Walks the historical arc Hadoop-era data lakes → governance sprawl → Iceberg (Netflix-originated) → file-based-catalog era → REST catalog standardisation → Redpanda Iceberg Topics → Unity Catalog governance hub. Two load-bearing slogans canonicalise wiki-already-covered primitives at joint-vendor altitude: Schumpert — "The goal of this partnership is to remove the artificial line between real-time data and analytical data."
+
Redpanda unsigned — "the stream is the table" / "Streaming data is analytics-ready by default." Jason Reed supplies the Netflix-origin disclosure + consumer-side corroboration "The data shows up already structured, already governed, and already queryable." Three-system labour division verbatim: "Redpanda delivers real-time performance and reliability at scale. Iceberg provides an open, transactional table format optimized for analytics. Unity Catalog adds governance, optimization, federation, and lifecycle management across the entire system." Unity-Catalog-specific integration disclosure verbatim (Redpanda registers tables, manages schema updates, deletes tables, handles full lifecycle). Zero net-new concepts / patterns / systems — every primitive named is already canonicalised on the wiki (Iceberg + REST catalog + Iceberg topic + Unity Catalog + Bronze-sink pattern + broker-native-catalog-registration pattern all pre-exist). Value is at the joint-vendor-framing + historical-arc + Netflix-origin-disclosure altitudes. 0 new pages, 10 extensions (6 Seen-in additions + 4 frontmatter sources). Tier-3 borderline include on historical-framing + Netflix-origin + joint-vendor grounds; architecture content ~50% of body; zero production numbers.
2025-11-06 — Redpanda 25.3 delivers near-instant disaster recovery and more — Redpanda 25.3 release preview post covering four headline features across three architectural axes. Four load-bearing canonicalisations the wiki had previously gapped: (1) Shadowing — "a fully functional, hot-standby clone of your entire Redpanda cluster — topics, configs, consumer group offsets, ACLs, schemas — the works!" — architecturally distinct from both MirrorMaker2 and the prior Redpanda Migrator ("No MirrorMaker 2 or Redpanda Migrator connectors are used under the hood"). Three structural properties: broker-internal (not Kafka Connect-based); offset-preserving (byte-for-byte, with source-identical offsets — removes MM2's offset-translation-map client-failover cost); asynchronous. RPO/RTO in seconds ("limited only by timeout settings for producers and consumers"). Canonical pattern: patterns/offset-preserving-async-cross-region-replication (composed with hot-standby cluster for DR). (2) Cloud Topics (beta) — per-topic storage-substrate choice within a single cluster: record data goes "straight through and written to cost-effective object storage (S3/ADLS/GCS) while topic metadata is managed in-broker — replicated via Raft for high availability". "Virtually eliminates the cross-AZ network traffic associated with data replication" — the feature's load-bearing cost claim, canonicalised as concepts/cross-az-replication-bandwidth-cost. Motivated by the latency-critical vs latency-tolerant workload distinction (payments/trading/cybersecurity vs observability/compliance/model-training). Positioned against Confluent's "Kora-powered … standard/dedicated … Freight … plus separate Confluent WarpStream engine (BYOC)" multi-cluster shape. Canonical pattern: patterns/per-topic-storage-tier-within-one-cluster.
(3) Iceberg Topics + Google BigLake metastore — Redpanda 25.3 adds GCP's managed lakehouse catalog to the REST catalog sync axis, completing the set with Unity Catalog / Snowflake Open Catalog (Polaris) / AWS Glue / BigLake. BigQuery now discovers streaming-produced Iceberg tables without `CREATE EXTERNAL TABLE` DDL; Dataplex provides governance. Complements the prior file-based-catalog shape from the 2025-05-13 BYOC beta post. (4) MSSQL CDC for Redpanda Connect — `microsoft_sql_server_cdc` extends the Redpanda Connect CDC family to five source-database engines (Postgres / MySQL / MongoDB / Spanner / SQL Server). Rides on MSSQL's native change tables. Available in Redpanda Connect 4.67.5 (enterprise). Vendor benchmark: ~40 MB/s ingest + 3:15 initial snapshot on a 5M-row table vs ~14.5 MB/s / 8:04 for an unnamed alternative. Fits CDC driver ecosystem framing. 11 new canonical wiki pages: source + 5 systems (systems/redpanda-shadowing, systems/redpanda-cloud-topics, systems/redpanda-connect-mssql-cdc, systems/microsoft-sql-server, systems/google-biglake) + 4 concepts (concepts/offset-preserving-replication, concepts/broker-internal-cross-cluster-replication, concepts/cross-az-replication-bandwidth-cost, concepts/latency-critical-vs-latency-tolerant-workload)
+ 3 patterns (patterns/offset-preserving-async-cross-region-replication, patterns/hot-standby-cluster-for-dr, patterns/per-topic-storage-tier-within-one-cluster). Extends 9 pages: concepts/mirrormaker2-async-replication (new Shadowing-displacement section + Seen-in), concepts/rpo-rto (new seconds-RPO streaming shape section
+
Seen-in), concepts/change-data-capture (MSSQL fifth-engine Seen-in), concepts/iceberg-catalog-rest-sync (BigLake as fourth managed REST catalog), patterns/cdc-driver-ecosystem (MSSQL extension), patterns/tiered-storage-to-object-store (Cloud Topics as per-topic-granularity variant), systems/redpanda (new 25.3 section), systems/redpanda-connect (MSSQL CDC section), systems/redpanda-iceberg-topics (BigLake section), systems/google-bigquery (BigLake-as-REST-catalog-alternative section). Tier-3 borderline include on vocabulary-canonicalisation grounds — launch/announcement voice, zero production numbers beyond the vendor MSSQL benchmark, ambiguous GA/beta status for Shadowing, but four vocabulary primitives genuinely missing from prior wiki coverage (offset-preserving replication, broker-internal cross-cluster replication, cross-AZ replication bandwidth cost, latency-critical vs latency-tolerant workload classification) plus two net-new features-as-systems (Shadowing, Cloud Topics) plus one net-new CDC engine (SQL Server). Architecture content ~50-60% of body. Cross-source continuity: companion to 2025-02-11 HA stretch-clusters (Shadowing extends the Redpanda DR axis from the two-point stretch/MM2 dichotomy to a three-point stretch/Shadowing/MM2 axis); companion to 2025-03-18 CDC connectors (MSSQL extends the Redpanda Connect CDC engine family from four to five); companion to 2025-04-07 Iceberg Topics GA (BigLake extends the REST-catalog axis from three managed catalogs to four).
Caveats: launch-voice; Shadowing mechanism under-specified (wire protocol, conflict resolution, DR-drill mechanics, reverse-replication for failback — all elided); Cloud Topics latency profile undisclosed; cross-AZ-cost claim unquantified; MSSQL CDC benchmark alternative unnamed ("alternative hosted Kafka + CDC service"); MSSQL CDC topology scope not enumerated (Always On AG, mirroring, log shipping unstated); BigLake integration mechanism unwalked; Confluent foil comparison doesn't disclose Kora's own tiered storage capabilities; 25.3 release date not given ("coming soon"); unsigned (Redpanda default attribution).
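The offset-preserving property's failover payoff can be sketched (invented structures; the contrast is with MM2's offset-translation map):

```python
# Sketch contrasting client failover with and without offset preservation.
# MirrorMaker2-style replication produces different offsets on the target,
# so a consumer must translate its committed offset through a mapping;
# byte-for-byte, offset-preserving replication makes the committed offset
# valid as-is on the standby cluster.

def failover_mm2(committed, offset_translation_map):
    # MM2 shape: resume position must be looked up per partition
    return offset_translation_map[committed]

def failover_offset_preserving(committed):
    # Shadowing shape: source-identical offsets, no translation step
    return committed

translation = {1000: 987}   # source offset 1000 landed at target offset 987
assert failover_mm2(1000, translation) == 987
assert failover_offset_preserving(1000) == 1000
```

Removing the translation map removes both the client-side lookup and the operational burden of keeping the map current — the "client-failover cost" the post names.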
2025-10-28 — Governed autonomy: The path to enterprise Agentic AI — Companion governance-framing post published the same day as Gallego's Introducing the Agentic Data Plane launch; unsigned, shorter (~850 words), marketing-voice restatement of the ADP vision focused on the governance substrate. Two canonical new wiki patterns filling governance-pattern-name gaps the 2025-10-28 launch-post sibling left implicit: (1) Agentic Access Control (AAC) — verbatim: "ADP embeds Agentic Access Control (AAC), an evolution of modern access control concepts tailored to the needs of an agentic workforce. Agents never hold long-lived credentials. Every prompt, action, and output is auditable, replayable, and policy-checked before and after I/O, empowering enterprises to grant AI agents fine-grained, temporary access to sensitive data without losing oversight." Three load-bearing properties: no-long-lived-credentials + per-call-policy-before-and-after-I/O + fine-grained-temporary-access. Composition of three pre-canonicalised substrates (concepts/short-lived-credential-auth, concepts/audit-trail, per-call policy enforcement) specialised for the agent audience. Complements the pre-wired OBO authorization pattern — OBO is the who-is-the-caller mechanism; AAC is the what-policy-applies-to-the-call mechanism. (2) Durable event log as agent audit envelope — verbatim: "The ADP treats every agent interaction as a first-class durable event: prompts, inputs, context retrieval, tool calls, outputs, and actions are captured for analysis, compliance, and replay." Six event classes named (prompt + input + context retrieval + tool call + output + action); one log with N views (audit + lineage + replay + SLO + tracing). Applies log-as-truth at the agent-interaction altitude. A2A protocol first-named alongside MCP as open standards (not unpacked). 3 new canonical wiki pages: source + 2 patterns (AAC + durable-event-log-as-envelope).
Extends 10 pages: systems/redpanda-agentic-data-plane (re-sourced as dual-sourced from both 10-28 posts with companion-pair framing), systems/oxla (dual-sourced), systems/redpanda
+ systems/redpanda-connect + systems/redpanda-byoc + systems/redpanda-agents-sdk + systems/model-context-protocol (frontmatter sources), concepts/autonomy-enterprise-agents + concepts/governed-agent-data-access + concepts/data-plane-atomicity + concepts/digital-sovereignty + concepts/short-lived-credential-auth + concepts/audit-trail + concepts/data-lineage + concepts/log-as-truth-database-as-cache + patterns/mcp-as-centralized-integration-proxy (all with new Seen-in entries canonicalising the governance-altitude framing). Tier-3 borderline include on vocabulary-canonicalisation grounds — architecture density ~30% on short body; passes because AAC + event-log-as-audit-envelope
+
ADP + Oxla are vocabulary gaps the pre-wired sibling post didn't fully close. Caveats: zero AAC mechanism depth (no IdP / token-exchange / policy-engine); audit + lineage conflated as "unified audit and lineage envelope" at vision altitude; exactly-once-across-tool-chains asserted without mechanism; replay-for-compliance silent on LLM non-determinism; no byline. Cross-source continuity: dual-post launch pair with Introducing the Agentic Data Plane (Gallego-signed founder-voice productization + Oxla acquisition + four-layer composition + three-shift narrative + OBO-IdP) — together the two posts bracket ADP's canonical wiki definition from architecture + acquisition disclosure (Gallego post) to governance-pattern-naming + audit-envelope architectural claim (this post).
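The three AAC properties, as described at vision altitude, can be illustrated as follows (the post discloses no mechanism; every name, signature, and policy below is invented):

```python
import time

# Illustrative sketch of the three AAC properties as quoted above:
# (1) no long-lived credentials, (2) policy checked before and after I/O,
# (3) every call leaves an auditable record. All names are hypothetical.

class ShortLivedToken:
    def __init__(self, agent, scope, ttl_s=60):
        self.agent, self.scope = agent, scope
        self.expires_at = time.time() + ttl_s   # never long-lived

    def valid_for(self, resource):
        return time.time() < self.expires_at and resource in self.scope

audit_log = []   # stand-in for the durable event log

def governed_read(token, resource, policy):
    if not token.valid_for(resource):                 # pre-I/O check
        raise PermissionError("token expired or out of scope")
    if not policy(token.agent, "read", resource):     # pre-I/O policy
        raise PermissionError("denied by policy")
    data = f"<contents of {resource}>"                # the actual I/O
    allowed_out = policy(token.agent, "return", resource)  # post-I/O policy
    audit_log.append((token.agent, resource, "read")) # auditable, replayable
    return data if allowed_out else "[REDACTED]"

allow_all = lambda agent, action, resource: True
tok = ShortLivedToken("billing-agent", scope={"invoices"}, ttl_s=60)
assert governed_read(tok, "invoices", allow_all).startswith("<contents")
assert audit_log == [("billing-agent", "invoices", "read")]
```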
- 2025-10-28 — Introducing the Agentic Data Plane — Founder-voice productization follow-up to Gallego's 2025-04-03 autonomy essay. Names the commercial shape of enterprise autonomy as the Agentic Data Plane (ADP) — "a unified runtime and control plane that safely exposes enterprise data to AI agents" composing four layers: (A) streaming (existing Redpanda broker for HITL + durable model replay + observability); (B) query engine — newly-acquired Oxla, a C++ distributed query engine with PostgreSQL wire protocol + separated compute-storage + Iceberg-native (early preview mid-December 2025); (C) systems/redpanda-connect|300+ connectors rebadged as ADP integration layer; (D) net-new global policy + observability layer. Governance-first framing inverts typical agent marketing verbatim: "The fear from CIOs is not the code of the agent itself, it is governance. In simple terms, it is access controls: can I trust that data is accessed by the right things? And observability: when things go wrong, can I understand what happened?" — canonicalised as concepts/governed-agent-data-access (two-axis design surface). First shipped governance feature: Remote MCP + authentication + authorization for OBO (on-behalf-of) workloads with IdP integration — canonicalised as patterns/on-behalf-of-agent-authorization. Structural foil verbatim: "the new digital workforce often interacts with systems created in the API era of root-token permissions, with all-or-nothing as the norm." Three-shift architectural narrative: compute-storage separation → lakehouse → agentic data plane. Open-protocols commitment: MCP, A2A, PostgreSQL wire, durable log (Kafka), Iceberg. Things shipped: Remote MCP + OBO, knowledge-based agent templates (Git/Jira/GDrive), declarative Agent Runtime, Redpanda Streaming for HITL. Things acquired (rolling integration): Oxla. Things doubled down on: governance (access controls + observability). 5 new canonical wiki pages: source + 2 systems (systems/redpanda-agentic-data-plane, systems/oxla) + 1 concept (concepts/governed-agent-data-access) + 1 pattern (patterns/on-behalf-of-agent-authorization). Extends 6 pages: systems/redpanda (new `## Agentic Data Plane (2025-10-28 productization)` section), systems/redpanda-agents-sdk (productization-into-ADP section + ADP-as-product-tier-above-SDK framing), systems/model-context-protocol (frontmatter + related extended for ADP-era MCP usage with OBO), [[patterns/mcp-as-centralized-integration-proxy]] (frontmatter extended), concepts/autonomy-enterprise-agents (new productization section + ADP-as-commercial-packaging framing), companies/redpanda (this entry). Tier-3 borderline include on vocabulary-canonicalisation grounds — marketing-heavy launch post, zero production numbers, but three wiki-load-bearing canonicalisations (ADP-as-product-shape, Oxla-as-system, governed-agent-data-access concept + OBO pattern). Caveats: launch-marketing voice; Oxla mechanism-depth thin (planner/executor/catalog model undisclosed); OBO disclosed as product-line-item not mechanism (token flow, consent vocabulary, downstream-system integration surface not disclosed); A2A protocol named but not described; no competitive comparison with Databricks Unity AI Gateway / AWS Bedrock Agents / Snowflake Cortex. Gallego-signed ("handcrafted by a hooman. .alex").
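The OBO pattern is disclosed only as a product line item — the post does not walk the token flow. As an illustration of the generic shape (RFC 8693-style token exchange, not Redpanda's implementation), the pattern looks roughly like this; `exchange_token` and the toy `idp` tables are assumptions:

```python
def exchange_token(idp, agent_cred, user_token, audience):
    # The agent presents its own short-lived credential plus the user's
    # token; the IdP mints a narrowed token that runs AS the user while
    # recording the acting agent (the "act" claim in RFC 8693 terms).
    agent = idp["agents"][agent_cred]
    user = idp["users"][user_token]
    return {"sub": user, "act": {"sub": agent}, "aud": audience}


# Toy identity-provider tables; a real IdP would validate signatures,
# scopes, consent, and expiry before issuing anything.
idp = {"agents": {"a-tok": "agent-7"}, "users": {"u-tok": "alice"}}
obo = exchange_token(idp, "a-tok", "u-tok", audience="remote-mcp")
```

The point of the shape: downstream authorization decisions key on `sub` (the user), so existing per-user ACLs apply unchanged, while the `act` claim keeps the agent attributable in audit.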
- 2025-10-02 — Real-time analytics at scale: Redpanda and Snowflake Streaming — Vendor benchmark of a 9-node Redpanda + 12-node Redpanda Connect → single Snowflake table pipeline via the `snowflake_streaming` output connector. Headline: 3.8 billion 1 KB AVRO messages at 14.5 GB/s, P50 ≈ 2.18 s / P99 ≈ 7.49 s end-to-end — exceeds Snowflake's documented 10 GB/s per-table ceiling by 45%. Disaggregated latency attribution: 86% of the P99 budget (~6.44 s) is in the Snowpipe-Streaming upload / register / commit path, not in Redpanda read or transport. Four canonical tuning insights: (1) AVRO over JSON = ~20% throughput uplift (patterns/binary-format-for-broker-throughput); (2) count-based batch triggers beat byte-size triggers on the hot path because byte-size triggers require a per-message size computation (patterns/count-over-bytesize-batch-trigger); (3) `build_parallelism` tuned to (cores − small reserve) — 40 on 48-core nodes — as the Snowpipe-Streaming commit-path latency knob (concepts/build-parallelism-for-ingest-serialization); (4) Snowpipe-Streaming channels are the per-table parallelism unit, controlled by `channel_prefix` × `max_in_flight` with a hard ceiling of 10,000 channels per table — exceeding it surfaces as "the Snowpipe API screaming at us" (concepts/snowpipe-streaming-channel). Decisive scaling dimension: intra-node input/output parallelism via the `broker` primitive — running many parallel pipelines within one Connect process to saturate per-node resources, canonicalised as patterns/intra-node-parallelism-via-input-output-scaling. Control group (Redpanda → `drop` sink) ceiling 15.1 GB/s at 8.38 ms P99; the Snowflake commit added ~1 min wall-clock and ~7.5 s P99. Public-internet transport; PrivateLink would reduce latency further. Borderline-case include on architectural-disclosure grounds: real operational numbers (cluster topology, P50/P99 latencies, per-step attribution) and four first-party tuning findings at mechanism depth.
- 2025-06-24 — Why streaming is the backbone for AI-native data platforms
— Thought-leadership / vision essay originally syndicated
to The New Stack, positioning streaming as the "power
grid" of an AI-native data platform. Canonicalises four
architectural propositions the wiki had referenced implicitly:
(1) streaming-as-backbone of an agile data platform
(producer / consumer decoupling + dynamic source/sink add +
real-time reactivity) — new concept
concepts/streaming-as-agile-data-platform-backbone;
(2) CDC fan-out from a single stream to many consumers
(search, analytics, vector index, reactive agent) with the
`user_plans` downgrade-trigger worked example and explicit WAL-cleanup-strain trade-off — new pattern patterns/cdc-fanout-single-stream-to-many-consumers; (3) Replayability for iterative RAG — long-lived tiered-storage streams let you re-run historical data through different embedding models or chunking strategies without re-extracting from source — new concept concepts/stream-replayability-for-iterative-pipelines; (4) Open table format = freedom to pick the query engine — Iceberg as the escape hatch from warehouse lock-in, with Snowflake + BigQuery sharing the same dataset via the Apache Polaris REST catalog without storing data twice. Also canonicalises schema registry as a CI/CD / IaC artefact (PR-time validation, code-owned contracts) and discloses OpenTelemetry context propagation via Kafka record headers as the streaming-boundary analogue of HTTP-header propagation — extending systems/opentelemetry from the Fly.io application-RPC framing. Also names stateless transformation at broker ingress (compliance / masking) and the AI data flywheel (usage → insights → product → usage). 3 canonical new wiki pages: concepts/streaming-as-agile-data-platform-backbone, concepts/stream-replayability-for-iterative-pipelines, patterns/cdc-fanout-single-stream-to-many-consumers.
Extends 7 pages: concepts/change-data-capture (new Seen-in canonicalising fan-out topology + WAL-cleanup trade-off + `user_plans` worked example), concepts/schema-registry (new Seen-in canonicalising CI/CD-IaC-artefact framing — registry as API contract between teams, equivalent to the HTTP API contract for sync services), systems/opentelemetry (new Seen-in canonicalising Kafka-record-headers carrier for context propagation at the streaming boundary), patterns/streaming-broker-as-lakehouse-bronze-sink (new Seen-in at vision altitude extending the 2025-01 pedagogy altitude and 2025-04-07 GA-release altitudes), patterns/tiered-storage-to-object-store (new Seen-in canonicalising a third axis — economic precondition for replayability — beyond the prior capacity + decommission-speed framings), systems/apache-iceberg (new Seen-in canonicalising the open-format-escape-hatch from warehouse lock-in + Polaris REST catalog), systems/redpanda-iceberg-topics (new Seen-in at backbone altitude). Tier-3 borderline include. Redpanda vendor voice with heavy product-link density (≈30 blog cross-links to own marketing pages), but architecture content is ~50% of the ~1,700-word body and the four propositions above are structurally load-bearing vocabulary the wiki did not previously canonicalise (the backbone framing, the fan-out-from-single-CDC-stream framing, the replayability-for-RAG framing, and the schema-registry-as-CI/CD-artefact framing were all gaps). Passes on vocabulary-canonicalisation grounds even with the marketing-adjacent voice. Cross-source continuity: companion to the Gallego 2025-04-03 autonomy essay from the same quarter (Gallego = streaming + MCP + Python SDK as agent substrate; this post = streaming + CDC + Iceberg as AI-data-platform substrate — the agent-substrate and data-substrate halves of the same vision, framed for complementary audiences).
Companion to sources/2025-01-21-redpanda-implementing-the-medallion-architecture-with-redpanda|2025-01-21 Medallion architecture post at vision altitude vs mechanism altitude. Companion to sources/2025-03-18-redpanda-3-powerful-connectors-for-real-time-change-data-capture|2025-03-18 CDC connectors post — that post canonicalises the CDC reader half of the fan-out pattern; this post canonicalises the consumer-fanout half. Caveats recorded: zero production numbers (no fleet sizes, no latency distributions, no before/after quantitative wins between batch-ETL and streaming); qualitative claims only ("much more effective", "saves you from costly reprocessing"); Iceberg vs Snowpipe-Streaming trade-off named but uncompared on cost / ecosystem / governance; CDC-WAL-cleanup nuance name-only ("delaying WAL cleanup" with no slot-management or retention mechanics); AIOps name-drop without mechanism; no cross-vendor comparison (Kafka / Pulsar / Kinesis / Pub/Sub not compared); unsigned (Redpanda default attribution); originally syndicated to The New Stack as "the power grid for AI-native data platforms" — wiki-version ingest uses the canonical redpanda.com URL.
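The fan-out half of the CDC pattern reduces to one stream read independently by many consumers, each with its own offset. A minimal sketch of that shape (the `consume_from` helper, in-memory stream, and `user_plans` events are illustrative stand-ins, not a Redpanda API):

```python
# One CDC stream (the post's user_plans example) fanned out to many
# consumers: search, analytics, vector index, reactive agent.
cdc_stream = [
    {"op": "update", "table": "user_plans", "row": {"user": 1, "plan": "free"}},
    {"op": "update", "table": "user_plans", "row": {"user": 2, "plan": "pro"}},
]

offsets = {"search": 0, "analytics": 0, "vector_index": 0, "agent": 0}


def consume_from(consumer, n=1):
    # Each consumer advances only its own offset: the same stream is read
    # many times without re-extracting from the source database.
    start = offsets[consumer]
    batch = cdc_stream[start:start + n]
    offsets[consumer] = start + len(batch)
    return batch


# The reactive agent watches for downgrade triggers...
downgrades = [e for e in consume_from("agent") if e["row"]["plan"] == "free"]
# ...while analytics reads the same events from its own position.
consume_from("analytics", n=2)
```

The WAL-cleanup-strain trade-off the post names lives exactly here: the source database must retain its change log until the slowest of these independent positions has advanced.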
- 2025-06-21 — Behind the scenes: Redpanda Cloud's response to the GCP outage — Production-incident retrospective on the 2025-06-12 GCP global outage from Redpanda Cloud's perspective. ~3-hour incident window (18:41–21:38 UTC); SEV4 incident closed with no customer impact across hundreds of clusters. Load-bearing disclosures: (1) cell-based architecture as an explicit Redpanda Cloud product principle — single-binary broker + per-customer cluster, "Redpanda Cloud clusters do not externalize their metadata or any other critical services"; (2) butterfly effect named as a first-class system-design primitive — "GCP's seemingly innocuous automated quota update triggered a butterfly effect that no human could have predicted"; (3) feedback-control-loop-guarded phased rollouts as the change-management discipline — "we try to close our feedback control loops by watching Redpanda metrics as the phased rollout progresses and stopping when user-facing issues are detected"; (4) hedged observability stack — self-hosted data + third-party UI was degraded-but-usable during the cascading outage, saving the "exponentially bigger cost ramifications" of a vendor failover; (5) SLA substrate decomposition — 99.99% SLA + ≥99.999% SLO decomposes to six concrete choices (replication ≥3, local NVMe primary + async tiered storage, redundant API/Schema Registry/HTTP Proxy, no external critical-path dependencies except PSC, continuous chaos + load testing, feedback-gated phased rollouts); (6) tiered storage as fallback, not primary — elevated GCS PUT error rates did not impact write availability because primary data is on local NVMe; (7) deliberate disk reserve (unused + used-but-reclaimable) absorbs flush backlog during object-store stress. Canonicalises four new patterns: patterns/cell-based-architecture-for-blast-radius-reduction, patterns/preemptive-low-sev-incident-for-potential-impact (19:08 UTC SEV4 declared before customer impact observed), patterns/proactive-customer-outreach-on-elevated-error-rate (20:56 UTC outreach to customers with the highest tiered-storage error rates), and patterns/hedged-observability-stack. One affected cluster (staging, us-central-1, lost node + ~2h replacement) — out of hundreds; the customer's production cluster was unaffected. Closing thoughts draw a CrowdStrike parallel and argue for "increased adoption of control theory in our change management tools" as an industry-wide reliability practice. Tier-3 on-scope — production-incident retrospective (not marketing) with architecture density ~60% across timeline + substrate decomposition + six-mitigation reliability practice list. Opens the Redpanda incident-retrospective axis on the wiki. Companion to the 2025-04-03 Gallego autonomy essay (which canonicalised the Data Plane Atomicity invariant) by instantiating the cell-based-architecture deployment shape that operationalises it. Caveats: unsigned, vendor-voice, hindsight-bias acknowledged; single-affected-cluster mechanism underspecified ("uncommon interaction between internal infrastructure components"); no quantitative tiered-storage error-rate metrics; third-party dashboarding/alerting vendor + cloud-marketplace vendor both unnamed; disk-reserve sizing policy undisclosed; PSC exception to no-critical-path-dependencies load-bearing but not walked; phased-rollout-with-feedback-control implementation details absent.
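The feedback-control-loop-guarded phased rollout in disclosure (3) can be sketched in a few lines: apply the change cell by cell, watch a metric after each stage, halt before the blast radius grows. The cell names, metric source, and 1% threshold below are all hypothetical — the post discloses the discipline, not the implementation:

```python
def phased_rollout(cells, apply_change, error_rate, threshold=0.01):
    # Roll out to one cell at a time; close the feedback loop by checking
    # a user-facing metric before proceeding to the next cell.
    rolled_out = []
    for cell in cells:
        apply_change(cell)
        rolled_out.append(cell)
        if error_rate(cell) > threshold:
            return rolled_out, "halted"  # stop: blast radius stays at N cells
    return rolled_out, "complete"


# Hypothetical fleet where the third cell regresses after the change.
rates = {"cell-a": 0.0, "cell-b": 0.002, "cell-c": 0.3}
done, status = phased_rollout(
    ["cell-a", "cell-b", "cell-c", "cell-d"],
    apply_change=lambda cell: None,          # no-op stand-in for the change
    error_rate=lambda cell: rates.get(cell, 0.0),
)
```

Combined with the per-customer cell-based architecture of disclosure (1), the halt bounds impact to the cells already touched — here `cell-d` is never reached.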
- 2025-06-17 — Introducing multi-language dynamic plugins for Redpanda Connect — Launch of the dynamic-plugin framework in Redpanda Connect v4.56.0 (Beta, Apache 2.0). Breaks the Go-only, compile-into-the-binary plugin constraint: plugins now run as separate OS subprocesses communicating with the host Redpanda Connect engine over gRPC on a Unix domain socket, with the cross-process protocol "closely mirroring the existing interfaces defined for plugins within Redpanda Connect's core engine, Benthos". Four canonical new wiki pages: systems/redpanda-connect-dynamic-plugins + two concepts (concepts/subprocess-plugin-isolation — "plugins run in separate processes, so crashes won't take down the main Redpanda Connect engine"; [[concepts/batch-only-component-for-ipc-amortization]] — "we use batch components exclusively to amortize the cost of cross-process communication" — only `BatchInput` / `BatchProcessor` / `BatchOutput` types are exposed across the gRPC boundary) + two patterns (patterns/grpc-over-unix-socket-language-agnostic-plugin as the architectural shape; [[patterns/compiled-vs-dynamic-plugin-tradeoff]] capturing the explicit "compiled plugins for performance-critical, dynamic plugins for flexibility and language choice" guidance — dynamic plugins are additive, not a replacement for compiled plugins). Language SDKs: Go (type-safe, for existing Redpanda Connect developers) and Python (headline target — opens the streaming substrate to PyTorch / TensorFlow / Hugging Face / LangChain / NumPy / SciPy for real-time ML inference inside the pipeline). Motivating use case in the post: a Python processor plugin running a pre-trained BERT model from Hugging Face for sentiment analysis on streaming customer feedback. Launch is Apache 2.0 — the plugin framework itself is open-source; connectors built on top may carry different licenses (contrast: the 2025-03-18 CDC input connectors were Enterprise-gated). Extends systems/redpanda-connect with a new `## Dynamic plugins (2025-06, Beta, Apache 2.0)` section. Tier-3 borderline include: launch / marketing voice with "We're excited..." framing, but the architecture content is real — the core technical disclosure is the subprocess + gRPC + Unix-socket + batch-only-amortization design. Passes on vocabulary-canonicalisation grounds — four plugin-architecture primitives (subprocess isolation, batch-only IPC amortization, the gRPC-over-Unix-socket language-agnostic plugin shape, the compiled-vs-dynamic tradeoff) missing from prior wiki coverage. Caveats: Beta stability only (v4.56.0; protocol stability across minor versions not guaranteed); no gRPC `.proto` published inline (the protocol "closely mirrors" Benthos interfaces but implementors must consult the SDK source); no performance numbers (no throughput delta vs compiled plugins, no cross-process-hop p99, no reference-workload benchmarks); no process-lifecycle details (crash recovery, socket cleanup, supervisor shape unspecified); no horizontal-scaling model for CPU-bound plugins (one subprocess per plugin, no pooling). Opens the Redpanda-Connect extensibility-framework axis on the wiki — prior Redpanda Connect coverage focused on the shipped connector catalog (CDC input connectors, MCP-tool surface); this is the first canonicalising the developer-surface / plugin-architecture axis.
- 2025-05-20 — Implementing FIPS compliance in Redpanda
— Configuration-walkthrough disclosure of broker-level FIPS
140 compliance in self-managed Redpanda clusters on RHEL.
Opens the Redpanda security-substrate axis on the wiki. Three
load-bearing canonicalisations: (1) OpenSSL
3.0.9 as the
FIPS 140-2
validated cryptographic module consumed by both the
`redpanda` broker binary and the `rpk` CLI, with OpenSSL 3.1.2 (FIPS 140-3 validated) on the late-2025 upgrade roadmap ahead of the 140-2 sunset. (2) Three-state `fips_mode` config dial (disabled / enabled / permissive) distinguishing production (OS-FIPS + broker-FIPS), non-regulated (no FIPS), and development (broker-FIPS-only, non-production) deployment shapes. `permissive` is explicitly scoped out of compliance claims — entropy sourcing from a non-FIPS OS breaches the boundary even with broker-level controls. (3) Broker-startup fail-fast as the enforcement shape: "Redpanda will log an error and exit if the underlying operating system isn't properly configured." Structurally stronger than the logging-then-enforcement progressive-rollout shape — regulated workloads have no warn-only regime by design. Extends concepts/fips-cryptographic-boundary: the Redpanda instance surfaces at streaming-broker-startup / validated-module altitude where the boundary manifests as a two-package artefact split (`redpanda-fips` + `redpanda-rpk-fips`, co-installable with base packages) + three-state config dial + startup enforcement gate — a different architectural layer from the GitHub 2025-09-15 PQ-SSH instance where the boundary manifests as a filtered primitive-advertisement list on the SSH wire. Deployment scope at publication (2025-05-20): self-managed RPM / Debian on RHEL only; Redpanda Cloud, Kubernetes deployments, and Redpanda Connect on roadmap — canonical wiki instance of the FIPS boundary being narrower than a product's full deployment surface because validated-module distribution is deployment-shape-specific. Redpanda Ansible Collection accepts `enable_fips=true` + `fips_mode=enabled` opt-in variables. Batch-skip override per explicit user full-ingest instruction; raw frontmatter carried `ingested: true` + `skip_reason: batch-skip` — flagged as zero architecture signals in the 7,896-char body (pure marketing).
Post is short (~1,100 words), configuration-walkthrough voice, but canonicalises three compliance-substrate primitives missing from the wiki's prior FIPS coverage (anchored only on the GitHub PQ-SSH rollout). Caveats: no wire-protocol disclosure (which ciphers/KEX/MACs are filtered in FIPS mode not enumerated); FIPS 140-3 transition schedule underspecified (no formal NIST 2026-02-22 sunset date); `permissive` failure surface beyond entropy not enumerated; non-RHEL OS coverage elided; license-gated; no byline; no benchmarks on FIPS-mode overhead.
- 2025-05-13 — Getting started with Iceberg Topics on Redpanda BYOC
— BYOC-beta extension of
Iceberg Topics five weeks after 25.1 GA on Dedicated,
with a GCS + BigQuery worked
example. Three new primitives canonicalised: (1) the
per-topic mode configuration
surface —
`value_schema_id_prefix` (Schema-Registry-wire-format producers → typed Iceberg table), `value_schema_latest` (latest-schema projection), `key_value` (schema-less `BYTES` + Kafka metadata); (2) the file-based catalog as a first-class alternative to REST catalog sync for engines (like BigQuery) that read Iceberg via metadata-pointer DDL; (3) the BYOC-data-ownership compound property — customer-owned bucket + broker-projected Iceberg + customer-owned query engine yields "full control of your Iceberg data with zero compromises". Read-side pattern: BigQuery `CREATE EXTERNAL TABLE ... format = 'ICEBERG'` on a GCS-hosted `vN.metadata.json`. Adjacent secondary disclosure: Redpanda BYOC doubles partition density per tier in 25.1 via per-partition memory-efficiency improvements (Tier 1: 1,000 → 2,000; Tier 5: 22,800 → 45,600), canonicalised as concepts/broker-partition-density. Tutorial altitude with a synthetic Protobuf `SensorData` generator via Redpanda Connect. Tier-3 borderline-on-scope: vendor tutorial, no production numbers, architecture content ~25-30% of body concentrated on the three new primitives + partition-density datum. Passes on vocabulary-canonicalisation grounds (topic-mode configuration, file-based catalog, and BYOC-data-ownership were all gaps in the wiki). Caveats: file-based-catalog mechanism underspecified vs the object-store-catalog fallback from GA; partition-density 2× improvement mechanism unexplained; `value_schema_id_prefix` vs `value_schema_latest` vs `key_value` trade-offs elided; DLQ + schema-evolution not re-invoked in BYOC context; Protobuf-specific guidance thin; tier dimensions opaque.
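The read-side pattern above hinges on the metadata-pointer DDL: BigQuery is handed the latest `vN.metadata.json` in the customer-owned bucket rather than a catalog endpoint. A minimal sketch of assembling that statement — dataset, table, and bucket path are placeholders, and the `OPTIONS` shape follows BigQuery's documented Iceberg external-table form, not anything Redpanda-specific:

```python
def iceberg_external_table_ddl(dataset, table, metadata_uri):
    # File-based catalog read: point the engine at the Iceberg metadata
    # file directly; no REST catalog sync involved.
    return (
        f"CREATE EXTERNAL TABLE `{dataset}.{table}`\n"
        f"OPTIONS (format = 'ICEBERG',\n"
        f"         uris = ['{metadata_uri}'])"
    )


ddl = iceberg_external_table_ddl(
    "sensors", "sensor_data",
    "gs://my-byoc-bucket/redpanda-iceberg/sensor_data/metadata/v3.metadata.json",
)
```

The operational cost implied by this shape (and left underspecified in the post) is that the pointer goes stale on every new snapshot: the DDL names one `vN.metadata.json`, so something must re-point it as N advances.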
- 2025-05-06 — A guide to Redpanda on Kubernetes
— Product-altitude guide to Redpanda's Kubernetes deployment
evolution. Three load-bearing architectural claims:
(1) Helm vs Redpanda Operator trade-off on five axes —
managed upgrades + rollback, dynamic configuration (CRDs vs
Helm-values redeploy), advanced health checks + metrics,
lifecycle automation, multi-tenancy. Operator is the default
recommendation; Helm chart retained for simpler deployments.
(2) Two-to-one operator consolidation — Redpanda previously
shipped separate operators for its internal
Redpanda Cloud fleet and for
customer-facing Self-Managed deployments; the 2025 unification
merges them into a single operator
(patterns/unified-operator-for-cloud-and-self-managed).
(3) FluxCD bundling reversal — the customer operator
initially bundled FluxCD internally to wrap
the Helm chart; canonical wiki instance of the
bundled-GitOps-dependency anti-pattern. Fix across three branches: v2.3.x FluxCD optional (`spec.chartRef.useFlux`) → v2.4.x (Jan 2025) FluxCD disabled by default → v25.1.x FluxCD and Helm-chart wrapping removed entirely. v25.1.x adopts a version-aligned compatibility scheme — operator/chart version matches the Redpanda core version with a ±1 minor window, retiring the compatibility-matrix document. Introduces systems/redpanda-operator as a canonical wiki system and systems/fluxcd as a minimal page. Tier-3 batch-skip override: raw frontmatter carried `ingested: true` + `skip_reason: batch-skip` — marketing/tutorial slug pattern; overridden per explicit user full-ingest instruction. Architecture density ~40% on ~1,400-word body. Caveats: product-guide altitude, no production numbers, FluxCD-removal migration path underspecified, deprecation schedule opaque, unified-operator cutover mechanism not disclosed, multi-region K8s limitation (multi-AZ-only) not revisited. Closes a gap in the wiki's Kubernetes-operator corpus by canonicalising two anti-patterns (bundled GitOps, compatibility matrix) that generalise beyond Redpanda.
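The version-aligned compatibility scheme replaces a lookup table with a rule simple enough to state in code: same major, minor within ±1. A sketch of that check (the helper name and window default are illustrative, not operator code):

```python
def versions_compatible(operator_version, core_version, window=1):
    # v25.1.x scheme: operator/chart tracks Redpanda core, with a
    # +/- 1 minor-version window instead of a compatibility matrix.
    op_major, op_minor = map(int, operator_version.split(".")[:2])
    core_major, core_minor = map(int, core_version.split(".")[:2])
    return op_major == core_major and abs(op_minor - core_minor) <= window


assert versions_compatible("25.1.2", "25.1.9")       # same minor line
assert versions_compatible("25.1.0", "25.2.1")       # one minor apart: in window
assert not versions_compatible("25.1.0", "25.3.0")   # two minors apart: out
```

This is why the scheme retires the matrix document: compatibility becomes computable from the version strings alone.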
- 2025-04-23 — Need for speed: 9 tips to supercharge Redpanda — Omnibus performance-tuning checklist covering nine tips across three dependency layers — infrastructure (NVMe, dedicated hardware with a 95% resource budget, no noisy neighbors; enable broker-side write caching when NVMe isn't available), data architecture (partition skew as Amdahl's Law with a three-pronged mitigation — sticky partitioner / keyed only when required / high-cardinality keys; don't compress compacted topics; use tiered storage for fast rebalance), and application design (producer batching, consumer fetch tuning matrix with `fetch.min.bytes` / `fetch.max.wait.ms` / `max.partition.fetch.bytes` / `max.poll.records`, offset-commit cost / save-button analogy / RPO-as-commit-frequency, client-side compression with ZSTD or LZ4 codec choice). Introduces concepts/keyed-partitioner, patterns/high-cardinality-partition-key, and patterns/client-side-compression-over-broker-compression. Tier-3 borderline-on-scope: vendor-blog checklist voice but substantive gap-filling across six previously uncanonicalised primitives. No author byline, no production numbers, no customer case study.
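The high-cardinality-key tip is easy to demonstrate: a keyed partitioner hashes the key to a partition, so few distinct keys can only ever land on few partitions, however many messages flow. A sketch with an illustrative CRC32 partitioner (hash choice and counts are not Redpanda's, just a stand-in for any deterministic keyed partitioner):

```python
from collections import Counter
import zlib


def partition_for(key, num_partitions=12):
    # Deterministic keyed partitioner: same key -> same partition.
    return zlib.crc32(key.encode()) % num_partitions


# Low cardinality: 3 distinct keys can occupy at most 3 of 12 partitions,
# leaving the rest idle (the Amdahl's Law skew the post warns about).
low = Counter(partition_for(f"region-{i % 3}") for i in range(1200))

# High cardinality: 1200 distinct keys spread across the full set.
high = Counter(partition_for(f"user-{i}") for i in range(1200))
```

Hence the three-pronged guidance: prefer the sticky partitioner when ordering-by-key isn't needed, key only when required, and when keying, pick a high-cardinality key.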
- 2025-04-07 — Redpanda 25.1: Iceberg Topics now generally available — GA release disclosure for Iceberg Topics across AWS, Azure, and GCP (framed as the "first in industry" Kafka-Iceberg streaming solution GA on multiple clouds). Elaborates the 2025-01-21 pedagogy launch with nine disclosed properties beyond the preview framing. Four table-management capabilities: custom hierarchical bucketed partitioning (operator-controllable Iceberg transforms for query-side pruning); built-in dead-letter queues for schema-invalid records (keeps the data-quality invariant without dropping batches); full Iceberg-spec-compliant schema evolution (adds/renames/deletes matching the Iceberg spec); automatic snapshot expiry as a broker-owned metadata-GC loop (retires the wiki's prior externalisation-cost caveat for the snapshot-expiry half; small-file compaction ownership remains open). Five catalog-integration capabilities: secure REST catalog sync via OIDC+TLS against Snowflake Open Catalog / Databricks Unity / AWS Glue; transactional writes via Iceberg's commit-protocol serialisation for safe concurrent multi-writer access; automatic table discovery and registration so downstream engines see new Iceberg-configured topics appear without manual `CREATE TABLE`; built-in object-store catalog fallback for deployments without a REST catalog; tunable workload-management knob for the snapshot-vs-live-topic lag ceiling (making the commit-cadence lag floor an explicit operational parameter). Adjacent 25.1 features: native consumer group lag metrics (Prometheus-exposed, replacing a PromQL compute), Protobuf schema normalization in the Schema Registry, SASL/PLAIN authentication, unified Console+cluster identity with fine-grained RBAC, and FluxCD removal for Kubernetes deployments. Tier-3 borderline-on-scope: vendor launch post, but the GA feature disclosure is architecturally substantive — retires two prior wiki caveats (snapshot-expiry ownership, Iceberg-spec-schema-evolution path) and canonicalises three new concepts + two new patterns. Architecture density ~40% on ~1,900-word body. Caveats: vendor framing throughout; "first in industry" unqualified; DLQ operational surface under-specified; transactional-write isolation level unstated; tunable workload-management knob name / default / range not disclosed.
- 2025-04-03 — Autonomy is the future of infrastructure — Alex Gallego's (founder/CEO) vision essay marking the $100M Series D + Redpanda Agents SDK preview launch. Frames the 20-year systems trajectory (single-node DB → managed SaaS → streaming/log substrate → Iceberg continuous-computation handshake → agent orchestration). Canonicalises Redpanda's founding premise "the truth is the log" (Kleppmann 2015), the send-model-to-data enterprise-AI thesis, the batch/streaming convergence framing, and the frontier-model + local-GPU-minion hybrid. Centerpiece: canonical founder-voice retrospective statement of Data Plane Atomicity as BYOC's central design tenet — "no deployment should be able to bring down any other deployment, including a control plane outage... No externalized consensus algorithm, secret managers, no external databases, no external offset recording service, or metadata look up as you are trying to write your data durably to disk." Reframes MCP from tool-description format to centralised integration proxy, with dynamic Redpanda Connect pipeline filtering (Bloblang + Starlark) as the future fine-grain-ACL mechanism. Introduces three new Redpanda systems on the wiki: systems/redpanda-byoc, systems/redpanda-agents-sdk, plus extends systems/redpanda-connect. Operational numbers: ~300 connectors, ~10× price-performance for fine-tuned small models, single-GPU inference for Llama3/Gemma3/DeepSeekV3/Phi-4, three-cloud BYOC (AWS/GCP/Azure) preview scope. Tier-3 borderline-on-scope; founder-voice vision essay + product-launch hybrid; architecture density ~50% concentrated on the Data Plane Atomicity tenet + MCP-as-proxy reframing + log-as-truth founding premise.
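The MCP-as-centralised-integration-proxy reframing reduces to one chokepoint through which every agent tool call flows, with per-tool filter pipelines as the fine-grain-ACL mechanism. A toy sketch of that shape — the Python predicates here stand in for the Bloblang/Starlark pipeline filters the essay gestures at; none of these names are Redpanda APIs:

```python
def mcp_proxy(tool_call, filters):
    # Single chokepoint: every agent tool call passes through the proxy,
    # where per-tool filters can redact fields or reject the call outright
    # before it reaches the backend system.
    for f in filters.get(tool_call["tool"], []):
        tool_call = f(tool_call)
        if tool_call is None:
            raise PermissionError("call rejected by filter")
    return tool_call


def redact_ssn(call):
    # Stand-in for a Bloblang/Starlark mapping that strips a sensitive field.
    return {**call, "args": {k: v for k, v in call["args"].items() if k != "ssn"}}


filters = {"crm.lookup": [redact_ssn]}
out = mcp_proxy(
    {"tool": "crm.lookup", "args": {"name": "alice", "ssn": "000-00-0000"}},
    filters,
)
```

The architectural point is the placement: because the proxy sits between agent and tool, ACL logic is deployed once and applies to every agent, rather than being re-implemented per integration.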
- 2025-03-18 — sources/2025-03-18-redpanda-3-powerful-connectors-for-real-time-change-data-capture|3 powerful connectors for real-time change data capture
— Product-altitude tour of
Redpanda Connect's four CDC input connectors
(`postgres_cdc`, `mysql_cdc`, `mongodb_cdc`, `gcp_spanner_cdc`), each riding on the source database's native change log: Postgres logical replication + replication slot / MySQL binlog with external offset cache / MongoDB change streams + oplog / Spanner change streams with transactional offset storage and dynamic partition split/merge handling. Canonicalises parallel snapshot of a single large table or collection as the Redpanda differentiator vs stock Debezium: "Debezium (Kafka Connect) does not do this today." Ships the parallel-snapshot capability in the Postgres and MongoDB connectors; the MySQL and Spanner connectors don't have it at publication. MySQL CDC topology scope is explicitly limited (no GTID, no Group Replication, no multi-source). Second canonical wiki instance of the CDC driver ecosystem — from the consumer side (Redpanda Connect writes drivers against every source database's native CDC API), bracketing the Vitess-VStream emitter-side instance already canonicalised. Tier-3 on-scope on engine-mechanism-canonicalisation grounds; architecture density ~60% of a feature-tour post; four new canonical concept pages + one new system page + one sub-concept pattern extension. Enterprise-license-gated in Redpanda Cloud + Self-Managed.
- 2025-02-11 — High availability deployment: Multi-region stretch clusters
— Part four of Redpanda's HA/DR series. Canonicalises the
multi-region stretch cluster as the RPO=0 shape (single
Redpanda cluster spans regions; per-partition Raft quorum on
every write; automatic leader re-election on region loss).
Positions it on the consistency-vs-availability axis against
MirrorMaker2 async two-cluster replication (non-zero RPO,
per-cluster availability). Canonicalises four operator knobs
for cross-region cost mitigation: leader pinning
(enterprise feature; bias leadership to client-proximal
region), `acks=1` (producer durability relaxation), follower fetching (KIP-392 closest-replica consume), remote read replica topic (object-storage-backed read-only mirror cluster). Publishes a three-broker Ansible `hosts.ini` template with region-as-rack (`rack=us-west-2`, `rack=us-east-2`, `rack=eu-west-2`) and an OMB + `tc` inter-broker-latency-injection simulation technique for multi-region performance testing without paying cross-region cloud bandwidth. Current limitation: Self-Managed on K8s is multi-AZ only; multi-region stretch is available on VMs / bare metal / cloud compute / Redpanda Cloud.
- 2025-01-21 — Implementing the Medallion Architecture with Redpanda — Pedagogy-altitude explainer on Databricks' three-tier Bronze/Silver/Gold data-lake pattern, positioning Redpanda's Iceberg topics as the mechanism that lets the streaming broker serve as the Bronze layer of a lakehouse without any external ETL (Airflow / Kafka Connect / Redpanda Connect). Canonicalises concepts/medallion-architecture, concepts/data-lakehouse, concepts/iceberg-topic, concepts/open-file-format on the wiki. Names Flink's Iceberg sink connector as the mechanism for real-time Bronze→Silver→Gold transitions (patterns/stream-processor-for-real-time-medallion-transitions). Tier-3 pedagogy altitude; no production numbers; no compaction-ownership / commit-cadence latency numbers.
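The RPO=0 property of the multi-region stretch-cluster entry above follows from Raft majority arithmetic under region-as-rack placement: with three replicas spread across three regions, any single region loss leaves a quorum, so every acknowledged write survives. A minimal sketch of that invariant (region names taken from the published `hosts.ini` template; the helper itself is illustrative):

```python
def has_quorum(replica_regions, lost_region):
    # Raft needs a strict majority of replicas alive to elect a leader
    # and keep accepting writes.
    alive = [r for r in replica_regions if r != lost_region]
    return len(alive) > len(replica_regions) // 2


# Region-as-rack placement: one replica per region, replication factor 3.
replicas = ["us-west-2", "us-east-2", "eu-west-2"]

# Any single region can be lost without losing quorum (RPO=0, automatic
# leader re-election in a surviving region).
assert all(has_quorum(replicas, lost) for lost in replicas)

# Contrast: all three replicas in one region — one region loss kills quorum.
assert not has_quorum(["us-west-2"] * 3, "us-west-2")
```

This is also where the cross-region cost comes from: that same majority must be reached on every write, which is what the leader-pinning / `acks=1` / follower-fetching knobs mitigate.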
- 2024-11-26 — Batch tuning in Redpanda to optimize performance (part 2)
— James Kinley's operations-manual companion to part 1.
Canonicalises four Prometheus private metrics
(
vectorized_storage_log_written_bytes,vectorized_storage_log_batches_written,vectorized_scheduler_queue_length,redpanda_cpu_busy_seconds_total) + five PromQL one-liners + the 4 KB NVMe-alignment batch-size floor + the write-caching broker feature + a real customer case study showing p99 128 ms → 17 ms and 2-cluster → 1-cluster consolidation at ~2.2× per-cluster throughput. - 2024-11-19 — Batch tuning in Redpanda for optimized performance (part 1)
— James Kinley's first-principles explainer on producer-side
batching. Canonicalises the fixed-vs-variable request-cost
framing, the `linger.ms` / `batch.size` / `buffer.memory` trigger logic, and the seven-factor effective-batch-size framework.
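The trigger logic the part-1 explainer walks through is a whichever-comes-first race: a batch is dispatched when it reaches `batch.size` bytes or when `linger.ms` elapses. A simplified single-partition sketch (defaults mirror the Kafka client configs the post discusses; the helper itself is illustrative, not client code):

```python
def batch_ready(batch_bytes, waited_ms, batch_size=16_384, linger_ms=5.0):
    # Send when either trigger fires: the batch filled up (batch.size)
    # or the producer has waited long enough (linger.ms).
    return batch_bytes >= batch_size or waited_ms >= linger_ms


assert batch_ready(16_384, 0.1)    # high-throughput producer: size fires first
assert batch_ready(512, 5.0)       # slow producer: linger bounds added latency
assert not batch_ready(512, 0.1)   # neither trigger yet: keep accumulating
```

This is the fixed-vs-variable cost trade-off in miniature: larger effective batches amortize the fixed per-request cost, while `linger.ms` caps how much latency you pay to build them (and part 2's 4 KB NVMe-alignment floor bounds `batch.size` from below).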
Related¶
- companies/index
- systems/kafka — Redpanda implements Apache Kafka's wire protocol; most content applies to both systems.