PATTERN
Sink connector as complement to broker-native integration¶
A streaming platform ships two integration surfaces for the same downstream system: (1) a broker-native integration optimised for the platform's own protocol and offering zero-ETL convenience, and (2) a sink connector that trades some of that convenience for flexibility — polymorphic sources, in-stream transformation, dynamic routing. The two surfaces are complements, not substitutes — each covers shapes the other cannot.
Canonical instance: Redpanda → Iceberg¶
Redpanda ships two distinct paths from its streaming layer into Apache Iceberg tables:
| Property | Iceberg Topics (broker-native) | Iceberg output (sink connector) |
|---|---|---|
| Primary value | Zero-ETL convenience | Integration flexibility |
| Sources | Redpanda Streaming topics only | 300+ Redpanda Connect inputs (HTTP, CDC, SQS, Kinesis, Pub/Sub…) |
| Schema evolution | Registry-driven (Avro/Protobuf/JSON schemas in Schema Registry) | Registry-less, data-driven from raw JSON |
| Routing | 1 topic → 1 table | Multi-table (Bloblang-interpolated) |
| Infrastructure | Zero extra components (in broker) | Stateless Redpanda Connect container on K8s |
| In-stream transforms | None (records land as-is) | Bloblang + Starlark processors before landing |
| Availability / licensing | Redpanda Cloud BYOC or Self-Managed EE | Redpanda Connect Enterprise tier |
Source: sources/2026-03-05-redpanda-introducing-iceberg-output-for-redpanda-connect.
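The sink-connector column above can be made concrete with a Redpanda Connect pipeline sketch. This is illustrative only: the `iceberg` output name and its fields (`catalog`, `table`) are assumptions inferred from the launch post's description, not confirmed configuration keys; the `input`/`pipeline`/`output` layout, the `http_server` input, and the `${! ... }` interpolation syntax are standard Redpanda Connect.

```yaml
# Hypothetical sketch: non-Kafka source → in-stream transform → Iceberg table.
# The output plugin name and its fields are assumed, not taken from the docs.
input:
  http_server:            # one of the 300+ non-Kafka inputs
    path: /ingest

pipeline:
  processors:
    - mapping: |          # Bloblang transform applied before anything lands
        root = this
        root.received_at = now()

output:
  iceberg:                # assumed plugin name for the Iceberg sink
    catalog: rest         # assumed field: REST catalog configuration
    # Bloblang interpolation routes each event to a per-type table
    table: events_${! this.event_type }
```

The interpolated `table` field is the capability the broker-native path cannot express: one stream fanning out to many tables based on record content.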
Why both exist¶
The launch post frames the relationship directly:
"If you're already running Redpanda, you might already be familiar with Iceberg Topics. They give you a zero-ETL path from broker to table that's streamlined for high-speed Kafka streams. Produce to a topic, and Redpanda handles the rest. For many workloads, that's all you need.
But maybe your data arrives from an HTTP webhook, a Postgres CDC stream, or a GCP Pub/Sub subscription. Maybe you need to normalize a payload, drop PII, or split a mixed event stream by type before anything hits the lakehouse. That's the gap this connector fills."
The broker-native path optimises for its own protocol's happy path. The sink connector fills the gap for everything else — non-native sources, transformation-required flows, and polymorphic-destination pipelines.
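The "normalize a payload, drop PII" case from the quote maps onto a single Bloblang processor. A minimal sketch, assuming hypothetical field names (`email`, `ssn`, `amount`) invented for illustration; `deleted()` and `.number()` are standard Bloblang:

```yaml
# Hypothetical Bloblang processor for the "drop PII, normalize" case the
# launch post describes; the payload field names are invented.
pipeline:
  processors:
    - mapping: |
        root = this
        root.email = deleted()             # strip PII before it reaches the lakehouse
        root.ssn = deleted()
        root.amount = this.amount.number() # normalise a stringly-typed field
```

This is exactly the work the broker-native path declines to do inline: records produced to an Iceberg Topic land as-is.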
Structural asymmetry¶
The two surfaces are not equivalent — each makes architectural choices the other doesn't:
- Broker-native collapses the table into the streaming substrate itself: no separate container, no separate lifecycle, no commit coordination across processes. The cost is rigidity (Kafka protocol only, registry-driven contract, 1-to-1 routing).
- Sink connector externalises the integration: it runs as its own process, can be replicated independently, gets its own isolation boundary. The cost is one more component to operate, and the connector must do its own commit coordination against the Iceberg catalog.
Neither is universally superior. High-throughput Kafka-protocol streams with well-defined schemas favour the broker-native shape; polymorphic or transformation-heavy streams favour the connector.
Generalisation beyond Iceberg¶
The same pattern recurs, directly or by analogy, across several streaming-platform ↔ downstream-system pairs:
- Kafka → relational databases — in the Kafka ecosystem, Kafka Connect sink connectors are the broker-external shape; per-database CDC-reverse projects (e.g. WarpStream's native Postgres sink, Debezium Server's JDBC sink) are attempts at the broker-native shape.
- Redpanda Connect CDC → Redpanda topics — the `postgres_cdc`/`mysql_cdc`/`mongodb_cdc`/`gcp_spanner_cdc` connectors themselves are sink-connector-altitude CDC readers from external DB engines into Redpanda topics, the inverse of the Iceberg output.
- Snowflake Streaming — the `snowflake_streaming` output (sources/2025-10-02-redpanda-real-time-analytics-redpanda-snowflake-streaming|2025-10-02 benchmark) is the sink-connector shape into Snowflake; Snowflake has no broker-native counterpart, but the shape is architecturally identical to the Iceberg output.
The general pattern: a streaming platform's integration layer ships connectors for destinations where a broker-native shape is either impossible (destination speaks only REST/SQL) or impractical (destination's semantics require transformation beyond what the broker is willing to do inline).
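The CDC bullet above runs the same shape in the opposite direction: an external DB engine read at sink-connector altitude, landing in a Redpanda topic. A sketch under stated assumptions: the `postgres_cdc` input is named in the text, but its fields shown here (`dsn`, `tables`) are assumed; the `redpanda` output with `seed_brokers`/`topic` is standard Redpanda Connect.

```yaml
# Hypothetical sketch of the inverse shape: Postgres CDC → Redpanda topic.
# The postgres_cdc input exists per the text; its field names are assumed.
input:
  postgres_cdc:
    dsn: postgres://user:pass@db:5432/app   # assumed field name
    tables: [ public.orders ]               # assumed field name

output:
  redpanda:
    seed_brokers: [ "redpanda:9092" ]
    topic: cdc.orders
```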
Trade-off summary¶
| Axis | Broker-native | Sink connector |
|---|---|---|
| Latency floor | Lower (no extra hop) | Higher (one process boundary) |
| Operational surface | Smaller | Larger (extra container) |
| Source diversity | Native protocol only | Arbitrary |
| Transformation | None inline | Bloblang / Starlark / language-agnostic plugins |
| Schema discipline | Contract-first (registry) | Reflection-first (data-driven) |
| Isolation | Couples to broker lifecycle | Independently scalable / upgradable |
Seen in¶
- Redpanda — Introducing Iceberg output for Redpanda Connect (2026-03-05) — canonical wiki instance. Explicit two-shape comparison table canonicalised verbatim; launch post frames the two paths as complementary, not competing.
Related¶
- systems/redpanda-connect-iceberg-output — the sink-connector side of the canonical instance
- systems/redpanda-iceberg-topics — the broker-native side
- systems/redpanda-connect
- systems/apache-iceberg
- concepts/iceberg-topic
- concepts/iceberg-catalog-rest-sync
- patterns/streaming-broker-as-lakehouse-bronze-sink — composable parent pattern
- patterns/broker-native-iceberg-catalog-registration — the broker-native side's registered pattern
- patterns/bloblang-interpolated-multi-table-routing — the capability the sink side brings that the broker-native side can't