CONCEPT Cited by 1 source

Iceberg topic mode

Iceberg topic mode (the Redpanda topic-level configuration redpanda.iceberg.mode) selects how the broker projects a Kafka record into an Iceberg row when Iceberg Topics is enabled (iceberg_enabled: true). Three modes are defined: value_schema_id_prefix, value_schema_latest, and key_value.

Source: sources/2025-05-13-redpanda-getting-started-with-iceberg-topics-on-redpanda-byoc.
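
As a minimal sketch of the configuration surface, the helper below validates a mode name and builds the topic-property mapping. The function name iceberg_topic_config is hypothetical, and only the three modes named in the source are accepted. How the property is applied to a topic (CLI, admin API, or a Kafka admin client) is deliberately left out, as is the iceberg_enabled setting.

```python
# Modes named in the source; the real system may accept other values.
ICEBERG_MODES = ("value_schema_id_prefix", "value_schema_latest", "key_value")


def iceberg_topic_config(mode: str) -> dict:
    """Return the topic-level property mapping for the given Iceberg mode.

    Hypothetical helper: it only builds the mapping, it does not talk to a broker.
    """
    if mode not in ICEBERG_MODES:
        raise ValueError(
            f"unknown Iceberg mode {mode!r}; expected one of {ICEBERG_MODES}"
        )
    return {"redpanda.iceberg.mode": mode}
```

The returned mapping can then be handed to whichever tool is used to create or alter the topic.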

The three modes

  • value_schema_id_prefix — record value is expected to carry a Schema Registry wire-format prefix (magic byte + 4-byte schema ID). The broker reads the referenced schema, decodes the payload, and projects into a typed Iceberg table whose columns match the registered schema. Producers must write using the Schema Registry wire format (via kafka-avro-console-producer / SerializingProducer / the Redpanda Connect schema_registry_encode processor).
  • value_schema_latest — broker uses the latest-version schema of a registered subject without requiring the producer to prefix each record with a schema ID. The producer ships a plain serialised payload; the broker applies the latest schema at projection time.
  • key_value — broker writes the raw key + value + Kafka metadata (offset, partition, timestamp) into the Iceberg table without attempting to decode the payload into typed columns. Schema-less ingestion mode.
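
The wire-format prefix that value_schema_id_prefix relies on can be sketched in a few lines. This assumes the common Schema Registry framing: a zero magic byte followed by the schema ID as a 4-byte big-endian integer. The function names are hypothetical. Protobuf payloads usually carry additional message-index bytes after the schema ID, which this sketch does not model.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # assumed: Schema Registry framing starts with a zero byte
PREFIX_LEN = 5  # 1 magic byte + 4-byte schema ID


def add_schema_id_prefix(schema_id: int, payload: bytes) -> bytes:
    """Prepend the magic byte and big-endian schema ID to a serialised payload."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + payload


def split_schema_id_prefix(value: bytes) -> Tuple[int, bytes]:
    """Split a framed record value into (schema_id, payload)."""
    if len(value) < PREFIX_LEN:
        raise ValueError("record value too short to carry a schema ID prefix")
    magic, schema_id = struct.unpack(">bI", value[:PREFIX_LEN])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte {magic}")
    return schema_id, value[PREFIX_LEN:]
```

A producer writing plain bytes without this framing is what value_schema_latest and key_value are meant to accommodate.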

Verbatim framing from the source:

"Check that your Redpanda topic is configured with iceberg_enabled set to true and select the right redpanda.iceberg.mode (e.g., value_schema_id_prefix, value_schema_latest, or key_value). This configuration instructs Redpanda to write the topic data in the Iceberg format to the configured Tiered Storage location."

Trade-off axis

The modes form a structure-vs-coupling spectrum:

Mode | Producer coupling | Downstream ergonomics
value_schema_id_prefix | Must use SR wire format | Typed columns, full schema-evolution support
value_schema_latest | Plain serialised payload | Typed columns, but producers don't participate in schema versioning
key_value | Any bytes | Raw columns (key BYTES, value BYTES + metadata)

  • value_schema_id_prefix gives the best downstream analytics ergonomics (typed columns; safe Iceberg-spec schema evolution at the broker) but forces the producer side to adopt Schema Registry wire-format serialisation. Best fit for green-field streaming pipelines where the producer is also Redpanda-native (Redpanda Connect, Kafka clients with SR-aware serialisers).
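  • value_schema_latest relaxes the producer-side requirement but loses per-record schema-version tracking: every record is projected against whichever schema version is latest at projection time, so downstream readers only ever see the latest schema and records cannot be tied back to the version they were produced under.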
  • key_value is schema-less — the Iceberg table holds opaque BYTES columns plus Kafka metadata. Useful for backup / archival / replay workloads where the payload shape is heterogeneous or outside the producer's control, at the cost of forcing every downstream reader to decode the bytes themselves.
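
The trade-off above can be condensed into a small decision helper. This is a simplification for illustration, not a rule stated by the source: the function name and its two boolean inputs are hypothetical, and real deployments may weigh other factors such as Schema Registry availability.

```python
def choose_iceberg_mode(producer_uses_sr_wire_format: bool,
                        subject_registered: bool) -> str:
    """Pick the most structured mode the producer side can support."""
    if producer_uses_sr_wire_format:
        # Producers frame every record with a schema ID: typed columns
        # with per-record schema-version tracking.
        return "value_schema_id_prefix"
    if subject_registered:
        # Plain payloads, but a registered subject exists: typed columns
        # projected with the latest schema version.
        return "value_schema_latest"
    # No usable schema: fall back to opaque key/value bytes plus metadata.
    return "key_value"
```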

Architectural role

This is the configuration-surface primitive behind Iceberg Topics' "schema optional" property canonicalised on concepts/iceberg-topic. The broker decides whether to project typed columns by reading the topic-level redpanda.iceberg.mode + iceberg_enabled pair; mode selection determines whether the Iceberg table's schema is inherited from a registered Schema Registry subject or constructed as the minimal Kafka-metadata envelope.

The mode is also the entry point through which broker-level schema evolution (as disclosed at 25.1 GA) flows: a value_schema_id_prefix topic gets Iceberg-spec schema evolution (adds / renames / deletes) tracking the Schema Registry's compatibility envelope; a key_value topic has no broker-participated schema evolution at all (the Iceberg table's schema is fixed).
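
For a key_value topic the table carries only the Kafka-metadata envelope, so decoding moves to the reader. The sketch below models one such row and a reader-side JSON decode. The field names on KeyValueRow are hypothetical stand-ins for the key, value, offset, partition, and timestamp columns, and JSON is only an assumed payload format.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class KeyValueRow:
    """Hypothetical shape of one row read from a key_value Iceberg table."""
    key: Optional[bytes]
    value: Optional[bytes]
    offset: int
    partition: int
    timestamp_ms: int


def decode_json_value(row: KeyValueRow) -> Optional[Any]:
    """Decode the opaque value bytes as JSON; return None if that is not possible."""
    if row.value is None:
        return None
    try:
        return json.loads(row.value.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Heterogeneous payloads are expected in this mode, so a failed
        # decode is reported as None instead of raising.
        return None
```

Every consumer of the table has to carry logic like this, which is the downstream cost the schema-less mode trades for producer freedom.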

Relationship to Iceberg catalogs

Iceberg topic mode is orthogonal to the catalog-integration surface: any of the three modes can be paired with REST catalog sync (concepts/iceberg-catalog-rest-sync) or a file-based catalog. Mode controls the schema side of the Iceberg projection; catalog configuration controls the publication side (where the table's snapshot pointer lives).

Costs / caveats

  • value_schema_id_prefix producer-coupling cost. Every producer must use the Schema Registry wire format; non-SR-aware clients (or kafka-console-producer without SR configuration) will produce records that fail schema decode and (at 25.1 GA) route to the built-in DLQ.
  • Schema Registry availability in the write path. Both value_schema_id_prefix and value_schema_latest couple the broker's Iceberg projection to Schema Registry availability: if SR is unreachable, records route to the DLQ rather than the Iceberg table. key_value mode escapes this coupling.
  • Mode change is a breaking change for downstream readers. Switching an iceberg_enabled topic between modes changes the Iceberg table's schema shape; readers expecting typed columns will break on a switch to key_value and vice versa. The source does not discuss hot-reconfiguration safety.
  • Kafka-client-serializer interaction under schema evolution still open. The 2025-04-07 GA ingest left open "how Iceberg-topic schema changes interact with Kafka-client serializers (Avro / JSON Schema / Protobuf via a schema registry)"; this concept canonicalises the mode surface but not the full schema-evolution compatibility envelope.
  • Mode default not stated. The 2025-05-13 source walks value_schema_id_prefix as the demo choice without disclosing the system default.
  • Schema format support. Protobuf is walked in the demo; Avro and JSON Schema are implied by the Schema Registry integration but not separately enumerated.
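
The first two caveats can be illustrated with a toy routing model. It is not the broker's implementation: route_record and the lookup callable are hypothetical, payload decoding against the schema is not modelled, and the only behaviours encoded are the ones described above (a missing prefix or an unreachable Schema Registry sends the record to the DLQ, while key_value has no such dependency).

```python
from typing import Callable, Optional

TABLE, DLQ = "iceberg_table", "dlq"


def route_record(mode: str, value: bytes,
                 lookup: Callable[[Optional[int]], Optional[str]]) -> str:
    """Return where a record lands under the given mode.

    `lookup(schema_id)` stands in for Schema Registry: None as the ID means
    "latest version"; it raises ConnectionError when SR is unreachable.
    """
    if mode == "key_value":
        return TABLE  # no decode step, no Schema Registry dependency
    if mode == "value_schema_id_prefix":
        if len(value) < 5 or value[0] != 0:
            return DLQ  # record lacks the magic byte + schema ID prefix
        schema_id: Optional[int] = int.from_bytes(value[1:5], "big")
    elif mode == "value_schema_latest":
        schema_id = None  # always resolve the latest schema
    else:
        raise ValueError(f"unknown mode {mode!r}")
    try:
        schema = lookup(schema_id)
    except ConnectionError:
        return DLQ  # Schema Registry unreachable in the write path
    return TABLE if schema is not None else DLQ
```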
