CONCEPT Cited by 1 source
BYOC data ownership for Iceberg¶
BYOC data ownership for Iceberg is the property that falls out when Redpanda BYOC (data plane in the customer's cloud account) is combined with Iceberg Topics (broker writes Parquet + Iceberg metadata to object storage): the customer's own bucket holds both the data files and the Iceberg metadata, with the customer's IAM, KMS keys, and lifecycle rules applying directly. Canonical framing from the source verbatim:
"For BYOC customers who already control their own object storage buckets, this means full control of your Iceberg data with zero compromises." (Source: sources/2025-05-13-redpanda-getting-started-with-iceberg-topics-on-redpanda-byoc)
This is a specialisation of BYOC's canonical Data Plane Atomicity property to the Iceberg data-output surface: the customer's analytical data, not just the streaming data, lives end-to-end inside the customer's cloud-provider trust boundary.
What "full control" buys¶
Three practical consequences:
Direct query access¶
The customer's own analytics engines (BigQuery, Amazon Athena, Snowflake external stages, Spark on EMR, Trino, DuckDB, or any other Iceberg-capable engine) can open the Iceberg metadata file directly from the customer's bucket with the customer's cloud IAM credentials. No Redpanda-hosted REST catalog sits in the query path. When combined with a file-based catalog, the Iceberg data producer (the broker) and the Iceberg data consumer (the customer's query engine) share a bucket with no intermediary service between them.
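As a rough illustration of what a catalog-free read starts from, the sketch below resolves the current snapshot of a table using nothing but the table-metadata JSON. The field names (`current-snapshot-id`, `snapshots`, `snapshot-id`, `manifest-list`) follow the Iceberg table spec; the bucket name, the paths, and the `current_manifest_list` helper are hypothetical, and a real engine would fetch the file from the bucket with the customer's own credentials rather than from an inline string.

```python
import json

# Hypothetical, trimmed-down Iceberg table metadata as it might sit in the
# customer's bucket. Only the fields used below are included.
SAMPLE_METADATA = json.dumps({
    "format-version": 2,
    "location": "gs://example-customer-bucket/redpanda/orders",
    "current-snapshot-id": 2,
    "snapshots": [
        {"snapshot-id": 1,
         "manifest-list": "gs://example-customer-bucket/redpanda/orders/metadata/snap-1.avro"},
        {"snapshot-id": 2,
         "manifest-list": "gs://example-customer-bucket/redpanda/orders/metadata/snap-2.avro"},
    ],
})


def current_manifest_list(metadata_json: str) -> str:
    """Return the manifest-list path of the table's current snapshot."""
    metadata = json.loads(metadata_json)
    current_id = metadata.get("current-snapshot-id")
    for snapshot in metadata.get("snapshots", []):
        if snapshot["snapshot-id"] == current_id:
            return snapshot["manifest-list"]
    # Covers empty tables, where no snapshot matches the current id.
    raise LookupError("table has no current snapshot")


if __name__ == "__main__":
    print(current_manifest_list(SAMPLE_METADATA))
```

Every path the reader follows from here (manifest list, manifests, data files) lives in the same customer-owned bucket, which is why bucket-level IAM alone is enough to gate the read.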
Data-residency compliance¶
For regulated workloads (PII, PHI, financial, sovereign-cloud):
- Storage-at-rest is in the customer's project under the customer's KMS keys — Redpanda's data plane writes with customer-scoped IAM; the customer never has to trust Redpanda with KMS key material.
- Auditing happens in the customer's audit log — object accesses land in the customer's Cloud Audit Logs / CloudTrail.
- Cross-cloud / cross-region data-movement controls — bucket location, VPC service controls, and egress rules are all customer-configured on a customer-owned resource.
This matters most for regulated customers whose compliance posture won't accept a vendor-operated storage trust anchor. BYOC already moves the streaming data plane inside this trust boundary; BYOC-Iceberg extends the property to the lakehouse data plane.
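Because every control listed above sits on a customer-owned resource, the customer can also check them mechanically. The following is a minimal sketch of such a check, assuming the bucket settings have already been exported into a simple record; the `BucketConfig` shape, its field names, and the `residency_violations` helper are all hypothetical and do not correspond to any cloud provider's API.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class BucketConfig:
    """Hypothetical snapshot of bucket settings exported from the cloud API."""
    name: str
    location: str
    kms_key: Optional[str]       # customer-managed key name, if any
    public_access_blocked: bool
    audit_logging_enabled: bool


def residency_violations(cfg: BucketConfig, allowed_locations: Set[str]) -> List[str]:
    """List the ways a bucket deviates from a simple residency policy."""
    problems = []
    if cfg.location not in allowed_locations:
        problems.append(f"location {cfg.location} is not in the allowed set")
    if not cfg.kms_key:
        problems.append("no customer-managed KMS key configured")
    if not cfg.public_access_blocked:
        problems.append("public access is not blocked")
    if not cfg.audit_logging_enabled:
        problems.append("data-access audit logging is disabled")
    return problems


if __name__ == "__main__":
    bucket = BucketConfig("example-customer-bucket", "us-east1", None, False, True)
    for problem in residency_violations(bucket, {"europe-west3"}):
        print(problem)
```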
Bucket-lifecycle / tiering ownership¶
- Object lifecycle rules (GCS Object Lifecycle Management, S3 Lifecycle) — transition older Iceberg data to colder / cheaper storage classes on schedules the customer controls.
- Cross-region replication — configure GCS Turbo Replication / S3 CRR on the bucket directly; the broker's writes replicate automatically.
- Intelligent Tiering — use S3 Intelligent-Tiering / equivalent without coordinating with Redpanda.
The usual trade-off of open table formats on object storage still applies: object-level features don't understand Iceberg table semantics, so tiering can make query latency unpredictable from one data file to the next. But the customer owns that trade-off, whereas in a Dedicated (non-BYOC) deployment the vendor owns it.
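One way a customer might soften that trade-off is to tier only the data files and leave the table metadata in the default storage class. The sketch below builds an S3 lifecycle configuration in the dictionary shape that S3's lifecycle API accepts (`Rules`, `Filter`, `Status`, `Transitions`). It assumes the conventional Iceberg layout with `data/` and `metadata/` directories under the table location, which the source does not confirm for Iceberg Topics; the `data_only_lifecycle` helper, the prefix, and the 90-day threshold are illustrative.

```python
from typing import Any, Dict


def data_only_lifecycle(table_prefix: str, days: int, storage_class: str) -> Dict[str, Any]:
    """Build an S3 lifecycle configuration that tiers only a table's data files."""
    # Assumption: data files live under <table_prefix>/data/ and metadata
    # under <table_prefix>/metadata/, so the rule never touches metadata.
    prefix = table_prefix.rstrip("/") + "/data/"
    rule_id = "tier-" + table_prefix.strip("/").replace("/", "-") + "-data"
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": storage_class}],
            }
        ]
    }


if __name__ == "__main__":
    import json
    print(json.dumps(data_only_lifecycle("redpanda/orders", 90, "STANDARD_IA"), indent=2))
```

The rule is still age-based and blind to snapshots, so a query over old partitions will hit colder objects; the prefix filter only keeps the metadata reads out of that path.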
What this is not¶
- Not a product-feature distinction. This concept describes the compound property of two existing product features (BYOC + Iceberg Topics) rather than a new Redpanda capability.
- Not a security claim against Redpanda. Even in non-BYOC Redpanda Cloud, Iceberg data lives in customer-controlled storage in many deployment shapes. BYOC changes the execution path (broker runs in the customer's account), not just the storage destination.
- Not a cost claim. The customer pays the cloud provider directly for bucket storage + egress either way; BYOC data ownership is an architectural property, not a pricing lever.
Where this compounds¶
The data-ownership property compounds across multiple axes when BYOC is combined with other features on this wiki:
- Data Plane Atomicity — BYOC's core tenet; write path has no runtime dependency on externalised services. For Iceberg workloads, this extends to the file-based catalog shape, where even the catalog is a customer-owned object, not a Redpanda-operated service.
- Send-model-to-data — Gallego's 2025-04-03 framing. Private data (including Iceberg tables) stays in the customer's VPC; models come to the data via in-VPC inference. BYOC-Iceberg is the data stratum of the same trust-boundary argument.
- concepts/digital-sovereignty — BYOC addresses sovereign-cloud / regulated-cloud requirements on the streaming side; BYOC-Iceberg extends the same posture to analytics.
Costs / caveats¶
- Bucket-misconfiguration risk shifts to the customer. Overly permissive bucket policies, accidental public-read, misrouted KMS encryption — the failure modes are now customer-owned. Redpanda's operational oversight applies to the broker, not the customer's bucket ACL hygiene.
- Bucket-availability risk shifts to the customer. GCS / S3 regional outages hit customer-owned buckets just as they would vendor-owned ones; the broker's Iceberg-write path blocks or buffers until the bucket recovers.
- Cost visibility fragments across two bills — customers now see Redpanda control-plane costs (on Redpanda's bill) plus object-storage costs (on the cloud provider's bill). Chargeback and budgeting get more complex.
- Data retrieval and migration asymmetry. If the customer leaves Redpanda, the Iceberg tables stay in their bucket, so portability favours the customer rather than locking them in to the vendor. But catalog coordination (who owns the REST catalog metadata, if one was used) is a separate migration axis.
- Policy is stored per-bucket, not per-table. Iceberg table-level ACLs (as in Unity Catalog / Polaris) aren't available on file-based-catalog direct-read paths; the customer's IAM policies have to express whatever granularity they need.
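For the last caveat, one common way to approximate table-level granularity with object-store IAM alone is to scope grants to each table's key prefix. The sketch below generates an AWS identity-based policy document of that kind. The policy grammar (`Version`, `Statement`, `s3:GetObject`, `s3:ListBucket`, the `s3:prefix` condition key) is standard IAM; the `table_read_policy` helper, the bucket name, and the assumption that each table occupies its own prefix are illustrative and not taken from the source.

```python
import json
from typing import Any, Dict


def table_read_policy(bucket: str, table_prefix: str) -> Dict[str, Any]:
    """Build a read-only IAM policy limited to one table's key prefix."""
    prefix = table_prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allows reading objects only under this table's prefix.
                "Sid": "ReadTableObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            },
            {
                # Allows listing the bucket, restricted to the same prefix.
                "Sid": "ListTablePrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(table_read_policy("example-customer-bucket", "redpanda/orders"), indent=2))
```

This stays a coarse substitute: it cannot express column- or row-level rules, and it has to be maintained by the customer as tables are added or moved.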
Seen in¶
- sources/2025-05-13-redpanda-getting-started-with-iceberg-topics-on-redpanda-byoc — canonical wiki disclosure. Redpanda's 2025-05-13 BYOC-beta Iceberg Topics walkthrough frames customer-owned-bucket + broker-projected-Iceberg as "full control of your Iceberg data with zero compromises". The GCS + BigQuery demo is the minimal worked example of the customer-bucket + customer-query-engine path that does not cross a Redpanda trust boundary.
Related¶
- systems/redpanda-byoc — the deployment model that provides the customer-VPC data plane.
- systems/redpanda-iceberg-topics — the feature the property composes with.
- concepts/data-plane-atomicity — BYOC's canonical tenet that this concept specialises for Iceberg.
- concepts/iceberg-topic — the streaming-to-Iceberg primitive.
- concepts/iceberg-file-based-catalog — the catalog shape most aligned with BYOC data ownership.
- concepts/digital-sovereignty · concepts/managed-data-plane — adjacent BYOC properties.
- systems/google-cloud-storage · systems/aws-s3 — the object stores BYOC customers typically own.
- patterns/streaming-broker-as-lakehouse-bronze-sink — the architectural pattern this concept configures the trust boundary for.