Google BigQuery¶
Google BigQuery is Google Cloud's
serverless enterprise data warehouse — a column-store OLAP engine
with compute-storage separation, SQL-native query surface, and
integrations with GCS and other
GCP data sources. On this wiki, BigQuery surfaces as an
Iceberg-reading query engine via its CREATE EXTERNAL TABLE
... format = 'ICEBERG' primitive against Iceberg tables written
to GCS by a streaming or batch producer.
Product docs: cloud.google.com/bigquery.
Architectural role on this wiki¶
BigQuery is a canonical consumer of the
external table over Iceberg metadata pointer pattern, in which a
query engine registers an Iceberg table by pointing at a specific
vN.metadata.json file in object storage rather than going
through a REST catalog service.
Worked DDL (verbatim from the source demo):
CREATE EXTERNAL TABLE YOUR_PROJECT_ID.YOUR_BIGQUERY_DATASET.YOUR_TABLE_NAME
WITH CONNECTION 'YOUR_FULL_CONNECTION_ID'
OPTIONS (
  format = 'ICEBERG',
  metadata_file_paths = ['gs://your-bucket-name/path/to/your/iceberg/table/metadata/vX.metadata.json']
);
YOUR_FULL_CONNECTION_ID identifies a
BigQuery Cloud Resource Connection,
the IAM-layer construct that lets BigQuery read the external GCS
bucket with appropriately scoped permissions.
Why it shows up in this corpus¶
BigQuery was named in Redpanda's 25.1 GA post as one of the Iceberg-compatible query engines (alongside ClickHouse, Snowflake, Databricks, Dremio, Spark SQL, Flink, Trino) but was not the featured integration for the GA walkthrough. The BYOC tutorial (2025-05-13) is where BigQuery becomes this wiki's canonical worked example of a catalog-protocol-agnostic Iceberg reader: one that uses a file-based metadata pointer rather than a REST catalog to find the table.
This is the read-side surface of the Redpanda BYOC → GCS → Iceberg
pipeline walked through in the source: a Redpanda Iceberg topic
projects row-oriented records into Parquet on GCS and writes Iceberg
metadata; BigQuery's CREATE EXTERNAL TABLE opens the metadata
JSON and serves SQL queries against the Parquet files without a
Redpanda REST catalog in the query path.
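To make the "opens the metadata JSON" step concrete, here is a minimal Python sketch of how any file-based reader can resolve the snapshot it will serve from a single metadata file. It is an illustration, not BigQuery's implementation. The field names (current-snapshot-id, snapshots, snapshot-id, manifest-list) come from the Iceberg table spec; the sample document, its paths, and the function name are hypothetical.

```python
import json
from typing import Optional

# Hypothetical, heavily trimmed Iceberg table metadata. A real
# vN.metadata.json carries many more fields (schemas, partition specs, ...).
EXAMPLE_METADATA = json.loads("""
{
  "format-version": 2,
  "location": "gs://your-bucket-name/path/to/your/iceberg/table",
  "current-snapshot-id": 2,
  "snapshots": [
    {"snapshot-id": 1, "manifest-list": "gs://your-bucket-name/path/to/your/iceberg/table/metadata/snap-1.avro"},
    {"snapshot-id": 2, "manifest-list": "gs://your-bucket-name/path/to/your/iceberg/table/metadata/snap-2.avro"}
  ]
}
""")


def current_manifest_list(metadata: dict) -> Optional[str]:
    """Return the manifest-list path of the current snapshot, or None for an empty table."""
    current_id = metadata.get("current-snapshot-id")
    # A missing id (or -1 in older metadata) means the table has no snapshot yet.
    if current_id is None or current_id == -1:
        return None
    for snapshot in metadata.get("snapshots", []):
        if snapshot["snapshot-id"] == current_id:
            return snapshot["manifest-list"]
    raise ValueError(f"current snapshot {current_id} is not listed in the metadata")


if __name__ == "__main__":
    print(current_manifest_list(EXAMPLE_METADATA))
```

Because the snapshot is resolved from one fixed file, a reader registered this way keeps serving that snapshot until it is pointed at a newer metadata file.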
Minimal-viable stub¶
This is a stub page anchored by a single Iceberg-external-table source. BigQuery's internals (Dremel execution engine, Capacitor column store, Borg scheduling, slot-based billing model) aren't canonicalised here; a deeper treatment waits on a dedicated BigQuery-internals source. The pattern's system role on this wiki is narrow: "the query engine that reads Iceberg via file-based catalog."
Caveats¶
- External tables show a static snapshot pointer. Per the
source: "update the external table definition in BigQuery if
the location of the latest metadata file changes or you want
to query a newer snapshot of the table data." New producer
snapshots don't auto-propagate without a refresh mechanism
(scheduled DDL re-run, event-driven refresh, or
ALTER EXTERNAL TABLE).
- Cost-per-query on external tables. BigQuery charges for the data scanned; Iceberg external tables can scan the full table if queries aren't shaped to hit partition / manifest pruning. Partition-pruning hygiene is a cost lever.
- Limited compared to BigLake / BigLake Iceberg tables. BigLake offers richer managed-table integration than the basic external-table path (including Iceberg REST catalog integration); the source's tutorial uses the simpler external-table shape.
- Cloud-bound. BigQuery reads data out of GCS efficiently but reading Iceberg tables from S3 (cross-cloud) requires network egress and a different connection-credential shape.
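The static-pointer caveat above implies a small refresh routine: find the newest vN.metadata.json and re-issue the table definition against it. The Python sketch below shows one way to do this. It only builds strings and does not call BigQuery. The function names are hypothetical, using CREATE OR REPLACE as the refresh statement is an assumption rather than something the source prescribes, and the option names are copied from the DDL quoted on this page and should be checked against current BigQuery documentation.

```python
import re
from typing import Iterable, Optional

# Matches object names such as ".../metadata/v12.metadata.json" (file-based catalog naming).
_METADATA_NAME = re.compile(r"(?:^|/)v(\d+)\.metadata\.json$")


def latest_metadata_path(object_paths: Iterable[str]) -> Optional[str]:
    """Pick the metadata file with the highest numeric version, or None if there is none."""
    best_version, best_path = -1, None
    for path in object_paths:
        match = _METADATA_NAME.search(path)
        # Compare versions as integers so that v10 sorts after v9.
        if match and int(match.group(1)) > best_version:
            best_version, best_path = int(match.group(1)), path
    return best_path


def refresh_statement(table: str, connection_id: str, metadata_path: str) -> str:
    """Render a refresh statement shaped like the DDL quoted on this page.

    Assumption: CREATE OR REPLACE is used to repoint the table. Inputs are
    interpolated as-is, so they must come from trusted configuration.
    """
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE {table}\n"
        f"WITH CONNECTION '{connection_id}'\n"
        "OPTIONS (\n"
        "  format = 'ICEBERG',\n"
        f"  metadata_file_paths = ['{metadata_path}']\n"
        ");"
    )


if __name__ == "__main__":
    # Hypothetical listing of a table's metadata directory.
    listing = [
        "gs://your-bucket-name/tbl/metadata/v1.metadata.json",
        "gs://your-bucket-name/tbl/metadata/v9.metadata.json",
        "gs://your-bucket-name/tbl/metadata/v10.metadata.json",
    ]
    newest = latest_metadata_path(listing)
    print(refresh_statement("YOUR_PROJECT_ID.YOUR_BIGQUERY_DATASET.YOUR_TABLE_NAME",
                            "YOUR_FULL_CONNECTION_ID", newest))
```

A scheduled job or a storage-event handler could run this selection and submit the rendered statement; the integer comparison matters because a plain string sort would rank v9 above v10.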
BigLake metastore as REST-catalog alternative¶
As of 2025-11-06 (Redpanda 25.3), Redpanda Iceberg Topics adds support for Google BigLake metastore as a REST-catalog option alongside the file-based pointer shape canonicalised above. The two shapes coexist:
- File-based catalog (2025-05-13 BYOC beta path above): BigQuery points at a specific vN.metadata.json; new producer snapshots require an external-table refresh.
- REST catalog via BigLake (25.3 path): BigQuery discovers tables via BigLake metastore + Dataplex governance; producer snapshots auto-propagate to reader catalog views.
See systems/google-biglake for the metastore system page.
Seen in¶
- sources/2025-11-06-redpanda-253-delivers-near-instant-disaster-recovery-and-more: BigLake metastore as the GCP REST-catalog reader-side shape. Redpanda 25.3 Iceberg Topics register into BigLake; BigQuery discovers the resulting streaming-produced tables without CREATE EXTERNAL TABLE DDL.
- sources/2025-05-13-redpanda-getting-started-with-iceberg-topics-on-redpanda-byoc: canonical wiki disclosure. BigQuery CREATE EXTERNAL TABLE ... format = 'ICEBERG' over a GCS-hosted Iceberg metadata file as the reader-side integration for a Redpanda BYOC Iceberg topic using the file-based catalog option.
Related¶
- systems/google-cloud-storage — the object store holding the Iceberg metadata JSON and Parquet data files.
- systems/apache-iceberg — the table format BigQuery reads.
- systems/redpanda-iceberg-topics — the canonical producer in the source.
- concepts/iceberg-file-based-catalog — the catalog shape BigQuery reads via this path.
- concepts/open-table-format · concepts/oltp-vs-olap — architectural context.
- patterns/external-table-over-iceberg-metadata-pointer — the pattern BigQuery's external-table primitive instantiates.
- systems/snowflake · systems/clickhouse — sibling Iceberg-compatible query engines named in the Redpanda GA post.