Godzilla (Dropbox 7th-gen dense multi-GPU platform)¶
Definition¶
Godzilla is Dropbox's 7th-generation dense multi-GPU platform, rolled out in 2025. It supports up to 8 interconnected GPUs per server and is positioned for high-throughput ML training and LLM fine-tuning workloads. It complements systems/gumby (flexible, inference-leaning) at the opposite end of the accelerator tier.
Target workloads¶
- LLM testing — benchmark runs, regression tests on models.
- LLM fine-tuning — adapting foundation models to Dropbox-specific data.
- Other high-throughput ML workflows — anywhere multi-GPU parallelism is load-bearing.
Interconnect¶
Up to 8 GPUs "interconnected" — the post doesn't specify NVLink vs NVSwitch vs Infinity Fabric. For LLM training, interconnect bandwidth is the usual bottleneck after raw FLOPs; Dropbox is deliberately building for that class of workload rather than for inference-fanout.
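The bandwidth point can be made concrete with a back-of-envelope ring all-reduce calculation. The model size and link speeds below are illustrative assumptions, not Godzilla specs:

```python
# Why interconnect bandwidth gates multi-GPU training: a ring all-reduce
# moves 2 * (n-1)/n of the gradient bytes over each link per step, so
# link speed directly bounds step time once compute is fast enough.

def ring_allreduce_seconds(param_bytes: float, n_gpus: int, link_gb_s: float) -> float:
    """Lower-bound time for one ring all-reduce of param_bytes gradients,
    with link_gb_s of per-GPU interconnect bandwidth in GB/s."""
    traffic = 2 * (n_gpus - 1) / n_gpus * param_bytes
    return traffic / (link_gb_s * 1e9)

# A hypothetical 7B-parameter model with fp16 gradients (2 bytes each):
grad_bytes = 7e9 * 2
slow = ring_allreduce_seconds(grad_bytes, 8, 64)    # PCIe-class link
fast = ring_allreduce_seconds(grad_bytes, 8, 900)   # NVLink-class link
print(f"{slow:.3f}s vs {fast:.3f}s per all-reduce")
```

At these assumed numbers the per-step sync cost differs by roughly an order of magnitude, which is the gap a dense, tightly interconnected SKU is buying.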
Why Dropbox is building this¶
Framed as a product-first hardware investment: Dash and related AI products demand dedicated ML training capacity. Quote from the 7th-gen post:
Features like intelligent previews, document understanding, fast search, and video processing (plus more recent work with large language models) all require serious computing muscle... These workloads demand high parallelism, massive memory bandwidth, and low-latency interconnects — requirements that traditional CPU-based servers can't economically support.
Relationship to Gumby¶
Gumby and Godzilla are the two accelerator SKUs in the 7th-gen rollout. Gumby is a Crush variant with PCIe GPU slots optimized for variable-TDP inference/embedding workloads; Godzilla is a purpose-built dense-GPU server. The split encodes a general principle: accelerator platforms should be product tiers keyed to workload shape, not one "GPU server" SKU.
Software axis — low-bit training/inference¶
Godzilla's dense multi-GPU shape is the substrate for low-bit LLM fine-tuning and high-throughput inference. On Blackwell-class silicon it runs MXFP/NVFP4 workloads natively via Tensor Core block_scale-enabled MMAs; pre-MXFP GPUs run AWQ/HQQ-style A16W4 with explicit software dequantization. Format portability (sm_100 tcgen05.mma vs sm_120 mma.sync) is an active operational concern for Dropbox's kernel stack (gemlite) across this tier. (Source: sources/2026-02-12-dropbox-how-low-bit-inference-enables-efficient-ai)
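The A16W4 regime can be sketched in a few lines of NumPy: 4-bit integer weights plus a per-block scale and zero-point, dequantized to 16-bit before the matmul. Block size, shapes, and the min/max scaling rule here are illustrative assumptions, not gemlite's actual kernels:

```python
import numpy as np

BLOCK = 32  # MX-style formats share one scale per 32-element block

def quantize_a16w4(w: np.ndarray):
    """AWQ/HQQ-style asymmetric 4-bit weight quantization, per block."""
    blocks = w.reshape(-1, BLOCK).astype(np.float32)
    lo = blocks.min(axis=1, keepdims=True)
    hi = blocks.max(axis=1, keepdims=True)
    scale = (hi - lo) / 15.0                 # map block range onto codes 0..15
    q = np.clip(np.round((blocks - lo) / scale), 0, 15).astype(np.uint8)
    return q, scale, lo                      # lo doubles as the zero-point

def dequantize_a16w4(q, scale, zero):
    """The explicit software dequant pre-MXFP GPUs must run before a
    16-bit MMA; Blackwell's block_scale MMAs fold this into the MMA."""
    return (q.astype(np.float32) * scale + zero).astype(np.float16)

w = np.random.randn(4, 64).astype(np.float16)
q, scale, zero = quantize_a16w4(w)
w_hat = dequantize_a16w4(q, scale, zero).reshape(w.shape)
print("max abs error:", np.abs(w.astype(np.float32) - w_hat.astype(np.float32)).max())
```

The native MXFP path differs mainly in where this work happens: the per-block scales ride alongside the low-bit codes into the Tensor Core, so no separate dequant pass occupies the GPU.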
Seen in¶
- sources/2025-08-08-dropbox-seventh-generation-server-hardware — 7th-gen hardware introduction; Godzilla's role in LLM training/fine-tuning.
- sources/2026-02-12-dropbox-how-low-bit-inference-enables-efficient-ai — the quantization stack running on Godzilla-class training/fine-tuning hardware; MXFP/NVFP native support on Blackwell.