MobileNet¶
MobileNet is Google's family of efficient CNN architectures designed for on-device (mobile / embedded) inference. The defining architectural choice is depthwise-separable convolutions — factoring a standard convolution into a depthwise stage (one filter per input channel) followed by a 1×1 pointwise convolution — which drops parameter count and multiply-add count by roughly an order of magnitude at comparable accuracy on image-classification tasks. There are three major versions: v1 (2017, arXiv:1704.04861), v2 (2018, arXiv:1801.04381, inverted residuals + linear bottlenecks), and v3 (2019, arXiv:1905.02244, NAS-discovered blocks + h-swish + squeeze-and-excitation).
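The order-of-magnitude saving follows directly from the factoring. For a k×k convolution, a standard layer needs k·k·c_in·c_out weights, while the depthwise + pointwise pair needs k·k·c_in + c_in·c_out, a ratio of roughly 1/c_out + 1/k². A minimal arithmetic sketch (layer sizes are illustrative, not taken from any specific MobileNet variant):

```python
# Parameter-count comparison for a standard convolution vs. its
# depthwise-separable factoring. Pure arithmetic, no framework needed.

def standard_conv_params(k, c_in, c_out):
    # One k x k filter spanning all c_in input channels, per output channel.
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    # Depthwise stage: one k x k filter per input channel.
    # Pointwise stage: a 1 x 1 convolution mapping c_in -> c_out channels.
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 128, 256)   # 294,912 weights
sep = dw_separable_params(3, 128, 256)    #  33,920 weights
print(std, sep, round(std / sep, 1))      # ~8.7x fewer parameters
```

Multiply-adds shrink by the same ratio, since both stages touch each output pixel once; for a 3×3 kernel the reduction converges to roughly 8-9× as c_out grows, matching the v1 paper's headline figure.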
Stub page. The sysdesign-wiki's only current ingest mentioning MobileNet treats it as a building block of a larger distillation pipeline (YouTube real-time generative AI effects). Per-version architectural detail, latency-vs-accuracy numbers, and quantisation / compilation behaviour are not in the raw source. This page will expand as more sources that actually measure MobileNet-family behaviour are ingested.
Why the sysdesign-wiki cares about MobileNet¶
MobileNet exists because of an on-device inference constraint. Standard CNN architectures (VGG, ResNet) were designed for datacentre GPUs and don't fit mobile compute / memory / battery budgets. MobileNet is the canonical wiki instance of "architecture selected by the serving substrate, not by the training task" — the model class exists because phones exist, not because the vision task required it. This makes MobileNet recurring wiki vocabulary for on-device ML discussions:
- Encoder backbone for image classification / detection / segmentation on mobile.
- Student backbone in distillation pipelines targeting mobile inference.
- Block primitive reused inside other architectures (e.g. MobileNet-block decoders in UNet-style image-to-image students on mobile).
Usage in YouTube's real-time generative AI effects¶
The 2025-08-21 post names MobileNet twice (Source: sources/2025-08-21-google-from-massive-models-to-mobile-magic-tech-behind-youtube-real-time-generative-ai):
- As the encoder backbone of YouTube's on-device student model — "a design known for its performance on mobile devices".
- As the block primitive for the student's decoder — "a decoder that utilizes MobileNet blocks".
Both are UNet components; MobileNet is the substrate choice inside UNet's encoder / decoder slots rather than a standalone model.
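The slot structure can be made concrete with a shape-level trace of a UNet whose encoder and decoder positions are filled by MobileNet-style blocks. Everything here is hypothetical — the post discloses no channel counts, strides, or depths — but it shows how stride-2 MobileNet blocks downsample on the encoder side and fuse skip connections on the decoder side:

```python
# Shape-level sketch of a UNet with MobileNet blocks in the encoder and
# decoder slots. Channel counts and strides are illustrative placeholders,
# not values from the YouTube post.

def mobilenet_block(shape, out_ch, stride=1):
    """Depthwise-separable block: stride shrinks spatial dims,
    the 1x1 pointwise stage sets the output channel count."""
    h, w, _ = shape
    return (h // stride, w // stride, out_ch)

def upsample(shape, factor=2):
    h, w, c = shape
    return (h * factor, w * factor, c)

def unet_shapes(inp):
    # Encoder: each stride-2 MobileNet block halves resolution.
    skips, x = [], inp
    for out_ch in (16, 32, 64, 128):
        x = mobilenet_block(x, out_ch, stride=2)
        skips.append(x)
    # Decoder: upsample, concatenate the skip's channels,
    # and let a MobileNet block fuse them back down.
    for skip in reversed(skips[:-1]):
        x = upsample(x)
        x = (x[0], x[1], x[2] + skip[2])   # channel-wise skip concat
        x = mobilenet_block(x, skip[2])
    x = upsample(x)
    return mobilenet_block(x, 3)           # e.g. RGB image output

print(unet_shapes((256, 256, 3)))          # -> (256, 256, 3)
```

The trace returns the input spatial resolution, which is the property a UNet-style image-to-image student needs; the MobileNet blocks are interchangeable with any block primitive that maps (h, w, c_in) to (h/stride, w/stride, c_out).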
Reported metrics (from the ingested source set)¶
No MobileNet-specific latency / accuracy / parameter numbers are disclosed in the 2025-08-21 YouTube post. For canonical MobileNet benchmarks see the upstream v1/v2/v3 papers on arXiv.
Seen in¶
- sources/2025-08-21-google-from-massive-models-to-mobile-magic-tech-behind-youtube-real-time-generative-ai — MobileNet used as encoder backbone and decoder block in YouTube's on-device student model for real-time generative AI effects.