Hotstar PubSub (50M concurrent real-time fanout)¶
Hotstar PubSub is Hotstar's in-house real-time message-fanout infrastructure that delivers server-pushed messages to connected clients over a WebSocket-class channel. Per the 2024 article, PubSub was "built at Hotstar to deliver messages to users in our Social Feed" and is documented in the separate Building PubSub for 50M Concurrent Socket Connections post. Scale claim: 50 million concurrent socket connections.
Not to be confused with Google Cloud Pub/Sub: this is Hotstar's own system, which merely shares the name.
Role in the emoji-swarm pipeline¶
Downstream of Spark Streaming's 2-second micro-batch aggregate output, a Python Kafka consumer reads the per-batch counts, trims them to the top-N most popular emojis, and pushes them to PubSub. PubSub fans the trimmed payload out to every connected client, which animates the emoji swarm on the Social Feed.
Trimming to top-N at the consumer before PubSub — rather than letting PubSub broadcast full count vectors — bounds per-socket bandwidth regardless of the long-tail shape of user submissions.
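The trimming step can be sketched in a few lines of Python. This is an illustration only, not Hotstar's consumer code: the function names `trim_top_n` and `build_payload`, the JSON payload shape, and the default `n=5` are all assumptions, and the Kafka read and the PubSub push are left out because the source does not describe those interfaces.

```python
import heapq
import json


def trim_top_n(counts: dict[str, int], n: int = 5) -> list[tuple[str, int]]:
    """Return the n highest-count (emoji, count) pairs, largest first.

    heapq.nlargest avoids sorting the whole long tail when n is small.
    """
    return heapq.nlargest(n, counts.items(), key=lambda kv: kv[1])


def build_payload(counts: dict[str, int], n: int = 5) -> bytes:
    """Serialise only the top-n emojis (hypothetical payload shape).

    The result holds at most n entries, so its size does not grow with
    the number of distinct emojis users submitted in the batch.
    """
    top = trim_top_n(counts, n)
    body = {"emojis": [{"emoji": e, "count": c} for e, c in top]}
    return json.dumps(body, ensure_ascii=False).encode("utf-8")
```

Calling `build_payload(batch_counts, n=5)` once per micro-batch yields a payload with at most five entries, whether the batch contained ten distinct emojis or ten thousand. That cap is what keeps per-socket bandwidth bounded.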
Why not just use a generic message-bus?¶
Hotstar's live-sports concurrency profile (cricket finals push upwards of 25M concurrent viewers) means the fanout problem is dominated by millions of subscribers per topic, not by publisher throughput or broker durability. That is the regime where generic brokers (Kafka, NATS, Redis pub/sub) stop being cheap: subscriber counts, not message counts, drive cost. PubSub is specialised for this regime, and the 50M concurrent sockets figure is the operational justification.
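A back-of-the-envelope calculation shows why subscriber count dominates. The sketch below assumes every socket subscribes to the same topic and that each 2-second micro-batch produces exactly one publish. The 200-byte payload size is a made-up figure for illustration, not a number from the source.

```python
def deliveries_per_second(subscribers: int, publishes_per_second: float) -> float:
    """Messages the delivery tier must write out per second for one topic."""
    return subscribers * publishes_per_second


def egress_bytes_per_second(
    subscribers: int, publishes_per_second: float, payload_bytes: int
) -> float:
    """Aggregate outbound bandwidth, ignoring framing and protocol overhead."""
    return deliveries_per_second(subscribers, publishes_per_second) * payload_bytes


# One publish every 2 seconds is 0.5 publishes per second.
PUBLISH_RATE = 0.5
SUBSCRIBERS = 50_000_000   # scale claim from the source
PAYLOAD_BYTES = 200        # assumed size, illustration only
```

Under these assumptions, `deliveries_per_second(SUBSCRIBERS, PUBLISH_RATE)` comes to 25 million socket writes per second, even though the publisher emits only one message every two seconds. At the assumed payload size, the aggregate egress is 5 GB/s.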
Seen in¶
- sources/2024-03-26-highscalability-capturing-a-billion-emojions — the delivery tier of the emoji-swarm / Voting real-time aggregation pipeline. Decouples Spark micro-batch cadence from client animation cadence; receives top-N emoji payloads pushed by a Python Kafka consumer.
Related¶
- systems/hotstar-emojis — the consumer system that pushes into PubSub.
- systems/hotstar-knol — the sibling Hotstar Kafka data platform.
- concepts/fanout-and-cycle — the general shape of one-to-many real-time delivery.