
Hotstar Emojis (Social Feed real-time emoji swarm)

Hotstar Emojis is the in-house real-time emoji-swarm feature on Hotstar's Social Feed — the live-sports OTT platform's equivalent of stadium placard-waving. Fans tap emojis while watching a match; the platform aggregates billions of submissions per tournament into a system-wide "mood" signal and pushes the changing swarm back to every concurrent viewer in near real-time.

Hotstar replaced a third-party vendor with this in-house system. It handled ~5 billion emojis from 55.83 M users during the ICC Cricket World Cup 2019 alone, and >6.5 B lifetime at the time of the 2024 writeup. (Source: sources/2024-03-26-highscalability-capturing-a-billion-emojions.)

Architecture

Four decomposed stages behind load balancers with autoscaling:

  1. HTTP ingest (Golang). Clients POST emoji submissions over HTTP. The Go service writes each message to an in-process channel and returns 200 immediately. A background goroutine runs the Kafka producer (see patterns/async-buffered-kafka-produce), flushing every 500 ms or every 20,000 messages, whichever comes first. Kafka client libraries: Confluent's confluent-kafka-go or Shopify's Sarama.
  2. Message queue (Knol / Kafka). Raw emoji submissions land on a Kafka topic managed via Knol, Hotstar's internal managed-Kafka data platform. Kafka chosen over alternatives for "high throughput, availability, low latency and support for consumer groups."
  3. Stream processing (Spark Streaming). A Spark job consumes the Kafka topic and emits 2-second micro-batched aggregates (emoji counts per interval; see concepts/micro-batching) to a second Kafka topic. Spark was picked over Flink/Storm/Kafka Streams for its micro-batch-and-aggregate fit and community support. (The choice is period-specific; see caveats on the source page.)
  4. Delivery. A Python Kafka consumer reads the aggregate topic, normalises, picks the top-N most popular emojis, and pushes them to PubSub — Hotstar's in-house WebSocket fanout service built to deliver messages over 50M concurrent socket connections. Clients animate the emoji swarm from the PubSub stream.

Design principles named in the post

  • Scalability — horizontal, load-balanced, autoscaled components.
  • Decomposition — HTTP ingest, queue, stream processor, delivery are independently scalable services. Failure at one stage doesn't cascade: a Spark GC pause doesn't stall HTTP ingest because HTTP never waited for Spark synchronously.
  • Asynchronous processing — "execution without blocking resources and thus supports higher concurrency." The async-buffered Kafka producer is the concrete instance of this principle.

Delivery semantics

At-most-once on the ingest path. If the Go process or goroutine dies between a client's chan <- msg and the next Kafka flush, up to one unflushed buffer of submissions (at most 500 ms or 20,000 messages' worth) is lost. Hotstar explicitly picks this over a synchronous ack because "we need very low latency and data loss in rare scenarios is not a big concern (although we haven't seen any so far)." See concepts/at-most-once-delivery for the class, and patterns/async-buffered-kafka-produce for the implementation shape.

Generalisation to Voting

Hotstar reframes the pipeline as "process quantifiable user responses in near real-time" — the same infrastructure powers Voting for Bigg Boss (Telugu/Tamil/Malayalam) and Dance Plus, with ~3 billion votes processed to date. Polls and Trivia slot in on the same platform. Only the aggregation function (per-candidate count vs per-emoji count) and the delivery endpoint change. This is the source's core architectural lesson: build the verb (quantifiable response aggregation), not the noun (emojis).
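The shared "verb" is a per-interval count keyed by whatever the product needs, plus a top-N selection for delivery. A minimal sketch (function names are mine, not Hotstar's; the real aggregation runs in Spark and the top-N selection in the Python delivery consumer):

```go
package main

import (
	"fmt"
	"sort"
)

// countByKey counts quantifiable responses per key within an interval.
// The key is an emoji for the swarm, a contestant ID for Voting, an
// option ID for Polls — only the key's meaning changes.
func countByKey(events []string) map[string]int {
	counts := make(map[string]int)
	for _, k := range events {
		counts[k]++
	}
	return counts
}

// topN picks the n most frequent keys, as the delivery stage does before
// pushing aggregates to PubSub.
func topN(counts map[string]int, n int) []string {
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return counts[keys[i]] > counts[keys[j]] })
	if len(keys) > n {
		keys = keys[:n]
	}
	return keys
}

func main() {
	interval := []string{"🏏", "🎉", "🏏", "😡", "🏏", "🎉"}
	fmt.Println(topN(countByKey(interval), 2)) // the two most popular emojis this interval
}
```

Swapping emojis for contestant IDs changes the input stream and the delivery endpoint, not this code — which is the point of building the verb rather than the noun.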

Operational numbers

  • ~5 B emojis / 55.83 M users — ICC Cricket World Cup 2019.
  • >6.5 B emojis lifetime as of the 2024 writeup.
  • ~3 B votes on the generalised platform (Bigg Boss / Dance Plus).
  • 500 ms — Kafka producer flush interval.
  • 20,000 — max messages per Kafka producer flush request.
  • 2 s — Spark Streaming micro-batch window.
  • 50 M — concurrent WebSocket connections on the PubSub tier.
