Understanding Kafka Through a Rapido Ride

One GPS ping. One moving dot. A lot of Kafka in between.


You open Rapido. You book a ride. A dot appears on your map and it moves. That moving dot feels simple. It is not.

Behind it is a system handling tens of thousands of GPS pings every single second without losing track of any of them. Here's how it works.

First, What Is a Message Stream?

Before we get into the problem, one term worth understanding.

A message stream is a continuous, ordered flow of events. Think of it like a WhatsApp group that never stops. Messages come in one after another, each in order, each with a timestamp. You can scroll up and read old ones. New ones keep arriving. Nobody pauses the chat to let you catch up.

The Obvious Fix

The naive approach: captain's phone sends location → saves to database → your phone fetches it.

Simple. Clean. Completely broken at scale.

At 6 PM on a weekday, Rapido has roughly 100,000 active rides. Every captain's phone sends a GPS ping every 2 seconds. That's 50,000 database writes per second just for location. And that's before billing, notifications, or anything else.
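The arithmetic behind that figure, as a quick sketch that uses only the numbers quoted above (the variable names are just for illustration):

```python
# Numbers quoted in the text above.
active_rides = 100_000   # rides in progress at peak
ping_interval_s = 2      # one GPS ping per captain every 2 seconds

writes_per_second = active_rides // ping_interval_s
writes_per_hour = writes_per_second * 3600

print(writes_per_second)  # 50000
print(writes_per_hour)    # 180000000
```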

Databases are built to store and query structured data safely. They are not built to absorb a relentless firehose of tiny, high-speed updates. Under that load, the DB chokes. Queries slow down. The whole system falls over.

So engineers put something in the middle.


Kafka — The Central Message Pipeline

Kafka is a message streaming system. Think of it as a massive, organised river of events. Instead of every service shouting directly at the database, they drop messages into this river. Other services sit downstream and pick up what they need at their own pace.

This is what makes Kafka a central message pipeline: every service that generates data flows into it, and every service that needs data reads from it. Nobody talks to anyone else directly. If you add a new service tomorrow, it just subscribes to the existing stream. Zero changes to any producer. Zero disruption to anything already running.

Three roles. That's all:

Producer: any service that creates an event and sends it to Kafka. The captain's phone (via the Location Service) is a producer. It publishes and moves on. It doesn't care who reads the message.

Kafka: the river. It receives every event, writes it to disk in order, and holds it for a configured retention period (7 days by default). Any number of services can read the same message, because reading a message does not remove it.

Consumer: any service that reads from Kafka and does something with the data. The Live Tracking Service (the one that moves the dot on your screen) is a consumer.
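To make the three roles concrete, here is a toy in-memory model. It is only a sketch: the `Log` class and its methods are hypothetical stand-ins, not the Kafka client API, and nothing here touches disk or the network.

```python
class Log:
    """A toy stand-in for Kafka: an append-only, ordered list of events."""

    def __init__(self):
        self._events = []

    def append(self, event):
        # Producers only ever add to the end; the return value is the event's position.
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, position, max_events=10):
        # Consumers ask for events starting at a position they track themselves.
        return self._events[position:position + max_events]


log = Log()

# Producer: publishes and moves on.
for lat in (12.9716, 12.9720, 12.9725):
    log.append({"captain_id": 441, "lat": lat, "lng": 77.5946})

# Two consumers read the same events independently, at their own pace.
tracking_seen = log.read_from(0)                 # reads everything so far
analytics_seen = log.read_from(0, max_events=1)  # slower, reads one event

print(len(tracking_seen), len(analytics_seen))  # 3 1
```

Reading never removes anything from the log, which is why a second consumer can show up later and still see every event.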


Topics — The Named Channels

Kafka doesn't dump everything into one giant pile. Events are organised into topics. These are named channels where related events live together.

In Rapido's world:

  • captain-location — GPS pings from every active captain, every 2 seconds

  • ride-events — ride created, captain assigned, ride completed

  • payment-events — payment initiated, success, failure, refund

Each topic is its own stream. The Live Tracking Service only subscribes to “captain-location”. It ignores everything else. Billing Service only reads “ride-events”. Clean separation.
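A rough sketch of that separation, using a plain dictionary as a pretend broker. The `publish` and `poll` helpers and the `R-1` ride ID are made up for illustration; a real service would use a Kafka client library instead.

```python
from collections import defaultdict

# Topic name -> ordered list of events (a toy model, not the Kafka client API).
topics = defaultdict(list)

def publish(topic, event):
    topics[topic].append(event)

def poll(subscriptions):
    """Return only the events from the topics a service subscribed to."""
    return {name: list(topics[name]) for name in subscriptions}

publish("captain-location", {"captain_id": 441, "lat": 12.9716, "lng": 77.5946})
publish("ride-events", {"ride_id": "R-1", "type": "ride_completed"})
publish("payment-events", {"ride_id": "R-1", "type": "payment_success"})

live_tracking = poll(["captain-location"])
billing = poll(["ride-events"])

print(sorted(live_tracking))  # ['captain-location']
print(sorted(billing))        # ['ride-events']
```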


Partitions — Splitting One River into Lanes

Here's the problem: “captain-location” receives 50,000 messages per second. Push all of that through one server and one log file, and that server becomes the bottleneck for every write and every read.

So Kafka splits each topic into partitions: independent lanes, spread across the servers (brokers) in the cluster, each handling a slice of the load.

The key rule: use the captain's ID as the message key, and the same captain always goes to the same partition.

Why does this matter? Because within one partition, messages are strictly ordered. All of Captain 4's pings arrive in sequence — ping 1, ping 2, ping 3. If they scattered across partitions, you'd have no idea which location came first.
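Here is a minimal sketch of key-based partitioning. Kafka's default partitioner hashes the message key with murmur2; this example swaps in CRC32 from the standard library as a stand-in, and the partition count of 4 is an assumption.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed for illustration

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stable hash of the key, modulo the partition count.
    # Kafka's default partitioner uses murmur2; CRC32 is a stand-in here.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

partitions = defaultdict(list)

# Interleaved pings from two captains: (captain_id, sequence number).
pings = [("C-4", 1), ("C-9", 1), ("C-4", 2), ("C-9", 2), ("C-4", 3)]
for captain_id, seq in pings:
    partitions[partition_for(captain_id)].append((captain_id, seq))

# Every ping from C-4 sits in one partition, in the order it was sent.
c4_partition = partition_for("C-4")
c4_pings = [seq for cid, seq in partitions[c4_partition] if cid == "C-4"]
print(c4_pings)  # [1, 2, 3]
```

Because the hash depends only on the key, two captains may share a partition, but one captain's pings never split across two.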


Why Kafka Is Fast and Scalable

You might wonder — if Kafka is writing every single event to disk, how is it possibly fast?

Three specific reasons:

Sequential writes. Kafka only ever appends to the end of a file. It never jumps around on disk looking for a free slot. Sequential disk writes are far faster than random ones, and the operating system's page cache absorbs them in memory before flushing. Your hard drive hates random access. Kafka never asks for it.
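A small sketch of what "append only" looks like on disk. The length-prefixed record layout and the file name are assumptions for illustration, not Kafka's actual segment format.

```python
import os
import tempfile

def append_record(f, payload: bytes) -> int:
    """Append one length-prefixed record and return the byte position it starts at."""
    start = f.tell()
    f.write(len(payload).to_bytes(4, "big"))
    f.write(payload)
    return start

def read_all(path):
    """Read the records back front to back, in the order they were written."""
    records = []
    with open(path, "rb") as f:
        while header := f.read(4):
            records.append(f.read(int.from_bytes(header, "big")))
    return records

path = os.path.join(tempfile.mkdtemp(), "partition-0.log")
with open(path, "ab") as f:  # "a" = append: every write goes to the end of the file
    positions = [append_record(f, p) for p in (b"ping-1", b"ping-2", b"ping-3")]

print(positions)       # [0, 10, 20]
print(read_all(path))  # [b'ping-1', b'ping-2', b'ping-3']
```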

Zero-copy transfer. When Kafka sends messages to a consumer, it doesn't copy the data through the application's memory. The operating system moves it from its file cache straight to the network socket (the sendfile system call on Linux). The CPU barely gets involved. This gives Kafka a large throughput advantage.
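Python exposes the same operating-system facility, so the idea can be sketched in a few lines. This is an illustration of the mechanism, not of Kafka's code: the socket pair stands in for a broker-to-consumer connection, and the file name is made up.

```python
import os
import socket
import tempfile

# Write a small "log segment" to disk.
path = os.path.join(tempfile.mkdtemp(), "segment.log")
with open(path, "wb") as f:
    f.write(b"ping-1\nping-2\nping-3\n")

# A connected socket pair stands in for the broker-to-consumer connection.
broker_side, consumer_side = socket.socketpair()

with open(path, "rb") as segment:
    # socket.sendfile() uses os.sendfile() where available, so the kernel
    # moves the bytes to the socket without a read() into Python first.
    # Where it is not available, Python falls back to ordinary send().
    sent = broker_side.sendfile(segment)

broker_side.close()
received = b""
while chunk := consumer_side.recv(4096):
    received += chunk
consumer_side.close()

print(sent, received == b"ping-1\nping-2\nping-3\n")  # 21 True
```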

Parallel partitions. Partitions are spread across servers, so reads and writes happen on many partitions simultaneously. With 64 partitions, you can get up to roughly 64x the throughput of a single partition, as long as the brokers, disks, and network keep up. Need more capacity? Add partitions and broker nodes. Scaling is close to linear. No architectural rewrite needed.

This is why one Kafka cluster handles millions of events per second on normal hardware. It's not magic. It's these three decisions compounding.


Offset — The Bookmark

Every message inside a partition gets a number: offset 0, offset 1, offset 2... This number never changes. Message at offset 5,482 will always be at offset 5,482.

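The consumer fetches messages in offset order. After successfully processing a message, it commits: "I'm done with offset 5,482. Give me 5,483 next time."

If the consumer crashes before committing, it restarts from the last committed offset and Kafka serves those messages again. Nothing is skipped. The trade-off is that a message processed just before the crash can be delivered twice, so consumers should tolerate duplicates.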
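A minimal simulation of commit-then-crash, assuming a toy `Consumer` class (hypothetical, not the Kafka client). It shows why nothing is skipped, and also why a message can be processed a second time after a restart.

```python
partition = [f"ping-{i}" for i in range(6)]  # offsets 0..5

class Consumer:
    def __init__(self):
        self.committed = 0   # next offset to read after a restart
        self.processed = []  # side effects, e.g. pushes to the rider's phone

    def run(self, log, crash_before_commit_at=None):
        offset = self.committed
        while offset < len(log):
            self.processed.append(log[offset])  # process the message
            if offset == crash_before_commit_at:
                return "crashed"                # died before committing
            self.committed = offset + 1         # commit: "next time, start here"
            offset += 1
        return "caught up"

consumer = Consumer()
print(consumer.run(partition, crash_before_commit_at=3))  # crashed
print(consumer.committed)                                 # 3
print(consumer.run(partition))                            # caught up
print(consumer.processed)
# ['ping-0', 'ping-1', 'ping-2', 'ping-3', 'ping-3', 'ping-4', 'ping-5']
```

`ping-3` shows up twice: it was processed, the crash hit before the commit, and the restart replayed it. For a map dot that is harmless, since the second push just redraws the same position.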


Consumer Groups — Sharing the Load

One consumer server can't keep up with 50,000 messages per second either, not when each message has to be processed and pushed to a rider's phone. So Kafka lets you create a consumer group: a team of consumers working together.

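Each partition gets assigned to exactly one consumer in the group. With four partitions and four consumers in the Live Tracking Service, each consumer owns one partition.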

All four work in parallel. Together, they handle everything. No consumer reads another's partition. No duplicated work.

And no other group has to be told about any of it.

A completely separate consumer group — the Analytics Service, counting captains per zone for surge pricing — reads the exact same partitions independently, with its own offset bookmark. It might be 3 messages behind the Live Tracking Service. It doesn't matter. The messages are still there. It catches up at its own pace. Same data, different purpose, zero interference.
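The two ideas above, one owner per partition inside a group and separate bookmarks across groups, can be sketched like this. The round-robin `assign` function, the consumer names, and the offset values are assumptions for illustration; Kafka ships several assignment strategies of its own.

```python
def assign(partitions, consumers):
    """Round-robin: each partition goes to exactly one consumer in the group."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = [0, 1, 2, 3]

tracking = assign(partitions, ["tracking-1", "tracking-2", "tracking-3", "tracking-4"])
analytics = assign(partitions, ["analytics-1", "analytics-2"])

print(tracking)   # {'tracking-1': [0], 'tracking-2': [1], 'tracking-3': [2], 'tracking-4': [3]}
print(analytics)  # {'analytics-1': [0, 2], 'analytics-2': [1, 3]}

# Each group keeps its own committed offset per partition.
committed = {
    ("live-tracking", 0): 5484,
    ("analytics", 0): 5481,  # 3 messages behind, and that is fine
}
lag = committed[("live-tracking", 0)] - committed[("analytics", 0)]
print(lag)  # 3
```

A group with fewer consumers than partitions still covers everything. Each consumer simply owns more than one partition.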


The Full Flow — Captain Moves, Your Dot Moves

T = 0s: Captain C-441 moves. Phone sends GPS ping to Location Service.

T = 0.01s: Location Service (Producer) publishes to “captain-location” topic, keyed by captain ID. In this simplified 4-partition model, 441 % 4 = 1, so the message lands in Partition 1 at offset 5,483.

T = 0.05s: Consumer 1 of Live Tracking Service owns Partition 1. It reads offset 5,483. Gets “{ lat: 12.9716, lng: 77.5946 }”.

T = 0.1s: Consumer 1 pushes the coordinate to your phone via WebSocket — a persistent open connection between the server and your app.

T = 0.3s: The dot on your map moves.

T = 0.31s: Consumer 1 commits offset 5,483 to Kafka. Bookmark updated.

Total time from captain moving to your screen updating: under 500ms. No database was touched for location. The DB only gets written to once at ride completion, when billing calculates the fare from the final event.
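For a sense of what actually travels through that flow, here is a sketch of the message itself. Kafka records carry a key and a value as bytes; using the captain ID as the key and JSON for the value is an assumption for this example, and the `encode` and `decode` helpers are hypothetical.

```python
import json

def encode(captain_id: str, lat: float, lng: float):
    """Producer side: turn one GPS ping into a (key, value) pair of bytes."""
    key = captain_id.encode("utf-8")  # the key decides the partition
    value = json.dumps({"lat": lat, "lng": lng}).encode("utf-8")
    return key, value

def decode(key: bytes, value: bytes):
    """Consumer side: recover the captain ID and the coordinates."""
    return key.decode("utf-8"), json.loads(value)

key, value = encode("C-441", 12.9716, 77.5946)
captain_id, location = decode(key, value)

print(captain_id, location)  # C-441 {'lat': 12.9716, 'lng': 77.5946}
```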


Why This Doesn't Break

Three guarantees baked into Kafka:

Replication: every partition is copied to multiple servers (a replication factor of 3 is common). One server dies, another replica takes over. Messages that the replicas had already acknowledged survive.

Offset commits: a message is only "done" after the consumer says so. Crash mid-processing, it replays. Nothing is silently skipped.

Retention: messages stay for the configured retention period (7 days here) regardless of whether anyone read them. If Rapido finds a billing bug next Tuesday, they can replay the entire week's “ride-events” and recalculate everything correctly.
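Retention and replay can be sketched in a few lines. Everything specific here is made up for illustration: the ride IDs, the distances, and the flat rate of 8 per km inside the handler. Only the 7-day window comes from the text.

```python
RETENTION_SECONDS = 7 * 24 * 3600  # 7 days, as in the text

def apply_retention(log, now):
    """Drop events older than the retention window, read or not."""
    return [(ts, event) for ts, event in log if now - ts <= RETENTION_SECONDS]

def replay(log, handler):
    """Re-run a (fixed) handler over every retained event, oldest first."""
    return [handler(event) for _, event in log]

DAY = 24 * 3600
now = 10 * DAY
log = [
    (1 * DAY, {"ride_id": "R-1", "distance_km": 4.0}),   # 9 days old: expired
    (5 * DAY, {"ride_id": "R-2", "distance_km": 2.5}),   # 5 days old: kept
    (9 * DAY, {"ride_id": "R-3", "distance_km": 10.0}),  # 1 day old: kept
]

retained = apply_retention(log, now)
# Hypothetical corrected fare rule: 8 per km.
fares = replay(retained, lambda e: (e["ride_id"], e["distance_km"] * 8))
print(fares)  # [('R-2', 20.0), ('R-3', 80.0)]
```

The replay only reaches as far back as the retention window. Anything older is gone, which is why the window has to be at least as long as the time you expect to need for spotting a bug.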


The End

Kafka is the reason your captain's dot moves smoothly instead of your app crashing Rapido's servers.

It decouples the people who create data from the people who need it. It handles scale that would kill a database. And it does it while keeping every message safe, ordered within its partition, and replayable.

The next time you watch that dot move across the map — that's Kafka, quietly doing its job.


Topics covered: message streams, producer-consumer pattern, topics, partitions, offsets, consumer groups, replication, real-time WebSocket push.