Definition
Lambda Architecture is a data processing pattern that combines batch and stream processing to achieve both low latency and high correctness in large-scale systems.
Imagine your analytics team wants to answer: "How many rides happened today?" They need the answer to be fast (updated within seconds) and correct (every single ride counted, no duplicates). Pure streaming gives you speed but makes corrections hard when bugs appear. Pure batch processing gives you correctness but takes anywhere from 15 minutes to several hours to refresh. Lambda Architecture solves this by running both approaches simultaneously.
The Core Problem:
Consider a ride-sharing app processing 5 million trips per day. Product managers need real-time fraud alerts within 2 seconds. Finance needs trip revenue calculations that are correct to the cent for payouts. Marketing wants 6 months of historical data for cohort analysis. No single processing approach handles all three well.
Batch processing can recompute everything from scratch, making it easy to fix bugs by rerunning jobs over complete historical data. But it's slow: processing a day of data might take 30 to 60 minutes. Stream processing updates results as each event arrives, giving sub-second latency. But if you discover a calculation bug three months later, correcting historical stream results is complex and error-prone.
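To make that contrast concrete, here is a minimal Python sketch, not from the source; names like recompute_daily_revenue, revenue_fn, and the event fields are illustrative assumptions. The batch job is a pure function over the complete event log, so a bug fix is just a code change plus a rerun, while the streaming counter is fast but has already emitted its (possibly wrong) results downstream.

```python
# Minimal sketch of the tradeoff; event fields and helper names are
# illustrative assumptions, not from a specific system.

def recompute_daily_revenue(raw_trips, revenue_fn):
    """Batch style: a pure function over the complete, immutable event log.
    If revenue_fn had a bug, fix it and rerun this job over all of history."""
    totals = {}
    for trip in raw_trips:
        day = trip["completed_at"].date()          # completed_at is a datetime
        totals[day] = totals.get(day, 0.0) + revenue_fn(trip)
    return totals


class StreamingRevenue:
    """Stream style: update state per event for sub-second freshness.
    Totals already emitted downstream are hard to correct retroactively."""

    def __init__(self, revenue_fn):
        self.revenue_fn = revenue_fn
        self.total_today = 0.0

    def on_trip(self, trip):
        self.total_today += self.revenue_fn(trip)
        return self.total_today                    # emitted immediately
```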
The Lambda Solution:
Lambda Architecture uses three layers. The batch layer stores all raw events in immutable storage and periodically recomputes views from scratch, like daily aggregates or feature tables; it runs every 15 minutes to 24 hours depending on requirements. The speed layer processes the same events in real time, maintaining incremental views over recent data, typically the last few seconds to hours. The serving layer merges both: it returns batch results for historical queries and speed results for recent data.
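To make the three layers concrete, here is a minimal in-memory Python sketch; the names (recompute_batch_view, SpeedLayer, serving_query), the event dict shape, and the hourly granularity are assumptions for illustration, not a production design. A real deployment would keep batch views in a data lake or warehouse and run the speed layer on a stream processor, but the shape of the logic is the same: the batch layer rebuilds hourly ride counts from the full immutable log, the speed layer counts rides that arrive after the batch cutoff, and the serving layer adds the two.

```python
from collections import defaultdict

# Minimal in-memory sketch of the three layers for a "ride count" view.
# Event shape, names, and hourly granularity are assumptions for illustration.

def recompute_batch_view(event_log, batch_cutoff):
    """Batch layer: rebuild hourly ride counts from scratch from the
    immutable event log, up to the cutoff of the last batch run."""
    view = defaultdict(int)
    for event in event_log:
        ts = event["completed_at"]                 # a datetime
        if ts < batch_cutoff:
            view[ts.replace(minute=0, second=0, microsecond=0)] += 1
    return dict(view)


class SpeedLayer:
    """Speed layer: incremental count of rides after the batch cutoff."""

    def __init__(self, batch_cutoff):
        self.batch_cutoff = batch_cutoff
        self.recent_rides = 0

    def on_event(self, event):
        if event["completed_at"] >= self.batch_cutoff:
            self.recent_rides += 1


def serving_query(batch_view, speed_layer, since):
    """Serving layer: correct history from the batch view plus the fresh
    tail from the speed layer."""
    historical = sum(n for hour, n in batch_view.items() if hour >= since)
    return historical + speed_layer.recent_rides
```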
"Correctness comes from the batch layer, while low latency comes from the speed layer. You get both by running them in parallel."
For that "rides today" query, the serving layer returns: rides from midnight to 6am (from the last batch run, guaranteed correct) plus rides from 6am to now (from the speed layer, updated every second). Users see fresh data with a correctness guarantee.
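Continuing the sketch above, that query might look roughly like this, with the example's midnight and 6am boundaries hard-coded for illustration:

```python
from datetime import datetime

# Continuing the sketch above: "rides today", where the last batch run
# covered everything before 06:00.
midnight = datetime(2024, 5, 1, 0, 0)
batch_cutoff = datetime(2024, 5, 1, 6, 0)

event_log = [
    {"trip_id": "t1", "completed_at": datetime(2024, 5, 1, 2, 15)},
    {"trip_id": "t2", "completed_at": datetime(2024, 5, 1, 7, 40)},
]

batch_view = recompute_batch_view(event_log, batch_cutoff)   # counts t1
speed = SpeedLayer(batch_cutoff)
for event in event_log:
    speed.on_event(event)                                    # counts only t2

print(serving_query(batch_view, speed, since=midnight))      # -> 2
```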
✓ Lambda Architecture combines batch processing (high correctness, high latency) with stream processing (low latency, incremental updates) to get the benefits of both approaches
✓ The batch layer stores immutable raw events and periodically recomputes views from scratch, running every 15 minutes to 24 hours depending on business needs
✓ The speed layer processes events in real time with p50 latency typically under 1 second and p99 under 5 to 30 seconds, maintaining only recent state
✓ The serving layer merges results: historical data from batch views plus fresh data from speed views, presenting a unified interface to queries
1. A ride-sharing app tracks 5 million trips per day: the batch layer computes daily revenue aggregates nightly, the speed layer tracks active trips and fraud scores updated every second, and the serving layer combines both for dashboards
2. An ecommerce platform processes 200,000 events per second at peak: the batch path writes hourly partitions to a data lake and recomputes customer lifetime value daily, while the speed path maintains real-time inventory counts and cart activity