
Kappa vs Lambda vs Batch: Trade-offs and Decision Criteria

The Three Architectural Choices

When designing a data platform, you face three main patterns. Pure batch processes data in scheduled windows (hourly, daily) using systems like Spark or MapReduce. Lambda Architecture runs both a batch layer for accuracy and a stream layer for speed, merging results at query time. Kappa Architecture uses only stream processing, replaying history when needed.
Lambda Architecture: two code paths, eventual consistency between batch and speed layers.
Kappa Architecture: one code path, long event retention, reprocessing by replay.
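To make the contrast concrete, here is a minimal sketch of Kappa's single code path, assuming a Kafka-style replayable log. All names here (transform, process, the offset helpers in the comments) are illustrative, not any specific library's API:

```python
# A minimal sketch of Kappa's "one code path" idea, assuming a Kafka-style
# replayable log. Names are illustrative, not a specific library's API.

def transform(event: dict) -> dict:
    # The single copy of the business logic. Under Lambda, this would be
    # written twice: once for the batch layer, once for the speed layer.
    return {"user_id": event["user_id"],
            "amount_usd": event["amount_cents"] / 100}

def process(events, sink: list) -> None:
    # Called with the live stream for low latency, or with a replay of the
    # retained log to reprocess history: same code either way.
    for event in events:
        sink.append(transform(event))

# Live:   process(consume_from_latest_offset(), serving_store)
# Replay: process(read_log_from_offset(0), serving_store)
# (both helpers above are hypothetical)
```

The point is that live processing and reprocessing exercise the identical transform, so their outputs cannot diverge the way Lambda's batch and speed layers can.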
Kappa vs Lambda: Simplicity vs Infrastructure

Kappa eliminates the core Lambda problem: maintaining two implementations of the same business logic. In Lambda, you write transformation code twice: once in a batch framework (Spark SQL) and once in a stream framework (Flink, Storm). These diverge over time, creating subtle bugs where batch and streaming outputs differ. Debugging becomes painful: is the issue in batch, in streaming, or in the merge logic? Kappa solves this with a single transformation, written once.

The trade-off, however, is concrete. You must store large volumes of raw events for long periods. For a system ingesting 500,000 events per second at 1 kilobyte each, 90 days of retention requires roughly 3.9 petabytes before compression. Even with 3-4x compression, that is over a petabyte of storage. At cloud storage costs of $20-30 per terabyte per month, 90 days of history costs $20,000-30,000 monthly just for the event log. You also need stream processing infrastructure that handles both low-latency continuous processing and high-throughput batch-style replay. If your streaming engine can only replay at 1.5x real-time speed, reprocessing 90 days of history takes 60 days, which is impractical. Production Kappa systems typically replay at 5-10x real time, which requires substantial cluster capacity (see the back-of-the-envelope calculation below).

Kappa vs Pure Batch: Latency vs Simplicity

Pure batch is simpler when you only need daily aggregates and data volumes are small. A nightly Spark job processing 100 gigabytes is straightforward: run, write results, done. No event log, no consumer lag, no state management. Choose Kappa when latency matters and data naturally arrives as events. Recommender systems, real-time fraud detection, operational dashboards: these need sub-minute latency. Batch delivers results in hours; Kappa delivers them in seconds, and when logic changes, you reprocess history with the same code. For event-driven use cases, this is cleaner than bolting streaming onto a batch system.
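The retention and replay figures in the Kappa vs Lambda discussion above are easy to sanity-check. A back-of-the-envelope calculation in plain Python (the 3.5x compression ratio and $25/TB/month price are illustrative midpoints of the ranges quoted above):

```python
# Back-of-the-envelope check of the retention and replay figures above.

EVENTS_PER_SEC = 500_000
EVENT_BYTES = 1_000              # ~1 KB per event
RETENTION_DAYS = 90
SECONDS_PER_DAY = 86_400

raw_bytes = EVENTS_PER_SEC * EVENT_BYTES * RETENTION_DAYS * SECONDS_PER_DAY
print(f"Raw 90-day log: {raw_bytes / 1e15:.1f} PB")          # ~3.9 PB

compressed_tb = raw_bytes / 3.5 / 1e12                       # assume ~3.5x compression
print(f"Compressed: {compressed_tb:,.0f} TB")                # ~1,100 TB
print(f"Monthly cost @ $25/TB: ${compressed_tb * 25:,.0f}")  # ~$28K

for speedup in (1.5, 5, 10):                                 # replay throughput matters
    print(f"Replay at {speedup}x real time: {RETENTION_DAYS / speedup:.0f} days")
```

The replay loop shows why throughput is the gating factor: at 1.5x the reprocess takes 60 days, while at 10x it drops to 9 days.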
"Choose Kappa when all data is events and you need both low latency and reprocessing. Choose Lambda if you have mature batch systems and adding streaming selectively. Choose pure batch if latency over 1 hour is acceptable."
Decision Framework with Numbers

If your use case needs end-to-end latency under 10 seconds and you occasionally recompute historical results (monthly model updates, quarterly metric recalculations), Kappa is a strong fit. If end-to-end latency of 1-6 hours is acceptable and you rarely reprocess, pure batch is simpler and cheaper. If you have a large investment in batch infrastructure (years of Hive tables, Airflow DAGs) and need to add real-time features incrementally, Lambda may be pragmatic despite the code duplication, because migration costs are high. But for greenfield projects, or systems where events are the natural model, Kappa reduces the long-term maintenance burden. At extreme scale, teams sometimes blend approaches: Kappa for operational data (orders, clicks) with sub-minute latency requirements, and specialized batch systems for deep historical analytics (multi-year trend analysis) that can tolerate daily updates.
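The framework above can be condensed into a rough rule of thumb. A hypothetical sketch, with thresholds taken from the text; a real decision involves more factors than three booleans and a number:

```python
# Rough encoding of the decision framework above. The thresholds come from
# the text; the function itself is illustrative, not a substitute for analysis.

def choose_architecture(latency_slo_sec: float,
                        reprocesses_history: bool,
                        mature_batch_stack: bool) -> str:
    if mature_batch_stack and latency_slo_sec < 3600:
        # Years of Hive tables / Airflow DAGs: add streaming selectively.
        return "Lambda (incremental adoption, accept code duplication)"
    if latency_slo_sec <= 10 and reprocesses_history:
        return "Kappa (one code path, replay for reprocessing)"
    if latency_slo_sec >= 3600:
        return "Pure batch (simpler and cheaper at 1-6 h latency)"
    return "Kappa (events are the natural model)"

# e.g. a recommender with a 2-second SLO and monthly history recomputation:
print(choose_architecture(latency_slo_sec=2, reprocesses_history=True,
                          mature_batch_stack=False))
```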
💡 Key Takeaways
Kappa eliminates Lambda's dual code paths but requires long retention (3.9 PB raw for 90 days at 500K events/sec) and high-throughput replay (5-10x real time)
Storage costs are concrete: 1 PB after compression at $20-30/TB/month is $20K-30K/month just for event log retention
Choose Kappa when you need sub-10-second latency and periodic reprocessing; choose pure batch when 1-6 hour latency is acceptable and data volumes are small
Lambda is pragmatic when migrating from mature batch systems incrementally; Kappa is ideal for greenfield projects where events are the natural data model
📌 Examples
1. Recommender system needs 1-2 second feature updates and monthly model retraining that recomputes 90 days of history: Kappa is a strong fit.
2. Daily revenue report processing 100 GB of transactions with no real-time requirements: pure batch is simpler and cheaper.
3. Existing data warehouse with years of Hive tables, adding real-time dashboards: Lambda allows incremental adoption despite code duplication.