Batch vs Stream Processing • Kappa Architecture Pattern
What is Kappa Architecture?
Definition
Kappa Architecture is a data processing pattern in which ALL data flows through a single stream processing system, using an immutable event log as the source of truth for both real-time processing and historical reprocessing.
💡 Key Takeaways
✓ A single stream processing layer handles both real-time and historical data, eliminating separate batch and streaming code paths
✓ An immutable event log with long retention (typically 30 to 180 days) acts as the source of truth for all processing
✓ Reprocessing is done by replaying the event log from the beginning with the same streaming code, often at 3 to 5 times real-time throughput
✓ Materialized views (derived data stores) are considered disposable and can be rebuilt by replaying the log whenever business logic changes
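The takeaways above can be sketched in a few lines: one processing function serves both live consumption and historical replay, and the materialized view is disposable because it can always be rebuilt from the log. This is a minimal illustration, not a production design; all names (`process_event`, `replay`, the running-spend view) are hypothetical, and the in-memory list stands in for a durable log such as a Kafka topic with long retention.

```python
from collections import defaultdict

def process_event(view, event):
    """The single code path: update the materialized view from one event."""
    # Illustrative logic: track running spend per user.
    view[event["user"]] += event["amount"]
    return view

def replay(log, from_offset=0):
    """Rebuild a disposable materialized view by replaying the immutable log."""
    view = defaultdict(float)
    for event in log[from_offset:]:
        process_event(view, event)
    return view

# The immutable event log is the source of truth (here an in-memory list;
# in practice, a durable log with 30-180 day retention).
log = [
    {"user": "a", "amount": 10.0},
    {"user": "b", "amount": 5.0},
    {"user": "a", "amount": 2.5},
]

# Live processing and reprocessing share identical logic: a full rebuild
# after a business-logic change is just a replay from offset 0.
view = replay(log)
print(dict(view))  # {'a': 12.5, 'b': 5.0}
```

Note that `replay` never mutates the log itself; because the log is append-only and immutable, any number of views can be rebuilt from it independently.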
📌 Interview Tips
1. An e-commerce platform writes 200,000 events/sec to a central log. A streaming job builds user profiles for recommendations within 1 to 2 seconds. When deploying a new model, it replays 90 days of history at 5x speed to rebuild the profiles.
2. A fraud detection system consumes transaction events with p99 latency under 2 seconds from ingestion to score. When rules change, it replays historical events to recompute risk scores without writing a separate batch job.
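The fraud example's "rules change, then replay" flow can be sketched as follows. The point is that the updated rule runs over history through the same scoring function used for live traffic, with no separate batch job. All names and the threshold rule are hypothetical, chosen only to illustrate the pattern.

```python
def score(event, high_risk_threshold):
    """Single scoring path used for both live traffic and replays."""
    # Illustrative rule: flag large transactions as high risk.
    return 1.0 if event["amount"] > high_risk_threshold else 0.1

# Historical transaction events from the immutable log.
transactions = [{"amount": 50}, {"amount": 5000}, {"amount": 120}]

# Scores produced under the old rule:
old_scores = [score(e, high_risk_threshold=1000) for e in transactions]

# Rule change: lower the threshold, then replay the same events through
# the same function to recompute historical risk scores.
new_scores = [score(e, high_risk_threshold=100) for e in transactions]

print(old_scores)  # [0.1, 1.0, 0.1]
print(new_scores)  # [0.1, 1.0, 1.0]
```

Because scoring is a pure function of the event and the rule parameters, replaying the log deterministically reproduces or revises every historical score.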