What is Stateful Stream Processing?

Definition
Stateful stream processing is a technique for processing unbounded data streams where the system maintains and updates context (state) across multiple events, enabling decisions based on historical or aggregated information rather than just the current event.
The Core Problem:
Many real world decisions cannot be made by looking at a single event in isolation. Fraud detection systems need to know "how many transactions has this card made in the last 5 minutes?" Recommendation engines need to track "what did this user click in this session?" Analytics dashboards need to compute "errors per service per minute." None of these questions can be answered by processing one event at a time without memory.

Stateless vs Stateful:
Stateless processing treats each event independently. You receive an event, transform it, and emit it without remembering anything. This works great for simple filtering or format conversion at 1 to 10 million events per second.

Stateful processing maintains context. The system keeps data structures in memory and on disk that track information across events. When a payment event arrives, the processor looks up "how many payments has user 12345 made in the last minute?" updates that counter, and uses the result to decide whether to flag the transaction.

How State is Organized:
State is partitioned by key and colocated with processing. Events for user 12345 always go to the same processor instance, which maintains state for that user locally. This avoids network calls for every state lookup. The system stores this state in embedded key value stores backed by local Solid State Drives (SSDs), with memory caches on top for fast access, typically 1 to 3 milliseconds per read or write.

✓ In Practice: A fraud detection system might process 100 thousand card swipe events per second, maintaining per card histories over 5 minute and 24 hour windows. Without stateful processing, you would need to query a database for every transaction, adding 10 to 50 milliseconds of latency and overwhelming the database with queries.
State Guarantees:
The engine must ensure state changes are consistent with event processing, even when machines fail. This means guaranteeing exactly once or effectively once semantics. If a processor crashes after updating state but before checkpointing, the system must either roll back the state update or ensure it does not get duplicated on restart.

💡 Key Takeaways

✓Stateful processing maintains context across events, enabling decisions based on historical patterns rather than single events in isolation

✓State is partitioned by key and stored locally with the processor (1 to 3 milliseconds access time) to avoid network overhead for every lookup

✓The system guarantees consistency between state updates and event processing, ensuring exactly once or effectively once semantics even during failures

✓Common use cases include fraud detection (tracking transaction history), sessionization (tracking user activity), and real time metrics (computing aggregates over time windows)

📌 Interview Tips

1Fraud detection: track that card 1234 made 4 payments in last minute, flag 5th payment if threshold exceeded

2Sessionization: maintain last click time for user, end session if 30 minutes of inactivity detected

3Real time metrics: compute errors per service per minute by maintaining counters that reset every 60 seconds

← Back to Stateful Stream Processing Overview