
What is Micro-batching?

Definition
Micro-batching is a data processing pattern that groups streaming events into very small batches, typically every 1 to 10 seconds, then processes each batch as a miniature batch job rather than handling events individually.
The Problem It Solves

Many systems need data fresher than hourly or daily batch processing can provide, but they don't actually need true per-event streaming with sub-second latency. For example, a monitoring dashboard showing ad campaign metrics doesn't need millisecond updates; a 5 to 10 second delay is perfectly acceptable, but waiting an hour for the next batch run is too slow.

Micro-batching occupies this middle ground. It treats an unbounded event stream as a sequence of tiny, bounded datasets. Instead of processing each event as it arrives, the system waits a few seconds, collects thousands of events, and processes them together.

How It Works

For each micro-batch interval, the system performs four key steps (see the sketch below):
1. Read events from an ingress buffer such as Kafka or Kinesis.
2. Transform and aggregate them using batch-oriented logic.
3. Write results to sinks such as data warehouses, feature stores, or databases.
4. Checkpoint progress so a failed batch can be replayed.

The key insight is that fault tolerance and exactly-once semantics are much easier when each unit of work is a discrete, replayable batch. You can checkpoint source offsets and batch outputs, then rerun a failed micro-batch deterministically. With per-event streaming, you need more complex per-record idempotency protocols.
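The loop below is a minimal, framework-agnostic sketch of those four steps. The `source`, `sink`, and `checkpoint_store` objects are hypothetical placeholder interfaces rather than any specific library's API, and the `campaign_id` event field is assumed purely for illustration.

```python
import time

BATCH_INTERVAL_S = 5  # micro-batch interval; 1-10 seconds is typical

def run_micro_batch_loop(source, sink, checkpoint_store):
    """Drive one micro-batch per interval. All three arguments are
    hypothetical placeholder interfaces used only to illustrate the pattern."""
    # Resume from the last committed offset so an interrupted batch is replayed.
    offset = checkpoint_store.load_offset(default=0)
    while True:
        started = time.time()

        # 1. Read: pull every event published since the committed offset.
        events, next_offset = source.read_since(offset)

        # 2. Transform/aggregate with ordinary batch-oriented logic.
        counts = {}
        for event in events:
            key = event["campaign_id"]  # assumed event field
            counts[key] = counts.get(key, 0) + 1

        # 3. Write results keyed by batch, so a replayed batch overwrites
        #    its previous output instead of double-counting.
        sink.upsert(batch_id=next_offset, rows=counts)

        # 4. Checkpoint only after the write succeeds; a crash before this
        #    line means the whole batch reruns deterministically.
        checkpoint_store.save_offset(next_offset)
        offset = next_offset

        # Sleep out the rest of the interval before starting the next batch.
        time.sleep(max(0.0, BATCH_INTERVAL_S - (time.time() - started)))
```

Because the unit of work is the whole batch, exactly-once behavior reduces to two simpler guarantees: idempotent (or transactional) writes per batch, and committing the source offset only after the write succeeds.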
Typical Latency Range
P50 latency: 5-10 seconds
P99 latency under load: 15-20 seconds
Frameworks like Spark Streaming use this pattern extensively. They operate with batch intervals ranging from sub-second to tens of seconds, delivering end-to-end latencies measured in seconds rather than milliseconds. That is good enough for many analytics, monitoring, and machine learning feature use cases, with far less complexity than record-by-record stream processing.
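For a concrete framework example, here is a minimal PySpark Structured Streaming sketch that cuts a micro-batch every 5 seconds. The broker address, topic name, and checkpoint path are placeholders, and the Kafka source additionally requires the spark-sql-kafka connector package to be available.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("micro-batch-demo").getOrCreate()

# Unbounded source: read the event stream from a Kafka topic (placeholder names).
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "impressions")
    .load()
)

# Batch-style aggregation, applied to each micro-batch: count events per Kafka key.
per_key_counts = events.groupBy("key").count()

# trigger(processingTime="5 seconds") makes Spark cut a new micro-batch every
# 5 seconds; checkpointLocation records offsets and state so failed batches
# can be replayed. The console sink is just for demonstration.
query = (
    per_key_counts.writeStream
    .outputMode("complete")
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/per_key_counts")
    .trigger(processingTime="5 seconds")
    .start()
)

query.awaitTermination()
```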
💡 Key Takeaways
Micro-batching groups streaming events into small time windows (1 to 10 seconds) and processes them as discrete batch jobs
It provides near real-time latency (5 to 20 seconds) while keeping the simpler programming model and fault tolerance semantics of batch processing
Each micro-batch reads from source offsets, transforms data, writes to sinks, and checkpoints progress for deterministic replay on failure
Typical production systems achieve 7 to 8 second p50 latency and 15 to 20 second p99 latency with 5 second batch intervals
📌 Examples
1. An ad platform processes 200,000 impression events per second by pulling from Kafka every 5 seconds (1 million events per batch), computing per-campaign aggregates, and writing to both a real-time dashboard store and a feature store for ML models (see the sketch after this list)
2. A monitoring pipeline collects application logs every 3 seconds, aggregates error rates by service, and updates alert thresholds, providing dashboard updates with 5 to 8 second end-to-end latency
3. A recommendation system recomputes user preference features every 10 seconds from click events, writing updated feature vectors to an online serving layer that feeds real-time models
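To make example 1 concrete, the sketch below processes a single 5-second batch of impression events and fans the aggregates out to both sinks. The `dashboard_store` and `feature_store` clients and the event fields (`campaign_id`, `cost`) are hypothetical placeholders, not real APIs.

```python
from collections import defaultdict

def process_impression_batch(events, dashboard_store, feature_store):
    """Handle one ~1-million-event micro-batch (5 s at 200k events/s)."""
    # Collapse the raw events into one aggregate row per campaign.
    aggregates = defaultdict(lambda: {"impressions": 0, "spend": 0.0})
    for event in events:
        agg = aggregates[event["campaign_id"]]  # assumed event field
        agg["impressions"] += 1
        agg["spend"] += event["cost"]           # assumed event field

    # Write the same aggregates to both sinks: the dashboard store serves
    # near-real-time charts, the feature store feeds online ML models.
    for campaign_id, agg in aggregates.items():
        dashboard_store.upsert(campaign_id, agg)
        feature_store.write_features(campaign_id, agg)
```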