
Implementation Details and Production Patterns

Defining the Micro-batch Boundary: Implementation starts with choosing how to define batch boundaries. There are two fundamental strategies, and most production systems use both. Time-based triggers create a batch every N seconds. Size-based triggers create a batch every M records or every fixed data volume. The hybrid approach uses "every 5 seconds or 100,000 records, whichever comes first." This dual strategy prevents two failure modes: pure time-based batching can create enormous batches during traffic spikes, overwhelming workers and downstream sinks, while pure size-based batching can create excessive latency during quiet periods. If traffic drops to 1,000 events per second but the batch size is 100,000 records, you wait 100 seconds for the next batch.
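A minimal sketch of such a dual trigger in plain Python; the `MicroBatcher` class and `flush_fn` callback are illustrative names, not from any particular framework:

```python
import time

class MicroBatcher:
    """Flushes a batch every `max_interval_s` seconds or every `max_records`
    records, whichever comes first (hybrid time/size trigger)."""

    def __init__(self, max_interval_s=5.0, max_records=100_000, flush_fn=print):
        self.max_interval_s = max_interval_s
        self.max_records = max_records
        self.flush_fn = flush_fn
        self.buffer = []
        self.batch_started = time.monotonic()

    def add(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.max_records:
            self.flush()          # size trigger fired

    def poll(self):
        # Called periodically by the driver loop; fires the time trigger.
        if self.buffer and time.monotonic() - self.batch_started >= self.max_interval_s:
            self.flush()          # time trigger fired

    def flush(self):
        batch, self.buffer = self.buffer, []
        self.batch_started = time.monotonic()
        self.flush_fn(batch)      # hand the batch to the processing stage
```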
Each batch then moves through a four-step execution loop (sketched in the code after this list):
1. Calculate offsets: the driver computes the source offsets defining this batch (start and end offset per Kafka partition).
2. Read and transform: workers read exactly that slice from Kafka, apply transformations, and compute aggregates.
3. Write to sinks: all outputs are written to downstream systems (analytics stores, feature stores, warehouses).
4. Checkpoint commit: once all sinks confirm, advance the checkpoint. On failure, replay this exact slice.
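The control flow of that loop can be sketched as follows; `source`, `transform`, `sinks`, and `checkpoint_store` are placeholders for connector-specific objects, so only the ordering and failure behavior are meant literally:

```python
def run_micro_batch(source, transform, sinks, checkpoint_store):
    """One iteration of the micro-batch driver loop (illustrative skeleton)."""
    # 1. Calculate offsets: resume from the last committed checkpoint.
    start = checkpoint_store.load()        # e.g. {partition: offset}
    end = source.latest_offsets()          # e.g. {partition: offset}

    # 2. Read and transform exactly the slice [start, end) per partition.
    records = source.read_slice(start, end)
    results = transform(records)

    # 3. Write all outputs to the downstream sinks.
    for sink in sinks:
        sink.write(results)

    # 4. Checkpoint commit: advance only after every sink confirms.
    #    If anything above fails, the checkpoint is untouched and the same
    #    slice is replayed on the next run, which gives exactly-once results
    #    when sinks are idempotent or transactional.
    checkpoint_store.save(end)
```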
Handling Stateful Computations: Stateful operations like windowed aggregations or joins across batches require a state store. In micro-batching, state is updated at batch boundaries rather than per event. Consider a 5 minute tumbling window with 5 second micro-batches: each batch updates partial aggregates for its window, and at window close (after 60 micro-batches) the final result is emitted. The state store can be in-memory with periodic snapshots to durable storage, or external, such as RocksDB or a distributed key-value store. The critical property is deterministic recomputation: given the same batch slice and prior state, you must produce the same output. This is what enables exactly-once semantics via replay. For a concrete example, imagine computing hourly click-through rates. State is a map of campaign_id to (impressions, clicks). Each micro-batch reads events, groups by campaign, and increments the counters in the state store. Every 720 micro-batches (one hour at 5 second intervals), compute the ratios and emit results.
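A minimal in-memory sketch of that click-through-rate example, assuming a plain dict stands in for the durable state store and events arrive as simple dicts with `campaign_id` and `type` fields (both assumptions for illustration):

```python
from collections import defaultdict

# Stand-in for a durable state store (RocksDB, distributed KV store, ...).
# Keyed by campaign_id; value is [impressions, clicks].
state = defaultdict(lambda: [0, 0])

BATCHES_PER_HOUR = 720  # one hour of 5 second micro-batches

def process_batch(batch_index, events):
    """Update partial aggregates for one micro-batch; deterministic given
    the same batch slice and prior state, so replay is safe."""
    for event in events:
        counters = state[event["campaign_id"]]
        if event["type"] == "impression":
            counters[0] += 1
        elif event["type"] == "click":
            counters[1] += 1

    # At the hourly window boundary, emit click-through rates and reset state.
    if (batch_index + 1) % BATCHES_PER_HOUR == 0:
        ctr = {c: clicks / impressions
               for c, (impressions, clicks) in state.items() if impressions}
        state.clear()
        return ctr
    return None
```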
Capacity Planning and Autoscaling: Production systems size clusters based on expected throughput and target headroom. Start with the maximum expected throughput, for example 200,000 events per second across all partitions. Choose a target batch interval (5 seconds) and a processing-time ratio (processing should take at most 50 percent of the interval, giving 2.5 seconds). At 200,000 events per second, each batch contains 1 million events. If each worker processes 20,000 events per second (50,000 events within the 2.5 second budget), you need at least 20 workers to process 1 million events in time. Add 30 to 50 percent headroom for traffic spikes, bringing the cluster to 26 to 30 workers.
Cluster Sizing Example: 1M events per batch · 2.5s target processing time · 26-30 workers
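The sizing arithmetic can be written out directly; the per-worker rate and headroom factor are the assumed inputs, everything else follows from them:

```python
import math

events_per_sec = 200_000      # peak input rate across all partitions
batch_interval_s = 5.0        # trigger interval
processing_ratio = 0.5        # processing time <= 50% of the interval
per_worker_rate = 20_000      # events/sec one worker can process (assumed)
headroom = 1.3                # 30% headroom; use 1.5 for 50%

events_per_batch = events_per_sec * batch_interval_s             # 1,000,000
target_processing_s = batch_interval_s * processing_ratio        # 2.5 s
required_cluster_rate = events_per_batch / target_processing_s   # 400,000 events/s
base_workers = math.ceil(required_cluster_rate / per_worker_rate)   # 20
provisioned_workers = math.ceil(base_workers * headroom)            # 26 (30 at 1.5)

print(base_workers, provisioned_workers)
```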
Autoscaling policies monitor metrics like micro-batch processing time, input records per batch, and the number of queued batches. If processing time exceeds 60 percent of the interval for three consecutive batches, add workers. If processing time stays below 30 percent for 10 minutes, remove workers to save cost.
Integration with Downstream Systems: For data warehouses or lakehouses, each micro-batch often becomes a single atomic append or transaction. With Delta Lake or Iceberg, you write batch results as new data files, then commit a transaction that makes those files visible. This provides snapshot isolation: queries see a consistent state, and failures leave no partial writes. For online key-value stores serving ML features, design keys so that batch upserts are idempotent. Include a version number or event timestamp with each write, and have the store keep the newest version. If a batch is replayed, duplicate writes either update to the same value (idempotent) or are rejected because the version is stale.
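A sketch of that versioned-upsert idea, with an in-memory dict standing in for the online store; the key names and version scheme are illustrative:

```python
# Stand-in for an online key-value feature store. Each value carries the
# version (here: an event timestamp) it was written with, so replayed
# batches cannot overwrite newer data.
store = {}

def upsert(key, value, version):
    """Idempotent, newest-version-wins upsert.
    A replayed batch re-sends the same (value, version) pairs, which are
    either applied identically (no-op in effect) or rejected as stale."""
    current = store.get(key)
    if current is None or version >= current[1]:
        store[key] = (value, version)
        return True    # write applied
    return False       # stale write from a replayed batch, rejected

# Example: a replayed batch re-writes the same feature with the same version.
upsert("user:42:ctr_1h", 0.031, version=1_700_000_000)
upsert("user:42:ctr_1h", 0.031, version=1_700_000_000)  # duplicate, harmless
```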
✓ In Practice: Production implementations include per-batch latency histograms, error rates by sink, lag from source event timestamps to processing time, and alerts when backlog exceeds 10 batches or watermark delays exceed SLO.
Observability is critical. Track per-batch latency as a histogram, broken down by stage: read from source, transformation, write to each sink, checkpoint commit. Monitor error rates by sink type. Measure lag from source event timestamps to processing completion. Alert when batch backlog exceeds 10 batches (indicating sustained overload) or when watermark delays exceed your SLO threshold. These metrics are exactly what interviewers expect you to discuss when designing micro-batch systems at scale.
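One way to sketch the per-stage timing and backlog alerting, assuming a simple print-based alert stands in for a real metrics and paging system:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Per-stage durations per batch; in production these would feed histograms
# in your metrics system rather than an in-memory dict.
stage_timings = defaultdict(list)

@contextmanager
def timed(stage):
    start = time.monotonic()
    try:
        yield
    finally:
        stage_timings[stage].append(time.monotonic() - start)

BACKLOG_ALERT_THRESHOLD = 10  # queued batches indicating sustained overload

def check_backlog(queued_batches):
    if queued_batches > BACKLOG_ALERT_THRESHOLD:
        print(f"ALERT: {queued_batches} batches queued, sustained overload")

# Usage inside the driver loop:
# with timed("read"): records = source.read_slice(start, end)
# with timed("transform"): results = transform(records)
```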
💡 Key Takeaways
Hybrid batch boundaries use "every N seconds or M records, whichever comes first" to prevent enormous batches during spikes and excessive latency during quiet periods
Stateful computations update state at batch boundaries: a 5 minute window with 5 second batches requires 60 state updates, one per micro-batch, with final emission after 60 batches complete
Capacity planning sizes clusters for a target processing time of 50 percent of the batch interval: 200k events/sec with a 5s interval and 20k events/sec per worker requires 20 workers, plus 30 to 50 percent headroom
Autoscaling monitors processing time to interval ratio, adding workers when ratio exceeds 60 percent for three consecutive batches, removing workers when ratio stays below 30 percent for 10 minutes
Integration with warehouses makes each micro-batch an atomic transaction (Delta Lake, Iceberg), while key-value stores use versioned keys to ensure idempotent upserts during batch replay
📌 Examples
1. A log analytics pipeline uses dual triggers: "every 10 seconds or 500,000 log lines." During peak traffic (100k lines/sec), batches form every 5 seconds. During off-peak (5k lines/sec), batches form every 10 seconds, preventing excessive latency.
2. An ML feature pipeline maintains state for 30 minute sliding windows using 5 second micro-batches. For user click features, each batch updates a RocksDB map keyed by user_id, storing arrays of recent click timestamps. Every batch scans this map to emit updated feature vectors.
3. A ride sharing surge pricing system sizes its cluster for 50,000 events/sec peak with 3 second batches, giving 150,000 events per batch. Target processing time is 1.5 seconds, requiring 10 workers at 10k events/sec per worker (15k events each within the window). The cluster runs 15 workers (50 percent headroom) and autoscales to 20 workers when processing time exceeds 1.8 seconds.