Time Bucketing: Efficient Sliding Window Implementation
The naive approach to sliding windows stores every individual event within the window period, which becomes prohibitively expensive at scale. A 5-minute window receiving 1000 events per second would store 300,000 events per key, consuming gigabytes of memory across high-cardinality dimensions. Time bucketing solves this by dividing the window into fixed-size buckets and storing only pre-aggregated values per bucket.
Here's how it works. Instead of storing every event in a 5-minute window, divide it into 60 buckets of 5 seconds each. When an event arrives, determine which bucket it belongs to based on its timestamp, then update that bucket's aggregate (count, sum, or sketch). To compute the window aggregate, sum across all active buckets. When a bucket ages out of the window, drop it and create a new one for incoming events. This reduces memory from O(events in window) to O(number of buckets), typically a 100x to 1000x reduction.
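To make this concrete, here is a minimal sketch of a bucketed sliding-window counter in Python. The class and method names are illustrative rather than taken from any particular library, and it tracks only a count per bucket; a sum or a sketch would slot in the same way.

```python
import time
from collections import defaultdict

class BucketedSlidingCounter:
    """Sliding-window count that stores one aggregate per fixed-size time
    bucket instead of every raw event. A 5-minute window with 5-second
    buckets holds at most 60 integers per key."""

    def __init__(self, window_seconds=300, bucket_seconds=5):
        self.window = window_seconds
        self.bucket = bucket_seconds
        self.counts = defaultdict(int)  # bucket start timestamp -> count

    def _bucket_start(self, ts):
        # Align the event timestamp down to the start of its bucket.
        return int(ts // self.bucket) * self.bucket

    def add(self, ts=None, value=1):
        ts = time.time() if ts is None else ts
        self.counts[self._bucket_start(ts)] += value

    def total(self, now=None):
        now = time.time() if now is None else now
        # Drop buckets that have aged entirely out of the window...
        expired = [s for s in self.counts if s + self.bucket <= now - self.window]
        for s in expired:
            del self.counts[s]
        # ...then sum what remains. A bucket straddling the window boundary is
        # kept whole, which is the bucket-granularity approximation.
        return sum(self.counts.values())
```

Calling add() on each event and total() on each query keeps per-key state at O(number of buckets) regardless of event rate.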
The tradeoff is granularity. With 5-second buckets, you can only slide the window in 5-second increments, not per event. For many use cases like rate limiting or trend detection, this approximation is acceptable. The bucket size becomes a tuning parameter: smaller buckets give finer granularity but use more memory, while larger buckets are more efficient but less precise. A common pattern is 1-second buckets for windows under 5 minutes, and 5- to 10-second buckets for windows up to 1 hour.
Production implementations add further optimizations. Use a circular buffer to avoid shifting bucket arrays as the window advances. Store each bucket's start timestamp to handle irregular arrivals and clock skew. For algebraic aggregates like sum and count, each bucket needs just a few bytes. For percentiles or distinct counts, use mergeable sketches like t-digest or HyperLogLog, which keep per-bucket state to a few hundred bytes or so even for complex statistics.
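As a sketch of the circular-buffer variant (again with illustrative names, and plain counts standing in for sketches), storing each slot's bucket-start timestamp lets stale slots be detected and reset in place, with no array shifting:

```python
class CircularBucketWindow:
    """Fixed array of buckets indexed by (bucket_start / bucket_seconds) mod
    num_buckets. Each slot remembers which bucket start it currently holds,
    so a slot left over from an earlier pass of the ring is reset on reuse."""

    def __init__(self, window_seconds=300, bucket_seconds=5):
        self.bucket = bucket_seconds
        self.num_buckets = window_seconds // bucket_seconds
        self.counts = [0] * self.num_buckets
        self.starts = [-1] * self.num_buckets  # bucket start stored per slot

    def _slot(self, ts):
        start = int(ts // self.bucket) * self.bucket
        return start, (start // self.bucket) % self.num_buckets

    def add(self, ts, value=1):
        start, i = self._slot(ts)
        if self.starts[i] != start:
            # The slot still holds an older bucket; reset it before reuse.
            self.starts[i] = start
            self.counts[i] = 0
        self.counts[i] += value

    def total(self, now):
        window_start = now - self.num_buckets * self.bucket
        # Sum only slots whose recorded bucket start is still inside the window.
        return sum(c for c, s in zip(self.counts, self.starts) if s >= window_start)
```

The ring layout keeps add() at O(1) with no allocation; the staleness check in add() is also what absorbs irregular or late arrivals, as long as they are not older than one full window.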
💡 Key Takeaways
•Time bucketing reduces memory from O(window size in events) to O(number of buckets), typically 60 to 120 buckets for windows from 5 minutes to 1 hour, achieving 100x to 1000x memory savings
•Each bucket stores pre-aggregated values like count (8 bytes), sum (8 bytes), and metadata (8 bytes), totaling roughly 24 bytes per bucket, so a 60-bucket window uses about 1.4 KB per key
•Bucket granularity creates a tradeoff: 1-second buckets for sub-minute precision versus 5- to 10-second buckets for hour-long windows with acceptable approximation error
•Algebraic aggregates (sum, count, min, max) merge trivially across buckets, while percentiles require mergeable sketches like t-digest (roughly 200 bytes) and distinct counts use HyperLogLog (roughly 1 KB at a few percent error)
•For 10 million active keys with 60 buckets each, total state is approximately 14 GB, fitting in memory on a single large machine or sharded across 10 to 20 nodes at roughly 1 GB per node (see the sizing sketch below)
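A quick back-of-the-envelope check of those numbers, assuming the 24-byte bucket layout stated above:

```python
BYTES_PER_BUCKET = 24          # count (8) + sum (8) + metadata (8)
BUCKETS_PER_KEY = 60           # e.g. a 5-minute window with 5-second buckets
ACTIVE_KEYS = 10_000_000

per_key_bytes = BUCKETS_PER_KEY * BYTES_PER_BUCKET   # 1,440 B ~= 1.4 KB per key
total_gb = per_key_bytes * ACTIVE_KEYS / 1e9         # ~14.4 GB of window state
per_node_gb = total_gb / 15                          # 10-20 nodes -> ~0.7-1.4 GB each

print(f"{per_key_bytes} B/key, {total_gb:.1f} GB total, ~{per_node_gb:.1f} GB per node")
```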
📌 Examples
Uber driver availability: 5-minute sliding window with 5-second buckets (60 buckets). Each bucket stores (online_count, total_requests). Memory per city: 1000 active drivers × 1.4 KB = 1.4 MB.
Airbnb search popularity: 1-hour hopping window with 1-minute hops for listing views. 60 buckets per listing, each storing (view_count, unique_users_sketch). At 2M listings, state is 60 buckets × 1 KB × 2M = 120 GB, sharded across 40 nodes.
Google Ads click-through rate: 15-minute sliding window with 15-second buckets (60 buckets per ad). Buckets store (clicks, impressions). Per-ad overhead is 60 × 16 bytes = 960 bytes. At 10M ads, total is 9.6 GB.