Feature Engineering & Feature Stores: Feature Store Architecture (Feast, Tecton, Hopsworks)

Feature Materialization: Batch, Streaming, and Request Time

Batch Materialization

Runs on a schedule (hourly or daily) to backfill the offline store and upsert the online store. It is cost efficient and reproducible: a daily Spark job processing 500 gigabytes of events to generate 100 million feature rows might cost 20 to 50 dollars on cloud compute. Incremental backfills that only process changes since the last watermark cut compute time by 5 to 10x and reduce the blast radius of schema errors. Batch is the workhorse for user aggregates, historical trends, and any feature tolerating hours of staleness.
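The incremental pattern described above can be sketched in plain Python (in production this logic would live in a Spark job; the event rows, field names, and watermark handling here are hypothetical, for illustration only):

```python
from datetime import datetime, timezone

# Hypothetical event rows; a real job would read these from object storage.
EVENTS = [
    {"user_id": "u1", "ts": datetime(2024, 1, 1, tzinfo=timezone.utc), "clicks": 3},
    {"user_id": "u1", "ts": datetime(2024, 1, 2, tzinfo=timezone.utc), "clicks": 5},
    {"user_id": "u2", "ts": datetime(2024, 1, 2, tzinfo=timezone.utc), "clicks": 1},
]

def incremental_backfill(events, last_watermark):
    """Process only events newer than the last watermark, then advance it.

    Skipping already-processed partitions is what cuts compute time and
    limits the blast radius of a bad run to one increment.
    """
    new_events = [e for e in events if e["ts"] > last_watermark]
    features = {}
    for e in new_events:
        features[e["user_id"]] = features.get(e["user_id"], 0) + e["clicks"]
    new_watermark = max((e["ts"] for e in new_events), default=last_watermark)
    return features, new_watermark
```

A scheduler would persist `new_watermark` after each run so the next run resumes from it rather than reprocessing the full history.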

Streaming Materialization

Processes events from Kafka or Kinesis in near real time, computing windowed aggregates with event time semantics. Uber's fraud detection and ETA models rely on features updated within seconds to minutes. The complexity is significant: exactly once semantics require idempotent upserts keyed by entity, window end, and version; late events need watermarks and correction paths; and duplicate events must be deduped. Streaming infrastructure typically costs 2 to 5x more than equivalent batch for the same features due to always on clusters and state management.
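The idempotent-upsert requirement can be shown with a minimal sketch (the store is a plain dict standing in for an online store such as Redis or Cassandra; entity IDs and feature names are made up):

```python
# Online store stand-in: key -> latest feature value.
online_store = {}

def idempotent_upsert(store, entity_id, window_end, version, value):
    """Upsert keyed by (entity, window end, version).

    Because the key fully identifies the window result, replaying the
    same record (a common occurrence under at-least-once delivery)
    overwrites instead of double-counting.
    """
    store[(entity_id, window_end, version)] = value

# Replaying the same windowed aggregate twice leaves one row, not two.
idempotent_upsert(online_store, "user_42", "2024-01-01T00:05", 1, {"clicks_5m": 7})
idempotent_upsert(online_store, "user_42", "2024-01-01T00:05", 1, {"clicks_5m": 7})
```

The version component lets a corrected recomputation of a past window supersede the earlier value without a delete-then-insert race.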

Request Time Transforms

Compute features per inference request from raw signals. Examples include session length, time since last event, or lightweight encodings. This is always fresh and avoids duplicating storage across millions of entities, but adds per request CPU cost and latency variance. A 2ms request time transform consuming 10 percent of a 50ms SLA is manageable; a 20ms transform blows your budget. Netflix uses request time enrichments for session context features that cannot be precomputed.
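A request time transform of this kind is just a cheap function on the raw request payload. A minimal sketch (the feature names and timestamp representation are assumptions, not from any specific system):

```python
def request_time_features(session_event_ts, now):
    """Compute lightweight per-request features from raw session signals.

    session_event_ts: event timestamps for the current session, as epoch
    seconds in ascending order. Runs inside the inference request, so it
    must stay well under the latency budget (ideally a millisecond or two).
    """
    return {
        "session_length": len(session_event_ts),
        "seconds_since_last_event": (now - session_event_ts[-1])
        if session_event_ts else None,
    }
```

Because the inputs arrive with the request, these features are always fresh and nothing needs to be stored per entity.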

Late and Duplicate Events

Late and duplicate events are the primary streaming failure mode that produces incorrect aggregates. A duplicate click event double counts in a 7 day click count, degrading downstream model precision and recall. Mitigation uses event time processing with watermarks to handle late arrivals, idempotent upserts that overwrite duplicates, and dedupe buffers based on event UUIDs. Compensating updates can correct past windows when very late events arrive beyond the watermark.
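These three mitigations (UUID dedupe, an event time watermark, and a correction path for very late events) can be combined in one small sketch. This is a simplified model, not a real Flink operator; the lag value and window keys are arbitrary:

```python
class WindowedCounter:
    """Event-time counter with a dedupe buffer and a watermark.

    Duplicate event UUIDs are dropped outright. Events older than the
    current watermark are routed to a late queue for a compensating
    update rather than silently mutating a closed window.
    """

    def __init__(self, watermark_lag):
        self.seen = set()          # dedupe buffer of event UUIDs
        self.counts = {}           # window key -> count
        self.max_ts = 0            # highest event time observed
        self.watermark_lag = watermark_lag
        self.late = []             # events needing compensating updates

    def watermark(self):
        return self.max_ts - self.watermark_lag

    def process(self, event_id, ts, window):
        if event_id in self.seen:
            return "duplicate"     # would have double counted
        self.seen.add(event_id)
        if ts < self.watermark():
            self.late.append((event_id, ts, window))
            return "late"          # handled by the correction path
        self.max_ts = max(self.max_ts, ts)
        self.counts[window] = self.counts.get(window, 0) + 1
        return "counted"
```

In a real pipeline the dedupe buffer would be bounded (e.g. expired along with the watermark) rather than an unbounded set.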

💡 Key Takeaways
- Batch materialization runs on schedule (hourly or daily) with hours of latency but high cost efficiency; processing 500 gigabytes to generate 100 million feature rows costs 20 to 50 dollars, with incremental upserts reducing compute by 5 to 10 times
- Streaming materialization via Kafka or Flink achieves seconds-to-minutes freshness for fraud or Estimated Time of Arrival features, but costs 2 to 5 times more than batch due to always-on clusters and exactly-once complexity
- Request time transforms compute per inference (session length, time deltas) with zero staleness but consume inference latency budget; keep under 1 to 2 milliseconds to avoid blowing 50 millisecond Service Level Agreements
- Late and duplicate events in streaming cause incorrect window aggregates (double counting); mitigation requires event time watermarks, idempotent upserts keyed by entity and window end, and dedupe buffers using event Universally Unique Identifiers
- Pattern selection: batch for user history with hours of staleness tolerance, streaming for real time signals when 2 to 5 times higher cost is justified, request time for per session signals with lightweight compute
📌 Interview Tips
1. Uber Michelangelo uses Kafka to Flink streaming pipelines to update Estimated Time of Arrival features within seconds, processing millions of trip events per second with idempotent upserts to a Cassandra backed online store
2. Airbnb Zipline runs daily Spark backfills for search ranking features, computing 7 day user engagement aggregates overnight with partition pruning to process only new data since the last watermark