
Feature Materialization: Batch, Streaming, and Request Time

Feature computation follows three patterns with distinct latency and cost tradeoffs.

Batch materialization runs on a schedule (hourly or daily) to backfill the offline store and upsert the online store. It is cost efficient and reproducible: a daily Spark job processing 500 gigabytes of events to generate 100 million feature rows might cost 20 to 50 dollars on cloud compute. Incremental backfills that only process changes since the last watermark cut compute time by a factor of 5 to 10 and shrink the blast radius of schema errors. Batch is the workhorse for user aggregates, historical trends, and any feature that tolerates hours of staleness.

Streaming materialization processes events from Kafka or Kinesis in near real time, computing windowed aggregates with event-time semantics. Uber's fraud detection and Estimated Time of Arrival (ETA) models rely on features updated within seconds to minutes. The complexity is significant: exactly-once semantics require idempotent upserts keyed by entity, window end, and version; late events need watermarks and correction paths; and duplicate events must be deduplicated to avoid double counting. Streaming infrastructure typically costs 2 to 5 times more than equivalent batch for the same features because of always-on clusters and state management. LinkedIn uses Kafka-based pipelines for feed-ranking features, accepting minutes-scale freshness.

Request-time transforms compute features per inference request from raw signals. Examples include session length, time since last event, or lightweight encodings. This is always fresh and avoids duplicating storage across millions of entities, but adds per-request Central Processing Unit (CPU) cost and latency variance. A 2 millisecond request-time transform consuming 10 percent of a 50 millisecond Service Level Agreement (SLA) is manageable; a 20 millisecond transform blows your budget. Netflix uses request-time enrichments for session-context features that cannot be precomputed.

The classic failure mode is late or duplicate events in streaming causing incorrect aggregates. A duplicate click event double counts in a 7-day click count, degrading precision and recall. Mitigation uses event-time processing with watermarks to handle late arrivals, idempotent upserts that overwrite duplicates, and dedupe buffers keyed by event Universally Unique Identifiers (UUIDs). Compensating updates can correct past windows when very late events arrive beyond the watermark.
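A minimal sketch of that mitigation, assuming an in-memory dedupe buffer and a stand-in online store (both hypothetical); in production the state would live in Flink or Spark Structured Streaming and the upserts would target Cassandra, Redis, or DynamoDB:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class ClickEvent:
    event_id: str        # event UUID, used for dedupe
    user_id: str
    event_time: datetime

class InMemoryOnlineStore:
    """Stand-in for a Cassandra/Redis/DynamoDB online store."""
    def __init__(self):
        self.rows = {}

    def upsert(self, key, value):
        # Overwrite-on-key makes replays and re-emissions harmless.
        self.rows[key] = value

class ClickCountAggregator:
    """Daily tumbling-window click counts with event-time dedupe.

    A 7-day feature can be served by summing the last seven windows at
    read time. This is an illustrative sketch, not a production operator.
    """
    def __init__(self, store, allowed_lateness=timedelta(hours=1)):
        self.store = store
        self.allowed_lateness = allowed_lateness
        self.seen_event_ids = set()     # dedupe buffer (bounded by a TTL in practice)
        self.counts = {}                # (user_id, window_end) -> count
        self.watermark = datetime.min   # event-time low watermark

    @staticmethod
    def window_end(ts: datetime) -> datetime:
        day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        return day_start + timedelta(days=1)

    def process(self, event: ClickEvent) -> None:
        # 1. Dedupe on event UUID so a replayed event cannot double count.
        if event.event_id in self.seen_event_ids:
            return
        self.seen_event_ids.add(event.event_id)

        # 2. Advance the watermark; events older than it are left to a
        #    compensating batch correction instead of the live aggregate.
        self.watermark = max(self.watermark, event.event_time - self.allowed_lateness)
        if event.event_time < self.watermark:
            return

        # 3. Idempotent upsert keyed by (entity, window end): writing the new
        #    total overwrites the previous value rather than adding to it.
        key = (event.user_id, self.window_end(event.event_time))
        self.counts[key] = self.counts.get(key, 0) + 1
        self.store.upsert(key, self.counts[key])
```

Keying the upsert by entity and window end is what makes retries and replays safe; a version component can be added to the key when feature definitions change, matching the keying described above.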
💡 Key Takeaways
Batch materialization runs on a schedule (hourly or daily) with hours of latency but high cost efficiency; processing 500 gigabytes to generate 100 million feature rows costs 20 to 50 dollars, and incremental backfills reduce compute by a factor of 5 to 10
Streaming materialization via Kafka or Flink achieves seconds-to-minutes freshness for fraud or Estimated Time of Arrival features, but costs 2 to 5 times more than batch due to always-on clusters and exactly-once complexity
Request-time transforms compute per inference (session length, time deltas) with zero staleness but consume inference latency budget; keep them under 1 to 2 milliseconds to avoid blowing a 50 millisecond Service Level Agreement (see the sketch after this list)
Late and duplicate events in streaming cause incorrect window aggregates (double counting); mitigation requires event-time watermarks, idempotent upserts keyed by entity and window end, and dedupe buffers using event Universally Unique Identifiers
Pattern selection: batch for user history with hours of staleness tolerance, streaming for real-time signals when the 2 to 5 times higher cost is justified, request time for per-session signals with lightweight compute
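A minimal sketch of a request-time transform held to a latency budget, assuming a hypothetical request payload that carries timezone-aware session timestamps (field names are illustrative):

```python
import time
from datetime import datetime, timezone

def request_time_features(request: dict, budget_ms: float = 2.0) -> dict:
    """Compute per-request features from raw signals at inference time.

    `request` is assumed to hold timezone-aware datetimes under
    'session_start' and 'last_event_time'.
    """
    start = time.perf_counter()
    now = datetime.now(timezone.utc)

    features = {
        "session_length_s": (now - request["session_start"]).total_seconds(),
        "secs_since_last_event": (now - request["last_event_time"]).total_seconds(),
        "is_weekend": now.weekday() >= 5,   # lightweight encoding, no store lookup
    }

    # Track the transform's share of the serving SLA: 2 ms inside a 50 ms
    # budget is manageable, 20 ms is not. In production, emit a metric here.
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    if elapsed_ms > budget_ms:
        print(f"request-time transform over budget: {elapsed_ms:.2f} ms")
    return features
```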
📌 Examples
Uber Michelangelo uses Kafka-to-Flink streaming pipelines to update Estimated Time of Arrival features within seconds, processing millions of trip events per second with idempotent upserts to a Cassandra-backed online store
Airbnb Zipline runs daily Spark backfills for search ranking features, computing 7-day user engagement aggregates overnight with partition pruning to process only new data since the last watermark (a sketch of this incremental pattern follows)
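A PySpark-style sketch of that incremental pattern, assuming an events table partitioned by a string event_date column with user_id and session_id fields (paths and column names are hypothetical); it illustrates the approach, not Zipline's actual code:

```python
from datetime import date, timedelta
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("incremental_backfill").getOrCreate()

def incremental_backfill(events_path: str, offline_path: str, last_watermark: date) -> date:
    """Recompute 7-day aggregates only for users touched since the last watermark."""
    today = date.today()

    # Partition pruning: filtering on the event_date partition column means
    # Spark reads only the daily partitions written after the watermark.
    new_events = (
        spark.read.parquet(events_path)
        .filter(F.col("event_date") > F.lit(last_watermark.isoformat()))
    )
    affected_users = new_events.select("user_id").distinct()

    # A trailing 7-day window still needs the last seven partitions, but only
    # for users whose aggregates actually changed.
    window_start = (today - timedelta(days=7)).isoformat()
    aggregates = (
        spark.read.parquet(events_path)
        .filter(F.col("event_date") >= F.lit(window_start))
        .join(affected_users, on="user_id", how="inner")
        .groupBy("user_id")
        .agg(
            F.count("*").alias("events_7d"),
            F.countDistinct("session_id").alias("sessions_7d"),
        )
        .withColumn("feature_date", F.lit(today.isoformat()))
    )

    # Append the new daily snapshot to the offline store; a separate step
    # would upsert the same rows into the online store.
    aggregates.write.mode("append").partitionBy("feature_date").parquet(offline_path)
    return today  # persist as the watermark for the next run
```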