
Dual Store Architecture: Offline and Online Feature Stores

Feature stores use a dual-store pattern that separates storage into two complementary systems: an offline store for historical training data and an online store for low-latency inference. The offline store lives in your data lake or warehouse and handles petabyte-scale historical data in columnar formats optimized for batch processing. It serves point-in-time joins, backfills, and dataset generation, where a typical 100-million-example backfill might scan 5 terabytes of feature history over hours using 100 to 200 virtual Central Processing Units (vCPUs).

The online store is a region-local key-value database optimized for sub-10-millisecond reads at high Queries Per Second (QPS). Netflix serves millions of feature reads per second with sub-millisecond p50 latencies using EVCache (an in-memory cache tier). Uber's online store, built on Cassandra, delivers p99 latencies in the tens of milliseconds at peak, across millions of entities spanning drivers, trips, and users. The key is co-location: keep feature serving in the same Availability Zone (AZ) as your model servers to avoid cross-AZ network hops that add 5 to 15 milliseconds.

A metadata registry ties both stores together with entity definitions, feature schemas, lineage tracking, and version control. When you define a feature group, the same transformation logic generates both offline training datasets and online serving values. This solves the training-serving parity problem: the offline Area Under the Curve (AUC) you see during training matches online performance because features come from identical pipelines.

The tradeoff is operational complexity. You maintain two copies of your data and two materialization pipelines, and distribution drift can occur if offline backfills run without updating the online store, causing model quality drops. For low-QPS systems under 100 requests per second that can tolerate 1 to 5 minutes of staleness, a single data warehouse with on-demand transforms may be simpler and can cut infrastructure costs by 50 to 70 percent.
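As a sketch of how this plays out in Feast (one of the systems named in this topic), the example below defines a driver entity and a feature view once, then reads the same features through both paths. The source path, feature names, and entity values are hypothetical, and the exact API surface varies across Feast versions.

```python
from datetime import timedelta

import pandas as pd
from feast import Entity, FeatureStore, FeatureView, Field, FileSource
from feast.types import Float32, Int64

# One definition drives both stores: the offline source feeds training,
# and materialization copies the latest values into the online key-value store.
driver = Entity(name="driver", join_keys=["driver_id"])

driver_stats = FeatureView(
    name="driver_hourly_stats",
    entities=[driver],
    ttl=timedelta(days=1),
    schema=[
        Field(name="conv_rate", dtype=Float32),
        Field(name="trips_today", dtype=Int64),
    ],
    source=FileSource(  # hypothetical Parquet path in the data lake
        path="s3://feature-lake/driver_hourly_stats.parquet",
        timestamp_field="event_timestamp",
    ),
)

store = FeatureStore(repo_path=".")

# Offline path: point-in-time join against label timestamps for training.
entity_df = pd.DataFrame({
    "driver_id": [1001, 1002],
    "event_timestamp": pd.to_datetime(["2024-01-01", "2024-01-02"]),
})
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:trips_today"],
).to_df()

# Online path: the same feature names, served from the key-value store.
online_features = store.get_online_features(
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:trips_today"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
```

Because both reads reference the same registered feature view, the transformation and schema stay identical across training and serving, which is the parity guarantee described above.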
💡 Key Takeaways
Offline store uses columnar lake formats for petabyte-scale historical data, serving point-in-time joins and backfills that scan terabytes over hours using hundreds of vCPUs
Online store targets sub-10-millisecond p99 reads from region-local key-value databases, with Netflix achieving sub-millisecond p50 at millions of QPS using in-memory caching
Metadata registry enforces training-serving parity by ensuring the same transformation logic generates both offline training datasets and online serving values
Co-location matters: placing feature serving in the same Availability Zone as model servers avoids 5 to 15 millisecond cross-AZ network penalties
Operational tradeoff: maintaining two storage systems and two materialization pipelines introduces drift risk and complexity (a sync sketch follows this list); a single warehouse may suffice for low-QPS systems under 100 requests per second that tolerate 1 to 5 minutes of staleness
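The drift risk comes from the online copy lagging the offline one. A minimal mitigation, assuming a Feast-style FeatureStore object and an illustrative hourly cadence, is to materialize on the same schedule as the offline backfills:

```python
from datetime import datetime, timedelta, timezone

from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Push the latest feature values from the offline store into the online
# key-value store, up to "now". Running this right after offline backfills
# keeps both copies at the same watermark.
store.materialize_incremental(end_date=datetime.now(timezone.utc))

# Equivalent explicit-window form, e.g. for a scheduled hourly job:
end = datetime.now(timezone.utc)
store.materialize(start_date=end - timedelta(hours=1), end_date=end)
```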
📌 Examples
Airbnb Zipline stores offline features in the data lake with Spark-based backfills and point-in-time joins (the join semantics are sketched after these examples), while materializing to a Redis-like key-value store for single-digit-millisecond p99 online reads
Uber Michelangelo processes millions of events per second from Kafka through Flink streaming pipelines, writing to a Cassandra-backed online store with p99 latencies in the tens of milliseconds at peak traffic
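Both examples rely on point-in-time correctness. The pandas sketch below (column names, values, and the one-day staleness cutoff are illustrative, not taken from either system) shows the core rule: each label row is joined to the most recent feature value observed at or before its timestamp, never a future one.

```python
import pandas as pd

# Feature history: one row per (driver_id, event_timestamp) observation.
features = pd.DataFrame({
    "driver_id": [1, 1, 2],
    "event_timestamp": pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-02 00:00", "2024-01-01 12:00"]
    ),
    "conv_rate": [0.10, 0.15, 0.30],
})

# Labels: the timestamps at which training examples are generated.
labels = pd.DataFrame({
    "driver_id": [1, 2],
    "label_timestamp": pd.to_datetime(["2024-01-01 18:00", "2024-01-03 00:00"]),
    "took_trip": [1, 0],
})

# Point-in-time join: for each label row, take the latest feature row whose
# event_timestamp <= label_timestamp, within a 1-day staleness tolerance.
training = pd.merge_asof(
    labels.sort_values("label_timestamp"),
    features.sort_values("event_timestamp"),
    left_on="label_timestamp",
    right_on="event_timestamp",
    by="driver_id",
    direction="backward",
    tolerance=pd.Timedelta("1d"),
)
```

In this toy data, driver 1's 18:00 label picks up the midnight value (0.10) rather than the next day's, and driver 2's only feature row falls outside the one-day tolerance, so it comes back null instead of leaking a stale or future value.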