Dual Store Architecture: Offline and Online Feature Stores
Offline Store
Lives in your data lake or warehouse and handles petabyte-scale historical data in columnar formats optimized for batch processing. It serves point-in-time joins, backfills, and dataset generation; a typical 100-million-example backfill might scan 5 TB of feature history over several hours using 100 to 200 vCPUs.
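The point-in-time join can be sketched with pandas `merge_asof`, which for each training label picks the most recent feature value at or before the label's timestamp. The table and column names here are illustrative, not from any particular store:

```python
import pandas as pd

# Historical feature values, one row per (entity, timestamp) -- illustrative data.
features = pd.DataFrame({
    "user_id": [1, 1, 2],
    "event_ts": pd.to_datetime(["2024-01-01", "2024-01-05", "2024-01-02"]),
    "avg_spend_7d": [10.0, 12.5, 30.0],
}).sort_values("event_ts")

# Training labels, stamped with the time the prediction would have been made.
labels = pd.DataFrame({
    "user_id": [1, 2],
    "label_ts": pd.to_datetime(["2024-01-04", "2024-01-03"]),
    "churned": [0, 1],
}).sort_values("label_ts")

# For each label, take the latest feature value at or before label_ts,
# per entity -- this keeps future feature values from leaking into training.
train = pd.merge_asof(
    labels, features,
    left_on="label_ts", right_on="event_ts",
    by="user_id", direction="backward",
)
print(train[["user_id", "label_ts", "avg_spend_7d", "churned"]])
```

User 1's label at 2024-01-04 picks up the 2024-01-01 value (10.0), not the later 2024-01-05 value, which is exactly the leakage the join prevents.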
Online Store
A region-local key-value database optimized for sub-10-millisecond reads at high QPS. Netflix serves millions of feature reads per second with sub-millisecond p50 latencies using EVCache (an in-memory cache tier). Uber's online store, built on Cassandra, delivers tens of milliseconds at p99 under peak load across millions of entities. The key is co-location: keep feature serving in the same AZ as your model servers to avoid cross-AZ network hops that add 5 to 15 ms.
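At serving time the online read path reduces to a batched key-value lookup keyed by entity ID. A minimal sketch, using a plain dict to stand in for the KV store and an illustrative `entity:id:group` key schema (both are assumptions, not any vendor's format):

```python
import json

# Stand-in for the online store; in production this is a region-local
# KV database (Redis, Cassandra, EVCache, ...).
online_store = {
    "user:42:profile_features": json.dumps({"avg_spend_7d": 12.5, "sessions_24h": 3}),
    "user:7:profile_features": json.dumps({"avg_spend_7d": 88.0, "sessions_24h": 11}),
}

def get_online_features(entity_ids, feature_group="profile_features"):
    """Batched point read: fetch every entity in the request at once.

    Against a real store this would be a single multi-key read (e.g. MGET),
    so the whole request costs one network round trip.
    """
    keys = [f"user:{eid}:{feature_group}" for eid in entity_ids]
    raw = [online_store.get(k) for k in keys]
    return {eid: (json.loads(r) if r else None) for eid, r in zip(entity_ids, raw)}

feats = get_online_features([42, 7, 99])
print(feats[42]["avg_spend_7d"])  # 12.5
print(feats[99])                  # None: entity not materialized yet
```

Batching matters because the latency budget is dominated by network hops, not payload size; one multi-key read keeps the 5 to 15 ms cross-hop penalty from multiplying per feature.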
Metadata Registry
Ties both stores together with entity definitions, feature schemas, lineage tracking, and version control. When you define a feature group, the same transformation logic generates both offline training datasets and online serving values. This solves the training-serving skew problem: the offline AUC you see during training matches online performance because features come from identical pipelines.
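One way to sketch the "define once, serve twice" idea: the registry entry holds a single transform function that both the batch backfill and the online materializer call. The class, registry, and function names here are hypothetical, not any framework's API:

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class FeatureGroup:
    # Hypothetical registry entry: entity key, version, and the one
    # transform shared by the offline and online pipelines.
    name: str
    entity: str
    version: int
    transform: Callable[[dict], Dict[str, float]]

REGISTRY: Dict[str, FeatureGroup] = {}

def register(fg: FeatureGroup) -> FeatureGroup:
    REGISTRY[f"{fg.name}:v{fg.version}"] = fg
    return fg

def spend_features(raw: dict) -> Dict[str, float]:
    # Single source of truth for the transformation logic.
    return {"avg_spend_7d": sum(raw["purchases_7d"]) / 7}

fg = register(FeatureGroup("spend", entity="user_id", version=1,
                           transform=spend_features))

row = {"user_id": 42, "purchases_7d": [7.0, 0.0, 14.0, 0.0, 0.0, 7.0, 0.0]}
offline_value = fg.transform(row)  # used when building training datasets
online_value = fg.transform(row)   # used when materializing to the KV store
assert offline_value == online_value  # parity by construction
```

Because both paths dispatch through the same registered callable, there is no second implementation to drift; versioning the entry lets old training sets remain reproducible after the logic changes.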
The Complexity Trade-off
You maintain two copies of your data and two materialization pipelines. Distribution drift can occur if offline backfills run without updating the online store. For low-QPS systems (under 100 requests per second) that tolerate 1 to 5 minutes of staleness, a single data warehouse with on-demand transforms may be simpler and can cut infrastructure costs by 50 to 70 percent.