Feature Engineering & Feature Stores › Feature Store Architecture (Feast, Tecton, Hopsworks)

Dual Store Architecture: Offline and Online Feature Stores

Definition
Feature stores use a dual-store pattern that separates storage into two complementary systems: an offline store for historical training data (petabyte scale, columnar formats, batch processing) and an online store for low-latency inference (sub-10ms reads at high QPS).

Offline Store

Lives in your data lake or warehouse and handles petabyte-scale historical data in columnar formats optimized for batch processing. It serves point-in-time joins, backfills, and dataset generation; a typical 100-million-example backfill might scan 5 terabytes of feature history over hours using 100 to 200 vCPUs.
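The core of a point-in-time join is: for each training label, pick the latest feature value at or before the label's event timestamp, never a later one (which would leak the future into training). A minimal sketch with pandas, using hypothetical driver data:

```python
import pandas as pd

# Label events: for each (entity, timestamp) we need feature values as they
# existed at that moment -- never later (no label leakage).
labels = pd.DataFrame({
    "driver_id": [1, 1, 2],
    "event_ts": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-01-10"]),
    "label": [0, 1, 1],
})

# Historical feature snapshots as stored in the offline store.
features = pd.DataFrame({
    "driver_id": [1, 1, 2],
    "feature_ts": pd.to_datetime(["2024-01-01", "2024-01-15", "2024-01-08"]),
    "avg_rating": [4.5, 4.7, 4.2],
})

# merge_asof picks, per label row, the latest feature row at or before event_ts.
training_df = pd.merge_asof(
    labels.sort_values("event_ts"),
    features.sort_values("feature_ts"),
    left_on="event_ts",
    right_on="feature_ts",
    by="driver_id",
)
print(training_df[["driver_id", "event_ts", "avg_rating", "label"]])
```

In production this same logic runs as a distributed Spark or warehouse job over terabytes of feature history rather than an in-memory merge.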

Online Store

A region-local key-value database optimized for sub-10-millisecond reads at high QPS. Netflix serves millions of feature reads per second with sub-millisecond p50 latencies using EVCache (an in-memory cache tier). Uber's online store, built on Cassandra, delivers tens-of-milliseconds p99 at peak with millions of entities. The key is co-location: keeping feature serving in the same AZ as your model servers to avoid cross-AZ network hops that add 5 to 15ms.
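What makes these latencies achievable is the access pattern: the online read path is pure batched point lookups by entity key, with no joins or scans. A toy sketch, with a Python dict standing in for the key-value store (EVCache, Redis, or Cassandra in real deployments) and made-up feature names:

```python
# Keys are (feature_view, entity_id); values are the latest feature vector.
# A dict stands in here for the real region-local key-value store.
online_store: dict = {
    ("driver_stats", 1001): {"avg_rating": 4.7, "trips_7d": 52.0},
    ("driver_stats", 1002): {"avg_rating": 4.2, "trips_7d": 31.0},
}

def get_online_features(view: str, entity_ids: list) -> list:
    """Batched point lookups -- one key-value read per entity, no joins or scans."""
    return [online_store.get((view, eid), {}) for eid in entity_ids]

# A model server fetches features for a batch of inference requests in one call.
rows = get_online_features("driver_stats", [1001, 1002, 9999])
print(rows)  # unknown entity 9999 yields an empty dict; caller falls back to defaults
```

The miss-handling choice matters in practice: serving defaults for unknown entities keeps p99 flat instead of falling back to an expensive offline lookup.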

Metadata Registry

Ties both stores together with entity definitions, feature schemas, lineage tracking, and version control. When you define a feature group, the same transformation logic generates both offline training datasets and online serving values. This solves the training-serving parity problem: the offline AUC you see during training matches online performance because features come from identical pipelines.
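The parity guarantee comes from registering the transformation once and invoking it from both paths. A minimal sketch of the idea (the function name and event schema here are illustrative, not any specific feature store's API):

```python
# One transformation, defined once, reused by both materialization paths.
def compute_trip_features(trips: list) -> dict:
    """Shared logic: the offline backfill job and the online stream job
    both call this exact function, so values cannot diverge."""
    fares = [t["fare"] for t in trips]
    return {
        "trip_count": float(len(fares)),
        "avg_fare": sum(fares) / len(fares) if fares else 0.0,
    }

trips = [{"fare": 12.0}, {"fare": 8.0}]

# Offline path: a batch job computes features over historical partitions.
training_row = compute_trip_features(trips)

# Online path: the streaming job applies the *same* function before writing
# to the key-value store, so serving values match training values exactly.
serving_row = compute_trip_features(trips)

assert training_row == serving_row  # training-serving parity by construction
```

The anti-pattern this prevents is two hand-maintained implementations (say, a SQL backfill and a Java stream job) that silently drift apart.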

The Complexity Trade-off

You maintain two copies of your data and two materialization pipelines. Distribution drift can occur if offline backfills run without updating the online store. For low-QPS systems (under 100 requests per second) that tolerate 1-to-5-minute staleness, a single data warehouse with on-demand transforms may be simpler and can cut infrastructure costs by 50 to 70 percent.
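The drift risk can be monitored cheaply: sample entities and compare what the online store is serving against the latest offline truth. A minimal sketch, with dicts standing in for both stores and made-up values:

```python
# Latest backfilled values (offline store) vs. values currently served (online).
# In a real check these would be reads against the warehouse and the KV store.
offline = {1001: 4.7, 1002: 4.2, 1003: 3.9}
online  = {1001: 4.7, 1002: 4.0, 1003: 3.9}

def find_drifted(offline: dict, online: dict, tol: float = 1e-6) -> list:
    """Return entity ids whose served value diverges from the offline truth
    (including entities the online store is missing entirely)."""
    drifted = []
    for eid, expected in offline.items():
        served = online.get(eid)
        if served is None or abs(served - expected) > tol:
            drifted.append(eid)
    return drifted

print(find_drifted(offline, online))
```

Running a check like this after every backfill catches the exact failure mode described above: an offline refresh that never propagated to serving.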

💡 Key Takeaways
- Offline store uses columnar lake formats for petabyte-scale historical data, serving point-in-time joins and backfills that scan terabytes over hours using hundreds of vCPUs
- Online store targets sub-10-millisecond p99 reads from region-local key-value databases, with Netflix achieving sub-millisecond p50 at millions of QPS using in-memory caching
- Metadata registry enforces training-serving parity by ensuring the same transformation logic generates both offline training datasets and online serving values
- Co-location matters: placing feature serving in the same Availability Zone as model servers avoids 5-to-15-millisecond cross-AZ network penalties
- Operational trade-off: maintaining two storage systems and materialization pipelines introduces drift risk and complexity; a single warehouse may suffice for low-QPS systems under 100 requests per second with 1-to-5-minute staleness tolerance
📌 Interview Tips
1. Airbnb Zipline stores offline features in the data lake with Spark-based backfills and point-in-time joins, while materializing to a Redis-like key-value store for single-digit-millisecond p99 online reads
2. Uber Michelangelo processes millions of events per second from Kafka through Flink streaming pipelines, writing to a Cassandra-backed online store with tens-of-milliseconds p99 latency at peak traffic