Online Feature Store Architecture for Sub-10ms Reads
Why Dedicated Online Stores
General-purpose databases add latency: query parsing, transaction overhead, index traversal. Online feature stores optimize for a single access pattern: given an entity key, return all features for that entity. This specialization enables sub-10ms reads even with hundreds of features per entity.
Architecture Pattern: The online store is a read-optimized cache populated by batch or streaming pipelines. Features are pre-computed offline and written to the store ahead of time. At serving time, the only work is a key-value lookup: no computation, no joins, no aggregation.
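The write-ahead, read-by-key pattern above can be sketched with a plain dictionary standing in for the store. This is a minimal illustration, not a real client: the class name InMemoryOnlineStore, the key "user:42", and the feature names are all hypothetical.

```python
from typing import Dict, Optional

FeatureRow = Dict[str, float]


class InMemoryOnlineStore:
    """Toy online store: a dict keyed by entity ID (stand-in for a real key-value store)."""

    def __init__(self) -> None:
        self._rows: Dict[str, FeatureRow] = {}

    def write(self, entity_key: str, features: FeatureRow) -> None:
        # Called by the batch or streaming pipeline with pre-computed values.
        self._rows[entity_key] = dict(features)

    def read(self, entity_key: str) -> Optional[FeatureRow]:
        # Serving path: a single key lookup, with no joins or aggregation.
        row = self._rows.get(entity_key)
        return dict(row) if row is not None else None


store = InMemoryOnlineStore()
# Pipeline side: write pre-computed features (hypothetical names and values).
store.write("user:42", {"orders_7d": 3.0, "avg_basket_30d": 27.5})
# Serving side: fetch everything for the entity by key.
features = store.read("user:42")
missing = store.read("user:unknown")
```

All of the expensive work happens in write; read never touches raw events, which is what keeps serving latency predictable.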
Storage Layer Options
Redis provides sub-millisecond reads with in-memory storage. DynamoDB offers durability with single-digit millisecond latency. Cassandra scales to billions of keys with tunable consistency. The choice depends on data volume, durability requirements, and cost tolerance. Many systems use Redis for hot data in front of a persistent backing store.
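One way to combine a hot in-memory tier with a persistent backing store is a read-through lookup: check the hot tier first, fall back to the backing store on a miss, and copy the value into the hot tier for the next read. The sketch below uses two dictionaries in place of real clients. TieredStore and its attribute names are hypothetical, and eviction and expiry are deliberately left out.

```python
from typing import Dict, Optional


class TieredStore:
    """Read-through lookup over a hot tier and a persistent tier (both plain dicts here)."""

    def __init__(self, hot: Dict[str, bytes], backing: Dict[str, bytes]) -> None:
        self.hot = hot          # stand-in for an in-memory store such as Redis
        self.backing = backing  # stand-in for a durable store
        self.hot_hits = 0
        self.backing_reads = 0

    def get(self, key: str) -> Optional[bytes]:
        value = self.hot.get(key)
        if value is not None:
            self.hot_hits += 1
            return value
        # Miss in the hot tier: pay the slower read once, then cache the result.
        self.backing_reads += 1
        value = self.backing.get(key)
        if value is not None:
            self.hot[key] = value
        return value


tiered = TieredStore(hot={}, backing={"user:42": b"feature-blob"})
first = tiered.get("user:42")   # served from the backing store, then cached
second = tiered.get("user:42")  # served from the hot tier
```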
Multi-Get Optimization
Fetching 100 features with 100 individual, sequential requests takes 50-100ms, since each request costs roughly 0.5-1ms and network round-trips dominate. A multi-get fetches all features in a single round-trip, for 1-5ms total: the client sends a list of keys and the store returns all values together. For entities with many features, this optimization is critical for staying within the latency budget.
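The difference comes down to the number of round-trips, which the following sketch counts explicitly. CountingStore is a hypothetical in-memory stand-in, and its get and mget methods only mimic the shape of a real client (Redis, for instance, exposes a multi-key read as the MGET command). No network is involved, so the example shows the request count rather than real latency.

```python
from typing import Dict, List, Optional


class CountingStore:
    """In-memory stand-in that counts how many requests the client makes."""

    def __init__(self, data: Dict[str, float]) -> None:
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key: str) -> Optional[float]:
        self.round_trips += 1  # one request per key
        return self._data.get(key)

    def mget(self, keys: List[str]) -> List[Optional[float]]:
        self.round_trips += 1  # one request for the whole batch
        return [self._data.get(k) for k in keys]


# 100 hypothetical feature keys for one entity.
keys = [f"user:42:f{i}" for i in range(100)]
counting = CountingStore({k: float(i) for i, k in enumerate(keys)})

one_by_one = [counting.get(k) for k in keys]
sequential_trips = counting.round_trips

counting.round_trips = 0
batched = counting.mget(keys)
batched_trips = counting.round_trips
```

Both paths return the same values. Only the request count differs, and at a fraction of a millisecond per round-trip that count is what separates the two latency ranges quoted above.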
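Production Tip: Colocate all features for an entity in a single key-value pair (serialized blob). This guarantees single-key retrieval regardless of feature count, eliminating multi-get overhead entirely. The trade-off is on the write side: updating any one feature means rewriting the entity's whole blob.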
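A minimal sketch of the single-blob layout follows. JSON is used only because it is in the standard library and easy to inspect. The serialization format, the helper names encode_features and decode_features, and the sample features are assumptions for illustration, and a compact binary format may be a better fit in practice.

```python
import json
from typing import Dict


def encode_features(features: Dict[str, float]) -> bytes:
    # Sorted keys and compact separators keep the blob small and deterministic.
    return json.dumps(features, sort_keys=True, separators=(",", ":")).encode("utf-8")


def decode_features(blob: bytes) -> Dict[str, float]:
    return json.loads(blob.decode("utf-8"))


# One key per entity; the value holds every feature for that entity.
kv: Dict[str, bytes] = {}
kv["user:42"] = encode_features({"orders_7d": 3.0, "avg_basket_30d": 27.5, "is_premium": 1.0})

# Serving path: a single lookup, then a local decode.
row = decode_features(kv["user:42"])
```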
Feature Freshness
Pre-computed features become stale: user activity in the last minute is not reflected in features that are updated hourly. Two common solutions are streaming pipelines for near-real-time updates (seconds of delay) and hybrid approaches that combine pre-computed baseline features with recent-activity signals computed in real time.
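The hybrid approach can be sketched as a merge of two sources at read time: a baseline row from the online store and a small sliding-window counter maintained from live events. Everything named here is hypothetical (RecentActivityCounter, hybrid_features, the clicks_last_60s feature), and the counter assumes events are recorded in timestamp order.

```python
from collections import deque
from typing import Deque, Dict


class RecentActivityCounter:
    """Counts events inside a sliding time window (timestamps in seconds)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[float] = deque()

    def record(self, timestamp: float) -> None:
        # Assumes timestamps arrive in non-decreasing order.
        self._events.append(timestamp)

    def count(self, now: float) -> int:
        cutoff = now - self.window_seconds
        # Drop events that have fallen out of the window.
        while self._events and self._events[0] < cutoff:
            self._events.popleft()
        return len(self._events)


def hybrid_features(baseline: Dict[str, float],
                    counter: RecentActivityCounter,
                    now: float) -> Dict[str, float]:
    # Start from the pre-computed (possibly stale) row, then overlay the fresh signal.
    merged = dict(baseline)
    merged["clicks_last_60s"] = float(counter.count(now))
    return merged


clicks = RecentActivityCounter(window_seconds=60.0)
for ts in (100.0, 130.0, 170.0):
    clicks.record(ts)

baseline_row = {"clicks_30d": 412.0}  # hypothetical hourly-updated feature
served = hybrid_features(baseline_row, clicks, now=180.0)
```

With now=180.0 and a 60-second window, the event at 100.0 falls outside the window, so the served row carries a recent-click count of 2 alongside the unchanged baseline value.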