Feature Sharing & Discovery: The Dual-Plane Architecture
Feature sharing and discovery solve a fundamental ML platform problem: hundreds of models across different teams need consistent, low-latency access to thousands of features without rebuilding them from scratch. The solution is a dual-plane architecture with a registry that acts as the central nervous system.
The offline plane computes and stores large-scale, point-in-time-correct feature snapshots for training. Think terabyte-scale datasets with time-travel joins that prevent data leakage. Netflix Zipline processes daily training sets at TB scale with multi-month backfills. The online plane serves low-latency feature values for inference, typically targeting 5 to 20 milliseconds at p95 to fit within sub-100-millisecond end-to-end prediction budgets. Uber Michelangelo handles millions of events per minute with streaming updates and sub-minute freshness.
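To make point-in-time correctness concrete, here is a minimal sketch of a time-travel join using pandas' merge_asof; the entity, feature names, and data are illustrative. For each training label, it picks the most recent feature value at or before the label's event timestamp, never a later one:

```python
import pandas as pd

# Training labels: one row per (entity, event time) pair we want to score.
labels = pd.DataFrame({
    "user_id": [1, 1, 2],
    "event_time": pd.to_datetime(["2024-03-01", "2024-03-15", "2024-03-10"]),
    "label": [1, 0, 1],
})

# Feature snapshots: each row is the feature's value as of its timestamp.
features = pd.DataFrame({
    "user_id": [1, 1, 2, 2],
    "feature_time": pd.to_datetime(["2024-02-20", "2024-03-10", "2024-02-01", "2024-03-12"]),
    "purchases_30d": [3, 5, 1, 4],
})

# Time-travel join: for each label, take the latest feature value at or
# before event_time (direction="backward"). A naive latest-value join would
# leak the 2024-03-12 snapshot into the 2024-03-10 label for user 2.
train = pd.merge_asof(
    labels.sort_values("event_time"),
    features.sort_values("feature_time"),
    left_on="event_time",
    right_on="feature_time",
    by="user_id",
    direction="backward",
)
print(train[["user_id", "event_time", "purchases_30d", "label"]])
```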
The feature registry binds these planes together. It stores canonical definitions, entity keys, data lineage, quality signals such as null rates and drift scores, owners, and usage statistics. Discovery is not just search: LinkedIn Feathr ranks features by usage frequency, model performance attribution, freshness adherence, and stability, so teams can evaluate whether a feature is fit for purpose before committing to it. The registry also enforces training-serving parity: the same transformation logic and data contracts apply in both batch and real-time paths, preventing the silent accuracy degradation that skew causes.
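A sketch of what one registry entry might capture; this is a hypothetical API, not Feathr's or any other product's, but it shows how a single canonical transformation plus metadata gives both planes one source of truth:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class FeatureDefinition:
    """One canonical registry entry shared by the offline and online planes."""
    name: str
    entity_key: str               # e.g. "user_id"
    owner: str
    transform: Callable           # single transformation used by batch AND streaming jobs
    freshness_sla_s: int          # how stale an online value is allowed to be
    null_rate: Optional[float] = None    # quality signal surfaced at discovery time
    drift_score: Optional[float] = None
    tags: dict = field(default_factory=dict)

# Written once, imported by both the backfill job and the streaming job,
# so the batch and real-time paths cannot drift apart silently.
def count_purchases(raw_events):
    return sum(1 for e in raw_events if e["type"] == "purchase")

PURCHASES_30D = FeatureDefinition(
    name="purchases_30d",
    entity_key="user_id",
    owner="growth-ml@example.com",
    transform=count_purchases,
    freshness_sla_s=60,
)
```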
Production systems report 30 to 70 percent feature reuse rates across models, cutting model onboarding from weeks to days. Airbnb surfaces quality metrics and example notebooks in its discovery portal. The operational contract is strict: online fetches must scale from tens of thousands to millions of queries per second across tenants while holding p95 latency in the single-digit to low tens of milliseconds.
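One way the latency side of that contract shows up in code is as an explicit retrieval budget. The sketch below is hypothetical (the store client, key format, and default values are assumptions): it serves fallback values rather than blowing the end-to-end SLA when the online store is slow.

```python
import time

# Hypothetical fallback values used when the budget runs out.
FEATURE_DEFAULTS = {"purchases_30d": 0, "avg_session_min": 0.0}

def fetch_online_features(store, user_id, names, budget_ms=20):
    """Fetch feature values within a fixed latency budget.

    `store` is assumed to be any key-value client with a get(key) method
    (Redis, Cassandra, Venice, ...). Capping retrieval at ~20ms leaves room
    in a sub-100ms end-to-end prediction budget for the model itself.
    """
    deadline = time.monotonic() + budget_ms / 1000.0
    values = {}
    for name in names:
        if time.monotonic() >= deadline:
            # Budget exhausted: fill the rest from defaults instead of
            # stalling the whole prediction request.
            values.update({n: FEATURE_DEFAULTS[n] for n in names if n not in values})
            break
        value = store.get(f"{name}:{user_id}")
        values[name] = value if value is not None else FEATURE_DEFAULTS[name]
    return values
```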
💡 Key Takeaways
• Dual-plane architecture separates offline training (TB scale, point-in-time correct) from online serving (5 to 20ms p95, 10K to 1M QPS), with the registry enforcing consistency across both planes
• The feature registry is not just storage but active governance: it ranks features by usage and quality, surfaces null rates and drift scores, and enforces training-serving parity to prevent skew
• Production systems achieve 30 to 70 percent reuse rates, cutting model onboarding from weeks to days at Netflix, Uber, LinkedIn, and Airbnb
• The online serving constraint drives the architecture: single-digit to low-tens-of-milliseconds p95 latency within sub-100ms end-to-end inference budgets requires pre-materialization and aggressive caching
• Point-in-time correctness is mandatory: offline joins use event timestamps to prevent data leakage, and identical transformation logic in batch and streaming paths prevents silent accuracy drops
• Scale envelope: thousands of features, hundreds of models, millions of events per minute for streaming updates, and multi-month historical backfills at TB to PB scale
📌 Examples
Netflix Zipline manages thousands of features used by hundreds of personalization models, processes daily TB-scale training sets with multi-month backfills, and maintains single-digit to low-tens-of-milliseconds p95 for online retrieval
Uber Michelangelo ingests millions of events per minute for ETA and pricing models, achieves 5 to 20ms p95 online lookups, and generates multi-TB training sets with point-in-time joins to prevent leakage
LinkedIn Feathr reduces time-to-production from weeks to days by ranking features by usage frequency and model performance attribution, and integrates with Venice for single-digit-millisecond online reads
Airbnb Bighead targets sub-100ms end-to-end inference for search ranking, allocating low tens of milliseconds p95 to feature retrieval via pre-materialized stores and request coalescing