Data Contracts and Expectation-Based Monitoring
Data contracts shift monitoring from reactive firefighting to proactive agreement enforcement. A contract is a published specification of what a data producer guarantees about shape, freshness, and semantics, coupled with what consumers require to maintain their Service Level Objectives (SLOs). Producers commit to constraints like schema stability, non-null requirements on primary keys, referential integrity coverage above 99.9 percent when joining to entity tables, and arrival windows such as hourly partitions being complete by minute 10. Consumers codify their needs: PSI thresholds below 0.1 on critical features, required join key coverage, allowed late-arrival tolerance, and semantic rules like "if order status is shipped then shipped timestamp must be populated."
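As a concrete illustration, here is a minimal sketch of how such a contract might be expressed in code, assuming a homegrown dataclass representation; the field names, thresholds, and example rule below are illustrative, not any particular contract framework's API.

```python
from dataclasses import dataclass, field

@dataclass
class ProducerGuarantees:
    """What the producer promises about the dataset it publishes (illustrative)."""
    schema_version: str = "1.4.0"             # schema held stable within this version
    non_null_columns: list = field(default_factory=lambda: ["user_id"])
    min_referential_integrity: float = 0.999  # join coverage against entity tables
    arrival_deadline_minute: int = 10         # hourly partition complete by minute 10

@dataclass
class ConsumerRequirements:
    """What the consumer needs in order to hold its own SLOs (illustrative)."""
    max_psi: float = 0.1                      # drift budget on critical features
    critical_features: list = field(default_factory=lambda: ["price_per_night"])
    min_join_key_coverage: float = 0.999
    late_arrival_tolerance_s: int = 300
    semantic_rules: list = field(default_factory=lambda: [
        "order_status = 'shipped' IMPLIES shipped_timestamp IS NOT NULL",
    ])

@dataclass
class DataContract:
    """The published agreement a monitor validates deterministically."""
    dataset: str
    producer: ProducerGuarantees
    consumer: ConsumerRequirements

contract = DataContract("features.listing_daily", ProducerGuarantees(), ConsumerRequirements())
```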
This contractual approach makes monitoring deterministic rather than heuristic. Instead of tuning anomaly detectors that flag volume spikes during legitimate traffic surges, you validate explicit promises. A batch feature store processing 100 million to 500 million rows nightly runs column-level rules including non-null checks on required columns, a deduplication rate below 0.1 percent, monotonic increase constraints for cumulative features, and referential integrity validations. These checks run as pushdown aggregations directly in the warehouse and, by avoiding data movement, complete in 10 to 20 minutes per 1 billion rows on commodity compute.
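A sketch of what one such validation pass could look like, assuming a Spark SQL warehouse; the feature table features.daily_partition, the entity table entities.users, and the column names are hypothetical, and the thresholds mirror the contract terms above.

```python
# A single pushdown aggregation: the warehouse scans the partition once and
# returns only summary counts, which are then compared against the contract.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

CHECK_SQL = """
SELECT
  COUNT(*)                                                 AS row_count,
  COUNT(*) - COUNT(f.user_id)                              AS null_user_id,
  COUNT(*) - COUNT(DISTINCT f.user_id, f.content_id, f.ts) AS duplicate_rows,
  SUM(CASE WHEN f.cumulative_watch_minutes < f.prev_cum
           THEN 1 ELSE 0 END)                              AS monotonicity_violations,
  AVG(CASE WHEN e.user_id IS NULL THEN 0.0 ELSE 1.0 END)   AS referential_coverage
FROM (
  SELECT d.*,
         LAG(cumulative_watch_minutes) OVER (
           PARTITION BY user_id, content_id ORDER BY ts
         ) AS prev_cum
  FROM features.daily_partition d
) f
LEFT JOIN entities.users e ON f.user_id = e.user_id
"""

r = spark.sql(CHECK_SQL).first()
assert r.null_user_id == 0,                    "non-null contract breached"
assert r.duplicate_rows / r.row_count < 0.001, "deduplication rate above 0.1 percent"
assert r.monotonicity_violations == 0,         "cumulative feature decreased"
assert r.referential_coverage >= 0.999,        "referential integrity below 99.9 percent"
```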
The trade-off is the upfront investment in defining and maintaining contracts. Teams must document semantics, negotiate thresholds between producers and consumers, and version contracts as requirements evolve. However, this cost pays dividends during incidents. When a contract violation occurs, ownership is unambiguous and impact is pre-calculated through lineage. In Meta-scale systems, per-domain data stewards own contract definitions and respond to violations, cutting the mean time to assign responsibility from 30 to 45 minutes of detective work to under 60 seconds of automatic routing.
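A minimal sketch of that lineage-based routing, assuming the lineage graph and owner registry are available as simple lookups; every dataset name and address here is made up.

```python
# Routing a contract violation: walk a (hypothetical) lineage graph to collect
# the downstream blast radius, then attach the owning steward for the dataset.
LINEAGE = {  # dataset -> direct downstream consumers
    "features.listing_daily": ["models.pricing", "models.availability"],
    "models.pricing": [],
    "models.availability": [],
}
OWNERS = {  # per-domain steward on call for each dataset
    "features.listing_daily": "listings-data-steward@example.com",
}

def affected_consumers(dataset: str) -> list:
    """Transitively collect everything downstream of the violated dataset."""
    seen, stack = [], list(LINEAGE.get(dataset, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.append(node)
            stack.extend(LINEAGE.get(node, []))
    return seen

def route_violation(dataset: str, rule: str) -> dict:
    """Build the alert payload: owner plus the pre-calculated impact set."""
    return {
        "owner": OWNERS.get(dataset, "unowned"),
        "violated_rule": rule,
        "impacted": affected_consumers(dataset),
    }

print(route_violation("features.listing_daily", "referential_integrity >= 0.999"))
```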
Contracts also enable safe evolution. A producer wanting to add an optional field or change a data type can preview which downstream contracts would break, run canary deployments on a subset of consumers, and coordinate migrations. Without contracts, schema changes propagate silently until a downstream model sees unexpected nulls and recall drops by 15 to 30 percent.
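A sketch of how that breakage preview could work, assuming each consumer contract exposes the columns and types it requires; the contract entries and the proposed schema are hypothetical.

```python
# Preview which consumer contracts a proposed producer schema would break,
# before the change ships.
CONTRACTS = {
    "models.pricing":      {"user_id": "BIGINT", "price_per_night": "DOUBLE"},
    "models.feed_ranking": {"user_id": "BIGINT", "engagement_score": "FLOAT"},
}

def preview_breakage(proposed_schema: dict) -> dict:
    """Per consumer, list required columns that would go missing or change type."""
    report = {}
    for consumer, required in CONTRACTS.items():
        broken = {
            col: (typ, proposed_schema.get(col))
            for col, typ in required.items()
            if proposed_schema.get(col) != typ
        }
        if broken:
            report[consumer] = broken
    return report

# Proposed change: add an optional column and retype engagement_score FLOAT -> DOUBLE.
proposed = {
    "user_id": "BIGINT",
    "price_per_night": "DOUBLE",
    "engagement_score": "DOUBLE",
    "new_optional_field": "STRING",
}
print(preview_breakage(proposed))
# -> {'models.feed_ranking': {'engagement_score': ('FLOAT', 'DOUBLE')}}
```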
💡 Key Takeaways
• Producers publish guarantees on schema stability, non-null constraints, referential integrity above 99.9 percent, and arrival windows like hourly partitions complete by minute 10
• Consumers codify requirements including PSI thresholds below 0.1 on critical features, join key coverage expectations, late-arrival tolerance windows, and conditional semantic rules
• Batch feature stores run column-level contract validation in 10 to 20 minutes per 1 billion rows using warehouse pushdown aggregations on checks like deduplication rate below 0.1 percent
• Contract violations trigger automatic owner routing via lineage mapping, reducing mean time to assign from 30 to 45 minutes of investigation to under 60 seconds at Meta scale
• Contracts enable safe schema evolution by previewing downstream breakage, running canary deployments on subsets, and coordinating migrations instead of silent propagation
• Upfront investment in defining and versioning contracts trades initial cost for deterministic monitoring that avoids false positives from heuristic anomaly detection during legitimate traffic changes
📌 Examples
Airbnb feature store contract: producer guarantees user_id non-null and unique, location_id referential integrity above 99.9 percent, and daily partition arrival by 06:00 UTC; consumer pricing model requires PSI below 0.1 on the price_per_night and availability_30d features (a PSI computation sketch follows these examples)
Meta streaming feature pipeline contract: producer commits to event lag p95 under 120 seconds and schema version stability for 7 days; consumer feed ranking model requires missingness below 1 percent on engagement_score and allows a 5-minute late-arrival tolerance
Netflix batch feature computation: validates monotonic increase on cumulative_watch_minutes and a deduplication rate below 0.1 percent on the (user_id, content_id, timestamp) composite key, running full validation in 15 minutes on 500 million rows using Spark pushdown
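Several of these contracts gate on PSI below 0.1, so here is a small sketch of the PSI computation itself, binning the current sample against the reference distribution's quantiles; the simulated feature values stand in for something like price_per_night.

```python
# Population Stability Index: sum over bins of
# (actual_frac - expected_frac) * ln(actual_frac / expected_frac).
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI of a current sample against a reference sample, using quantile bins."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0] = min(edges[0], actual.min()) - 1e-9   # widen outer bins so nothing falls out
    edges[-1] = max(edges[-1], actual.max()) + 1e-9
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    e_frac = np.clip(e_frac, 1e-6, None)            # avoid log(0) and divide-by-zero
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(0)
reference = rng.normal(100, 20, 50_000)   # training-time distribution of the feature
current = rng.normal(105, 22, 50_000)     # today's serving distribution, mildly shifted
score = psi(reference, current)
print(f"PSI = {score:.3f}")               # a mild shift like this lands well below 0.1
assert score < 0.1, "PSI contract breach"
```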