What are Data Quality Dimensions?
Definition
Data Quality Dimensions are measurable properties that determine whether data is fit for its intended use. The three foundational dimensions are accuracy, completeness, and consistency.
✓ In Practice: These dimensions are not abstract theory. Companies like Uber, processing 20 billion events per day, define explicit Service Level Objectives (SLOs) for each dimension on each dataset and treat violations like availability incidents.
💡 Key Takeaways
✓ Accuracy means semantic correctness relative to real-world truth, not just syntactic validity of data types or formats
✓ Completeness measures what percentage of expected records actually arrived, typically tracked per time window and data source
✓ Consistency ensures data agrees with itself, within tables through constraints and across systems through reconciliation
✓ Each dimension requires a different enforcement strategy: accuracy through validation at ingestion, completeness through counting arrivals against expected volumes, consistency through audits and reconciliation
✓ Production systems define explicit SLOs per dimension per dataset, such as 99.9 percent completeness within 30 minutes
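An SLO like the one in the last takeaway can be sketched as a small check. This is a minimal illustration, not any company's actual implementation: `CompletenessSLO`, `completeness_ratio`, and `is_violation` are hypothetical names, and the sketch assumes the expected record count comes from a baseline you already trust.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class CompletenessSLO:
    """Hypothetical SLO: at least min_ratio of expected records by the deadline."""
    min_ratio: float      # e.g. 0.999 for 99.9 percent
    deadline: timedelta   # e.g. 30 minutes after the window closes


def completeness_ratio(observed: int, expected: int) -> float:
    """Fraction of expected records that actually arrived."""
    if expected <= 0:
        raise ValueError("expected must be a positive record count")
    return observed / expected


def is_violation(slo: CompletenessSLO, observed: int, expected: int,
                 elapsed: timedelta) -> bool:
    """Report a violation only once the deadline has passed.

    Before the deadline, a shortfall may simply be late data, so it is
    not counted as an SLO breach yet.
    """
    if elapsed < slo.deadline:
        return False
    return completeness_ratio(observed, expected) < slo.min_ratio


slo = CompletenessSLO(min_ratio=0.999, deadline=timedelta(minutes=30))
```

For instance, a window that expected 2,000,000 records and has received 1,200,000 has a ratio of 0.6. Under this sketch it is not flagged at 10 minutes, but it is flagged once 30 minutes have elapsed.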
📌 Examples
1. Accuracy violation: Mobile SDK bug swaps latitude and longitude. Values pass numeric range checks but all locations are wrong by hundreds of kilometers.
2. Completeness issue: Expected 2 million events between 10:00 and 10:05 UTC based on historical baseline, but only 1.2 million arrived due to partition lag.
3. Consistency failure: User profile shows status as active in cache but suspended in source database, creating contradictory application behavior.
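The three examples above can each be expressed as a simple check. The sketch below is illustrative only: the function names, the `PARIS_AREA` bounding box, and the sample user IDs are assumptions made for the example, not part of any real system. It shows why a range check alone misses swapped coordinates, and how a count comparison and a cross-system comparison surface the other two problems.

```python
from typing import Dict, List, Tuple

# (min_lat, max_lat, min_lon, max_lon)
BoundingBox = Tuple[float, float, float, float]

# Hypothetical service area roughly around Paris; values are illustrative.
PARIS_AREA: BoundingBox = (48.0, 49.5, 1.5, 3.5)


def passes_range_check(lat: float, lon: float) -> bool:
    """Syntactic validity: coordinates fall within legal numeric ranges."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def within_region(lat: float, lon: float, box: BoundingBox) -> bool:
    """Semantic plausibility: the point lies inside the expected service area."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def missing_records(observed: int, expected: int) -> int:
    """Completeness: how many expected records have not arrived."""
    return max(expected - observed, 0)


def status_mismatches(cache: Dict[str, str], source: Dict[str, str]) -> List[str]:
    """Consistency: user IDs present in both systems whose status disagrees."""
    shared = cache.keys() & source.keys()
    return sorted(uid for uid in shared if cache[uid] != source[uid])


# 1. Accuracy: a Paris location (48.85, 2.35) with latitude and longitude swapped.
swapped_lat, swapped_lon = 2.35, 48.85
range_ok = passes_range_check(swapped_lat, swapped_lon)          # still "valid"
region_ok = within_region(swapped_lat, swapped_lon, PARIS_AREA)  # not plausible

# 2. Completeness: 1.2 million of an expected 2 million events arrived.
shortfall = missing_records(observed=1_200_000, expected=2_000_000)

# 3. Consistency: the cache says active, the source database says suspended.
cache = {"user_1": "active", "user_2": "active"}
source = {"user_1": "suspended", "user_2": "active"}
conflicts = status_mismatches(cache, source)
```

The accuracy check relies on outside knowledge (where events should plausibly come from), which is why it cannot be reduced to type or range validation.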