
Data Products: The Core Building Block

What Makes It a Product: A data product is not just a table or dataset. It is an architectural quantum that bundles code, data, metadata, and infrastructure into an independently deployable unit. Think of it like a microservice, but for analytical data. The code includes ingestion logic that pulls from operational systems, transformation pipelines that clean and shape the data, access enforcement that restricts who can query what, and automated quality checks that validate freshness and accuracy. The data itself is the stored analytical representation, typically partitioned tables, time-series streams, or aggregated views. Metadata includes schemas registered in a central catalog, lineage showing where data originates and flows to, data dictionaries explaining field meanings, and SLOs that define guarantees like "data freshness under 10 minutes at p95" or "null rate below 1%."

Contract-Driven Design: Each data product exposes a clear contract. When the Payments domain publishes a Payment Failed Events product, the contract specifies the schema with exact field types, the update frequency (for example, near real time with under 5-minute lag), the data retention policy (perhaps 90 days for raw events, 2 years for aggregates), and access requirements (PII fields are masked unless you have specific approval). Consumers can discover this contract in the catalog and depend on it. This is fundamentally different from a traditional data warehouse, where a central team might change a column name or data type without warning, breaking downstream reports. With data products, breaking changes require versioning: you publish Payment Failed Events v2 alongside v1, give consumers time to migrate, then deprecate the old version on a defined schedule.
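To make the contract idea concrete, here is a minimal sketch of what such a published contract might look like, expressed as a Python data structure. The class name, field layout, and SLO keys are illustrative assumptions, not the API of any specific catalog or data mesh platform.

```python
from dataclasses import dataclass

# Hypothetical contract declaration for the Payment Failed Events product.
# The class name, field layout, and SLO keys are illustrative; they do not
# correspond to any specific catalog or data-mesh platform API.

@dataclass
class DataProductContract:
    name: str
    version: int
    schema: dict[str, str]       # field name -> type
    update_frequency: str        # e.g. "near real time, under 5 minute lag"
    retention: dict[str, str]    # tier -> retention period
    access: dict[str, str]       # field -> access policy
    slos: dict[str, str]         # guarantee -> target

payment_failed_v1 = DataProductContract(
    name="payment_failed_events",
    version=1,
    schema={
        "transaction_id": "STRING",
        "failure_reason": "STRING",
        "customer_email": "STRING",   # PII: masked unless the consumer is approved
        "timestamp": "TIMESTAMP",
    },
    update_frequency="near real time, under 5 minute lag",
    retention={"raw_events": "90 days", "aggregates": "2 years"},
    access={"customer_email": "masked_unless_approved"},
    slos={"freshness_p95": "under 10 minutes", "null_rate": "below 1%"},
)

# A breaking change (say, renaming failure_reason) would be published as
# version=2 alongside v1, with a migration window before v1 is deprecated.
```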
Typical Production Metrics
P50 query latency: 1-3 sec
P99 query latency: < 10 sec
Data freshness (p95): < 10 min
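As a rough illustration of how these targets could be checked, the sketch below compares a product's observed metrics against the thresholds above. The metric names and the checking function are assumptions for illustration, not part of any particular monitoring platform.

```python
# Minimal sketch: evaluate a data product's observed metrics against the SLO
# targets above. Metric names and the checking function are hypothetical.

SLO_TARGETS = {
    "query_latency_p50_sec": 3,     # 1-3 sec at p50
    "query_latency_p99_sec": 10,    # under 10 sec at p99
    "freshness_lag_min_p95": 10,    # under 10 min freshness at p95
    "null_rate_pct": 1.0,           # null rate below 1%
}

def check_slos(observed: dict[str, float]) -> list[str]:
    """Return human-readable descriptions of any SLO violations."""
    violations = []
    for metric, threshold in SLO_TARGETS.items():
        value = observed.get(metric)
        if value is not None and value > threshold:
            violations.append(f"{metric}={value} exceeds target {threshold}")
    return violations

# Example: ingestion lag has spiked well past the freshness SLO.
print(check_slos({"freshness_lag_min_p95": 30, "query_latency_p99_sec": 4}))
# ['freshness_lag_min_p95=30 exceeds target 10']
```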
Observability and Quality: Every data product should expose operational metrics. Ingestion lag tells you how far behind real time you are. If the Orders domain processes 20,000 events per second and your ingestion lag spikes to 30 minutes, something is wrong. Query latency metrics (p50, p99) show how responsive the product is for analysts. Data quality indicators track null rates, duplicate rates, and schema violations. Usage metrics are equally important: how many distinct consumers query this product daily? How many queries per day? If a product has had zero consumers for 90 days, it is a candidate for deprecation, freeing up resources. Without this observability, at Zalando's scale of over 200 data products, you would have no idea which products are critical and which are abandoned.

Infrastructure Automation: The self-serve platform automates infrastructure provisioning. When a domain creates a new data product, it declares requirements at a high level: "I need a partitioned table, updated every 5 minutes from this event stream, with PII fields tagged." The platform automatically provisions storage buckets, configures encryption at rest, sets up role-based access control, registers schemas, applies retention policies, and creates monitoring dashboards. Domain teams do not manually configure security groups or storage quotas. This automation is what makes data products scalable. Without it, each domain would need deep infrastructure expertise, and setup time would balloon from minutes to weeks.
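The sketch below shows one way such a high-level declaration might look, along with the provisioning steps the platform would automate behind it. The spec keys and the provision() function are invented for illustration and stand in for calls into real infrastructure tooling.

```python
# Hypothetical self-serve declaration: the domain team states what it needs,
# and the platform translates that into concrete infrastructure. The spec keys
# and the provision() function below are invented for illustration.

product_spec = {
    "name": "payment_failed_events",
    "storage": {"type": "partitioned_table", "partition_by": "event_date"},
    "ingestion": {"source": "payments.failed_events_stream", "frequency": "5 minutes"},
    "pii_fields": ["customer_email"],          # tagged so masking is applied
    "retention": {"raw": "90 days", "aggregates": "2 years"},
}

def provision(spec: dict) -> None:
    """Stand-in for the platform's automation. In a real platform each step
    would call infrastructure tooling (object storage, IAM, catalog, monitoring)."""
    print(f"create encrypted storage: {spec['storage']}")
    print(f"apply role-based access control and mask PII fields: {spec['pii_fields']}")
    print(f"register schema for '{spec['name']}' in the central catalog")
    print(f"apply retention policy: {spec['retention']}")
    print(f"create monitoring dashboards (lag, latency, quality) for '{spec['name']}'")

provision(product_spec)
```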
💡 Key Takeaways
A data product bundles code, data, metadata, and infrastructure into an independently deployable unit with clear ownership and SLOs
Contracts specify schema, update frequency, retention policy, and access requirements. Breaking changes require versioning (v1 alongside v2) with migration periods
Typical query latency for a single-domain product is 1 to 3 seconds at p50 and under 10 seconds at p99 under moderate concurrency
Every product exposes metrics: ingestion lag, query latency, data quality indicators (null rate, duplicate rate), and usage statistics (distinct consumers, queries per day)
The self serve platform automates infrastructure provisioning (storage, encryption, access control, monitoring), reducing setup time from weeks to minutes
📌 Examples
1. The Payments domain publishes Payment Failed Events v1 with a schema including transaction_id, failure_reason, and timestamp. SLOs guarantee under 5-minute lag and a null rate below 1%. PII fields like customer_email are automatically masked unless the consumer has approval.
2. At Zalando, with over 200 data products, observability metrics identify that a legacy Customer Segmentation product has had zero consumers for 90 days, triggering deprecation to free resources.
3. When the Orders domain processes 20k events per second and ingestion lag spikes to 30 minutes (versus an SLO of under 10 minutes), automated alerts notify the domain team to investigate pipeline bottlenecks.