Self Serve Platform: Standardized Infrastructure at Scale
The Platform Challenge:
When you decentralize data ownership to 50 domains, you face a risk: each domain reinventing storage patterns, security configurations, and monitoring. This creates chaos. The self serve platform solves this by abstracting common concerns into standardized, automated capabilities that every domain uses.
Core Platform Capabilities:
The platform provides an event backbone for streaming, typically Kafka or a similar system operating at massive scale, handling 100,000 to 500,000 events per second across all domains. It offers provisioning APIs where a domain declares "I need a partitioned table updated from this stream" and the platform spins up storage, compute, and access control automatically. Orchestration handles both batch and streaming workflows, scheduling transformations and ensuring dependencies run in the correct order.
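To make the provisioning API concrete, here is a minimal sketch of the kind of declarative request a domain might submit. The DataProductSpec class, its field names, the payments.failed topic, and the provision() entry point are hypothetical illustrations, not a real platform API.

```python
# Hypothetical declarative spec a domain team would submit to the platform.
# DataProductSpec, its fields, and provision() are illustrative names only.
from dataclasses import dataclass, field


@dataclass
class DataProductSpec:
    name: str                      # catalog name of the data product
    source_topic: str              # Kafka topic the platform ingests from
    refresh_interval_minutes: int  # how often the partitioned table is updated
    pii_fields: list[str] = field(default_factory=list)  # fields to tag as PII
    retention_days: int = 90       # retention policy applied to storage


spec = DataProductSpec(
    name="payment_failed_events",
    source_topic="payments.failed",   # assumed topic name for illustration
    refresh_interval_minutes=5,
    pii_fields=["customer_email"],
)

# The platform would turn this single declaration into storage, compute,
# ingestion jobs, and access control -- e.g. via a call like provision(spec).
```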
A unified metadata and catalog service is critical. Every data product registers here with schemas, lineage, SLOs, and documentation. When an analyst searches for payment data, they discover Payment Failed Events with full context: who owns it, what the schema is, what the freshness guarantee is, and who else uses it. Without this catalog at scale, you would have 200+ data products scattered across systems with no discoverability.
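As an illustration of the context a catalog entry carries, the record below is a sketch with assumed field names and values; it is not the schema of any particular catalog.

```python
# Sketch of a catalog record for one data product. All field names and
# values here are assumptions chosen to mirror the prose above.
payment_failed_events = {
    "name": "payment_failed_events",
    "owner": "payments-domain",
    "schema": {"payment_id": "string", "failure_code": "string", "failed_at": "timestamp"},
    "freshness_slo": "new events visible within 5 minutes",
    "lineage": ["payments.failed Kafka topic -> payment_failed_events table"],
    "consumers": ["fraud-analytics", "finance-reporting"],
    "docs": "link to runbook and field descriptions",
}


def matches(entry: dict, keyword: str) -> bool:
    """Crude keyword search an analyst's catalog query might perform."""
    return keyword.lower() in entry["name"].lower()


if matches(payment_failed_events, "payment"):
    # The analyst sees the owner, schema, freshness guarantee, and consumers.
    print(payment_failed_events["owner"], payment_failed_events["freshness_slo"])
```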
Governance Embedded in Tooling:
Federated governance defines global standards: naming conventions for fields like user_id versus customer_id, security policies for encryption and access, data classification levels (public, internal, confidential), and required quality checks. The key insight is that these rules are not enforced by manual review. They are embedded in the platform.
In practice, provisioning a new data product follows a workflow like this (a code sketch of the automated steps follows the list):
1. Domain declares requirements: "Partitioned table from Kafka topic orders.events, updated every 5 minutes, PII fields customer_email and shipping_address tagged."
2. Platform provisions infrastructure: creates a storage bucket, configures encryption at rest, sets up the streaming ingestion job, applies the retention policy (90 days), and configures role based access control.
3. Automated governance applied: PII fields are masked by default, the schema is registered in the catalog, a monitoring dashboard is created with ingestion lag and query latency graphs, and lineage is tracked from source topic to product.
4. Product goes live: available in the catalog within minutes. Consumers can discover it, request access, and start querying, with SLOs enforced automatically.
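The sketch below mirrors the four steps as one automated pipeline. Every helper here is a stub that only records the action it would perform, and the product name orders_analytics is an assumed placeholder; none of the names refer to a real platform API.

```python
# Stubbed sketch of the provisioning workflow above. Each step appends a
# description of the action the real platform automation would perform.
actions: list[str] = []


def _do(action: str) -> None:
    actions.append(action)


def provision_data_product(spec: dict) -> None:
    # Step 2: provision infrastructure from the declared spec.
    _do(f"create encrypted storage for {spec['name']}")
    _do(f"ingest from {spec['source_topic']} every {spec['refresh_minutes']} min")
    _do(f"apply {spec.get('retention_days', 90)}-day retention policy")
    _do("configure role based access control")
    # Step 3: governance applied automatically, not by manual review.
    for pii_field in spec.get("pii_fields", []):
        _do(f"mask PII field {pii_field} by default")
    _do("register schema in catalog and track lineage from source topic")
    _do("create monitoring dashboard (ingestion lag, query latency)")
    # Step 4: the product is now discoverable in the catalog within minutes.


provision_data_product({
    "name": "orders_analytics",
    "source_topic": "orders.events",
    "refresh_minutes": 5,
    "pii_fields": ["customer_email", "shipping_address"],
})
print("\n".join(actions))
```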
When a domain creates a new product exposing user identifiers, the platform automatically checks which fields are sensitive based on classification policies. It configures masking, restricts access to approved roles, and tags the product for compliance audits. If a domain tries to create a product with a retention period violating regulatory requirements (for example, keeping PII longer than allowed), the platform rejects it at provisioning time.
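To show what rejection at provisioning time could look like, the check below uses an assumed 30-day PII retention ceiling; the limit, the function names, and the error type are all illustrative, not actual regulatory values or a real API.

```python
# Illustrative governance check run before any infrastructure is created.
# MAX_PII_RETENTION_DAYS is an assumed placeholder, not a real regulation.
MAX_PII_RETENTION_DAYS = 30


class PolicyViolation(Exception):
    """Raised when a declared product violates a global governance rule."""


def validate_spec(spec: dict) -> None:
    retention = spec.get("retention_days", 90)
    if spec.get("pii_fields") and retention > MAX_PII_RETENTION_DAYS:
        raise PolicyViolation(
            f"{spec['name']}: PII retained for {retention} days, "
            f"limit is {MAX_PII_RETENTION_DAYS}"
        )


try:
    validate_spec({
        "name": "orders_analytics",
        "pii_fields": ["customer_email"],
        "retention_days": 90,   # exceeds the assumed PII ceiling
    })
except PolicyViolation as err:
    print(f"Provisioning rejected: {err}")   # rejected before anything is built
```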
This computational governance is what makes data mesh viable at scale. Manual governance through tickets and reviews does not scale to 200 data products and hundreds of engineers making changes daily.
⚠️ Common Pitfall: Building a "platform" that is just a wiki of instructions for manual setup. True self serve means declarative APIs and automation. If domain teams still file tickets and wait for the platform team to provision resources, you have not achieved self serve at all.
Integration Patterns:
For existing systems, the platform supports multiple integration patterns. Change Data Capture (CDC) from monolithic databases can feed domain products with analytical freshness under 5 minutes end to end. Event sourcing from operational microservices enables near real time products with sub second latency for use cases like fraud detection. Where only batch APIs are available, scheduled pulls might have hourly or daily refresh, which affects downstream SLAs and must be documented in the product contract.
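As a rough sketch of the CDC path, the snippet below applies change events to an in-memory analytical view. The event shape loosely follows the common op/before/after convention used by CDC tools such as Debezium, but it is simplified and assumed, not a specific connector's payload format.

```python
# Rough sketch of the CDC integration path: change events from the
# operational database are applied as upserts/deletes to the analytical
# product. Event fields and values here are simplified assumptions.
from typing import Any

customer_profile: dict[str, dict[str, Any]] = {}  # analytical view keyed by customer_id


def apply_change_event(event: dict[str, Any]) -> None:
    op = event["op"]               # "c" = create, "u" = update, "d" = delete
    if op in ("c", "u"):
        row = event["after"]
        customer_profile[row["customer_id"]] = row
    elif op == "d":
        customer_profile.pop(event["before"]["customer_id"], None)


# A few simulated change events arriving from the legacy database.
for evt in [
    {"op": "c", "after": {"customer_id": "42", "segment": "smb"}},
    {"op": "u", "after": {"customer_id": "42", "segment": "enterprise"}},
]:
    apply_change_event(evt)

print(customer_profile)  # end-to-end freshness depends on ingestion cadence
```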
💡 Key Takeaways
✓ The platform provides an event backbone (Kafka at 100k to 500k events per second), provisioning APIs, orchestration for batch and streaming, and a unified metadata catalog
✓ Domain teams declare requirements at a high level. The platform automates the infrastructure: storage, encryption, access control, schema registration, and monitoring, all provisioned in minutes
✓ Governance rules are embedded in tooling, not manual review. The platform automatically enforces encryption, masking of PII fields, retention policies, and compliance tags at provisioning time
✓ True self serve means declarative APIs and automation. If domains still file tickets and wait for manual provisioning, you have not achieved self serve
✓ Integration supports Change Data Capture (CDC) with under 5 minute freshness, event sourcing with sub second latency, and batch APIs with hourly or daily refresh depending on source system capabilities
📌 Examples
1. When the Orders domain declares a new product from Kafka topic orders.events, the platform provisions storage, sets up streaming ingestion every 5 minutes, applies encryption, masks PII fields customer_email and shipping_address, and creates monitoring dashboards, all within minutes
2. At scale with 200+ data products, computational governance automatically tags products with sensitive data for compliance audits. When auditors request impact analysis for a regulation change, the platform queries metadata to identify all affected products instantly (see the sketch after this list)
3. CDC from a legacy monolithic database feeds the Customer Profile data product with analytical freshness under 5 minutes, allowing the domain to expose clean analytical views while the operational system remains unchanged
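As a sketch of the impact analysis mentioned in example 2, the query below filters catalog metadata by compliance tag; the metadata fields, tags, and product names are assumptions for illustration, not a real catalog schema.

```python
# Hypothetical impact analysis over catalog metadata: list every product
# carrying a given compliance tag, as an auditor might request.
from dataclasses import dataclass


@dataclass
class ProductMetadata:
    name: str
    owner_team: str
    classification: str          # "public" | "internal" | "confidential"
    compliance_tags: list[str]   # e.g. ["pii"]


def impacted_products(catalog: list[ProductMetadata], tag: str) -> list[ProductMetadata]:
    """Return every registered product whose compliance tags include the given tag."""
    return [p for p in catalog if tag in p.compliance_tags]


catalog = [
    ProductMetadata("customer_profile", "customers-domain", "confidential", ["pii"]),
    ProductMetadata("orders_analytics", "orders-domain", "internal", ["pii"]),
    ProductMetadata("public_pricing", "pricing-domain", "public", []),
]

for product in impacted_products(catalog, "pii"):
    print(product.name, product.owner_team)   # every product the audit must cover
```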