When to Use Federation vs Alternatives
The Core Decision Framework: Federation trades performance predictability and operational isolation for freshness and integration flexibility. Understanding when this trade off makes sense is critical for system design interviews.
Choose Federation When: You need real time or near real time access to data across multiple systems. Examples include Customer 360 dashboards showing live support tickets plus recent orders, regulatory reports joining live compliance data with historical records, or exploratory analysis where building a full ETL pipeline for a one time investigation is too expensive.
Federation works best for low to moderate query volume (under 20 queries per second), queries touching 2 to 4 sources, and workloads where most queries are selective (filtering to small result sets). If your sources respond in 200 to 500 milliseconds and you can tolerate p95 latencies of 3 to 5 seconds, federation is viable.
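The rules of thumb above can be written down as a quick screening check. This is only a sketch: the function name, parameters, and the choice to require the upper end of the 3 to 5 second p95 range are assumptions made for illustration, not part of any federation engine.

```python
# Hypothetical screening helper that encodes the thresholds quoted above.
MAX_FEDERATION_QPS = 20        # "under 20 queries per second"
MIN_SOURCES, MAX_SOURCES = 2, 4  # "queries touching 2 to 4 sources"
REQUIRED_P95_BUDGET_S = 5.0    # assumption: tolerate the upper end of 3 to 5 s


def federation_is_viable(qps: float, sources_per_query: int,
                         mostly_selective: bool, p95_budget_s: float) -> bool:
    """Return True if the workload fits the federation rules of thumb."""
    return (
        qps < MAX_FEDERATION_QPS
        and MIN_SOURCES <= sources_per_query <= MAX_SOURCES
        and mostly_selective
        and p95_budget_s >= REQUIRED_P95_BUDGET_S
    )


# A Customer 360 style dashboard: low volume, 3 sources, filtered by customer.
dashboard_ok = federation_is_viable(qps=5, sources_per_query=3,
                                    mostly_selective=True, p95_budget_s=5.0)

# A high traffic BI dashboard fails the volume check.
bi_ok = federation_is_viable(qps=60, sources_per_query=3,
                             mostly_selective=True, p95_budget_s=5.0)
```

A check like this is a starting point for discussion, not a verdict: a workload that fails it may still suit federation if the sources are fast or results can be cached.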
Choose ETL Plus Warehouse When: You have high query volume (over 50 queries per second), complex aggregations over large datasets (scanning 10 to 100 TB), or need consistent performance guarantees. Daily or hourly batch analytics, machine learning feature generation, and high traffic BI dashboards all favor warehouses.
The math is concrete. A 50 TB scan for daily aggregates in a warehouse with columnar storage and clustering takes 2 to 5 minutes and costs $10 to $20. The same scan via federation, pulling from 6 to 8 sources, takes 30 to 60 minutes, costs $40 to $80 in compute and egress, and risks a timeout if any source is slow. When you run this daily, the warehouse is roughly 10x faster and about 4x cheaper.
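The ratios can be checked directly from the ranges quoted above. The sketch below takes the midpoint of each range, which is an assumption made for illustration; with those midpoints the speed gap is a little over 10x while the cost gap is closer to 4x.

```python
# Ranges quoted in the text for one 50 TB daily aggregate scan.
warehouse = {"minutes": (2, 5), "dollars": (10, 20)}
federation = {"minutes": (30, 60), "dollars": (40, 80)}


def midpoint(value_range):
    """Midpoint of a (low, high) range; an illustrative simplification."""
    low, high = value_range
    return (low + high) / 2


speedup = midpoint(federation["minutes"]) / midpoint(warehouse["minutes"])
cost_ratio = midpoint(federation["dollars"]) / midpoint(warehouse["dollars"])

# Extra spend per year if the job runs once a day via federation.
yearly_extra_cost = (midpoint(federation["dollars"])
                     - midpoint(warehouse["dollars"])) * 365

print(f"warehouse is about {speedup:.1f}x faster")        # about 12.9x
print(f"warehouse is about {cost_ratio:.1f}x cheaper")    # about 4.0x
print(f"extra federation cost per year: ${yearly_extra_cost:,.0f}")  # $16,425
```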
Alternative Patterns: Point to point APIs between services avoid federation complexity but require each team to build and maintain integrations. This works for simple cases (one service calling another) but does not scale to ad hoc analytics across 10 to 20 systems.
Application level composition, where your code queries multiple services and joins results, gives you full control but pushes complexity to every application. This is appropriate for product features with specific data needs, not for general purpose analytics.
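A minimal sketch of application level composition, assuming two hypothetical services (orders and support tickets). The fetch functions below are stand-ins that return canned data; in a real application they would be HTTP or database calls, and every application needing this view would have to repeat the same fan-out and join logic.

```python
import asyncio


async def fetch_orders(customer_id: str) -> list[dict]:
    """Stand-in for a call to a hypothetical orders service."""
    await asyncio.sleep(0)  # placeholder for network I/O
    return [{"order_id": 1, "customer_id": customer_id, "total": 42.0}]


async def fetch_tickets(customer_id: str) -> list[dict]:
    """Stand-in for a call to a hypothetical support ticket service."""
    await asyncio.sleep(0)  # placeholder for network I/O
    return [
        {"ticket_id": 7, "customer_id": customer_id, "status": "open"},
        {"ticket_id": 8, "customer_id": customer_id, "status": "closed"},
    ]


async def customer_view(customer_id: str) -> dict:
    """Query both services concurrently, then join the results in code."""
    orders, tickets = await asyncio.gather(
        fetch_orders(customer_id),
        fetch_tickets(customer_id),
    )
    return {
        "customer_id": customer_id,
        "orders": orders,
        "open_tickets": [t for t in tickets if t["status"] == "open"],
    }


view = asyncio.run(customer_view("c-1"))
```

The join, the filtering, and any error handling for a slow or failed service all live in the application, which is the control and the cost this pattern brings.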
Change Data Capture (CDC) plus stream processing creates near real time replicas in a central store, combining warehouse performance with low latency (2 to 10 second lag). This is the best of both worlds but requires infrastructure investment. Use this for high value, high traffic use cases where both freshness and performance matter.
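The core of the CDC approach is a consumer that applies change events to a replica in order. The sketch below uses a hypothetical event format (`op`, `key`, `row`) and an in-memory dict as the replica; real CDC tools define their own event envelopes, and a production replica would be a table in the central store.

```python
def apply_change(replica: dict, event: dict) -> None:
    """Apply one change event (hypothetical format) to an in-memory replica."""
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        replica[key] = event["row"]   # upsert the latest row image
    elif op == "delete":
        replica.pop(key, None)        # tolerate deletes for unknown keys
    else:
        raise ValueError(f"unknown operation: {op!r}")


# Events must be applied in source commit order to converge on the source state.
events = [
    {"op": "insert", "key": 1, "row": {"status": "pending"}},
    {"op": "update", "key": 1, "row": {"status": "shipped"}},
    {"op": "insert", "key": 2, "row": {"status": "pending"}},
    {"op": "delete", "key": 2},
]

replica: dict = {}
for event in events:
    apply_change(replica, event)
```

Queries then read the replica instead of the source system, so query latency is decoupled from the source and the only freshness cost is the replication lag.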
Organizational Considerations: In data mesh architectures, domains own their stores. Federation provides a cross domain access fabric without forcing data centralization, but you must honor each domain's Service Level Agreements (SLAs). If a domain cannot handle 500 analytical queries per second on its Online Transaction Processing (OLTP) database, materialize hot aggregates or shift to CDC replication.
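One way to picture "materialize hot aggregates" is a cached result that is recomputed at most once per time window, so repeated analytical reads do not reach the domain's OLTP database. The class below is a hypothetical sketch, not a feature of any particular federation engine; the injectable clock exists only to make the behavior easy to demonstrate.

```python
import time


class MaterializedAggregate:
    """Cache an expensive aggregate and refresh it at most once per TTL."""

    def __init__(self, compute, ttl_seconds: float, clock=time.monotonic):
        self._compute = compute      # callable that queries the source system
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None      # None means "never computed"

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._compute()
            self._expires_at = now + self._ttl
        return self._value


source_hits = []


def expensive_query():
    """Stand-in for an aggregate query against a domain's OLTP database."""
    source_hits.append(1)
    return {"open_orders": 128}


fake_now = [0.0]  # controllable clock for the demonstration
aggregate = MaterializedAggregate(expensive_query, ttl_seconds=60,
                                  clock=lambda: fake_now[0])

aggregate.get()
aggregate.get()      # served from the cached value, no source hit
fake_now[0] = 61.0
aggregate.get()      # TTL expired, so the source is queried again
```

The TTL is the staleness you accept in exchange for protecting the source; if that staleness is too high, CDC replication is the alternative the text describes.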
Data Federation: real time freshness, flexible integration, runtime dependencies on N systems
vs
ETL + Warehouse: predictable performance, operational isolation, 15 minute to 24 hour staleness
"The decision is not federation OR warehouse. It is what percentage of your workload is high frequency heavy computation (warehouse) versus long tail exploratory cross system queries (federation)."
💡 Key Takeaways
✓ Federation for low volume (under 20 QPS) cross system queries needing real time data; warehouse for high volume (over 50 QPS) or heavy aggregations (10 to 100 TB scans)
✓ Warehouse aggregates are roughly 10x faster and 3 to 4x cheaper: 50 TB scan in 3 minutes at $15 vs 40 minutes at $45 via federation
✓ CDC plus stream processing combines warehouse performance with low latency (2 to 10 seconds), best for high value use cases needing both
✓ Hybrid architecture is the production pattern: warehouse for 85 to 95 percent of volume, federation for 5 to 15 percent long tail exploratory queries
✓ Choose based on query volume and freshness needs: real time cross system reporting favors federation, daily batch analytics favors warehouse
📌 Examples
1. Customer 360 dashboard showing live support tickets (Zendesk) plus recent orders (MySQL) plus account status (Salesforce): federation handles the real time need across 3 sources
2. Daily revenue aggregates scanning 80 TB of transactions: warehouse completes in 4 minutes at $18, federation would take 50 minutes at $60 and risk timeouts
3. Real time fraud detection needing 50 QPS with query latency under 100 ms: CDC replicates to the warehouse so reads are served locally, avoiding federation's 500 ms to 2 second latency