Multi Layer Trade-Offs and Failure Modes

The Storage and Complexity Cost: The medallion architecture stores data multiple times across layers. If you ingest 100 TB daily and maintain Bronze, Silver, and Gold, each day of ingestion can turn into 200 to 400 TB of stored data (Bronze + Silver + multiple Gold variants). At cloud storage rates around $20 per TB per month, that is $4,000 to $8,000 per month for every day of data you retain; keeping a full year of data at that footprint brings the bill to roughly $1.5 to $3 million per month. You are explicitly trading cost for better lineage, governance, and debugging capability.

Compare this to a simpler two layer model (raw plus curated) or directly loading into a data warehouse. These alternatives use less storage and have fewer pipeline hops, but sacrifice forensic capability. When a KPI looks wrong, you cannot easily trace back through transformation layers to find where corruption was introduced. Without immutable Bronze, you cannot replay history with corrected logic.

Organizational Scaling: The medallion model enables organizational scaling by establishing clear ownership boundaries. Central platform teams own Bronze and core Silver tables. Domain teams own their Platinum (an optional fourth layer) or functional mart layers. A small governance council owns critical Gold KPIs. This separation prevents tight coupling between raw sources and business metrics, a common failure mode in monolithic warehouses where a single schema change can break dozens of reports.

However, this requires process maturity. You need change management policies for Silver schema evolution, SLA definitions for each layer, and monitoring to catch freshness or quality issues early. Smaller organizations (under 50 engineers) often find this overhead excessive and prefer simpler architectures. Medallion shines at scale: hundreds of data sources, hundreds of pipeline developers, thousands of data consumers.
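Returning to the storage cost estimate earlier in this section: the result depends heavily on how long data is retained, which the text does not specify. The following is a minimal sketch that makes the units explicit, assuming a flat $20 per TB per month and a hypothetical `retention_days` parameter; real cloud pricing is tiered and compression will change the numbers.

```python
def monthly_storage_cost(daily_ingest_tb, copies, retention_days,
                         usd_per_tb_month=20.0):
    """Steady-state monthly storage bill in USD.

    copies: total stored volume as a multiple of raw ingest
            (2 to 4 for Bronze + Silver + Gold variants in the text).
    retention_days: days of data kept across all layers
            (an assumption; the text gives no retention policy).
    """
    stored_tb = daily_ingest_tb * copies * retention_days
    return stored_tb * usd_per_tb_month


# Monthly cost of holding a single day of ingested data.
one_day_low = monthly_storage_cost(100, 2, retention_days=1)
one_day_high = monthly_storage_cost(100, 4, retention_days=1)

# Monthly cost once a full year of data is retained.
one_year_low = monthly_storage_cost(100, 2, retention_days=365)
one_year_high = monthly_storage_cost(100, 4, retention_days=365)

print(f"1 day retained:    ${one_day_low:,.0f} to ${one_day_high:,.0f} per month")
print(f"365 days retained: ${one_year_low:,.0f} to ${one_year_high:,.0f} per month")
```

Shortening retention on Silver and Gold, which can be rebuilt from Bronze, is one lever this sketch exposes: lowering `copies` or `retention_days` scales the bill linearly.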

| Architecture | Best For | Avoid When |
|---|---|---|
| Medallion (3-4 layer) | 100+ sources, complex governance | Small teams, simple pipelines |
| Two layer (raw + curated) | Startups, fast iteration | Regulatory requirements |
| Direct warehouse load | Structured sources only | Semi structured, ML workloads |

Critical Failure Modes: First, treating Bronze as an afterthought. If ingestion code drops malformed records silently or does not version schemas, you lose the forensic value. When a subtle schema change in a source system corrupts downstream metrics six months later, you need Bronze to reconstruct what actually happened. Without it, debugging becomes guesswork.

Second, Silver becoming a dumping ground for team specific logic. If Marketing adds a filter to the Silver customer table for "only customers in our target segments," Finance cannot use that table. This creates duplicate Silver tables with subtly different logic, defeating the purpose of a shared, conformed layer. Business specific filters belong in Platinum or Gold, not Silver.

Third, KPI sprawl in Gold. If every team publishes its own version of "monthly active users" with different activity thresholds and time windows, executives see conflicting numbers. This is why many organizations restrict Gold to a small, centrally governed set of KPIs and push team specific metrics to Platinum.

Scale Breaking Points: At 10x scale, several issues emerge. First, if Gold is built from many small fragmented tables rather than a few well modeled ones, query engines struggle to maintain sub second latency.

Second, naive transformation chains can cause expensive recomputation cascades: a small upstream change forces petabytes of downstream recomputation, breaking SLAs. This requires incremental processing, partition pruning, and selective backfills.

Third, near real time requirements challenge the model. If business needs require Silver to Gold latency under 1 minute, you may need separate low latency paths for specific critical metrics, bypassing normal processing. Alternatively, allow expert users direct access to Bronze or Silver for time sensitive analysis, with clear disclaimers about quality and stability.
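To illustrate the selective backfill idea behind avoiding recomputation cascades, here is a minimal sketch that walks partition level lineage and returns only the downstream partitions affected by a change. The `LINEAGE` mapping, the table names, and the `partitions_to_backfill` helper are all hypothetical; a real platform would read lineage from its orchestrator or catalog rather than a hard-coded dictionary.

```python
from collections import deque

# Hypothetical lineage: each (table, partition) maps to the downstream
# partitions derived from it.
LINEAGE = {
    ("bronze.orders", "2024-01-01"): [("silver.orders", "2024-01-01")],
    ("bronze.orders", "2024-01-02"): [("silver.orders", "2024-01-02")],
    ("silver.orders", "2024-01-01"): [("gold.daily_revenue", "2024-01-01")],
    ("silver.orders", "2024-01-02"): [("gold.daily_revenue", "2024-01-02")],
}


def partitions_to_backfill(changed, lineage):
    """Return downstream partitions affected by `changed`, in breadth-first order."""
    seen = set(changed)
    queue = deque(changed)
    affected = []
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected


# Only one Bronze partition was corrected, so only its descendants are rebuilt.
affected = partitions_to_backfill([("bronze.orders", "2024-01-02")], LINEAGE)
for table, partition in affected:
    print(f"rebuild {table} partition {partition}")
```

Tracking lineage per partition rather than per table is the design choice that keeps the blast radius small: the 2024-01-01 partitions are never touched.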
"The medallion model is not about technology. It is a conceptual pattern for managing complexity at scale through separation of concerns, clear contracts, and explicit ownership boundaries."
💡 Key Takeaways
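Medallion stores data 2 to 4 times across layers; at 100 TB daily ingestion and about $20 per TB per month, every retained day of data adds $4,000 to $8,000 to the monthly storage bill, which reaches roughly $1.5 to $3 million per month with a year of retention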
The architecture enables organizational scaling with clear ownership: platform teams own Bronze and core Silver, domain teams own Platinum or functional mart layers, and a governance council owns critical Gold KPIs
Critical failure mode: Silver becoming a dumping ground for team specific logic creates duplicate tables with conflicting definitions and breaks the shared conformed layer concept
At 10x scale, naive transformation chains cause expensive recomputation cascades; maintaining SLAs requires incremental processing, partition pruning, and selective backfills
Choose medallion for 100+ sources with complex governance needs; prefer simpler two layer architectures for small teams or startups prioritizing speed over lineage
📌 Examples
1. When a source API changes a field from <code>order_timestamp</code> to <code>order_date</code> without warning, Bronze captures both versions over time. Six months later, when finance reports show anomalies, engineers query Bronze to identify exactly when the change occurred and which records were affected, enabling targeted Silver reprocessing.
2. A company creates 15 different Gold tables for "monthly active users" with varying definitions: some count any activity, others require purchases, some use 28 day windows, others calendar months. When the CEO asks "what are our MAUs," teams cannot agree. After governance intervention, they consolidate to one centrally defined MAU metric in Gold, moving team specific variants to Platinum.
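
The first example above, locating a field rename in Bronze, can be sketched as follows. This is an illustration under assumptions rather than a reference implementation: the record layout (an `ingested_at` timestamp plus the untouched source `payload`), the sample rows, and the `find_field_rename` helper are all hypothetical, and a real Bronze layer would be queried with SQL or a dataframe engine rather than scanned as a Python list.

```python
# Hypothetical Bronze rows: ingestion metadata plus the raw payload as received.
bronze_records = [
    {"ingested_at": "2024-03-01T00:00:00Z",
     "payload": {"order_id": 1, "order_timestamp": "2024-02-29T23:58:00Z"}},
    {"ingested_at": "2024-03-02T00:00:00Z",
     "payload": {"order_id": 2, "order_timestamp": "2024-03-01T12:30:00Z"}},
    {"ingested_at": "2024-03-03T00:00:00Z",
     "payload": {"order_id": 3, "order_date": "2024-03-02"}},
    {"ingested_at": "2024-03-04T00:00:00Z",
     "payload": {"order_id": 4, "order_date": "2024-03-03"}},
]


def find_field_rename(records, old_field, new_field):
    """Return (first ingestion time showing the rename, affected records).

    A record counts as affected when it carries `new_field` but not `old_field`.
    Timestamps are compared as strings, which is only valid because they all
    share one fixed-width ISO 8601 UTC format.
    """
    affected = [
        r for r in records
        if new_field in r["payload"] and old_field not in r["payload"]
    ]
    if not affected:
        return None, []
    first_seen = min(r["ingested_at"] for r in affected)
    return first_seen, affected


first_seen, affected = find_field_rename(
    bronze_records, "order_timestamp", "order_date"
)
affected_ids = [r["payload"]["order_id"] for r in affected]
print(f"rename first seen at {first_seen}; affected order_ids: {affected_ids}")
```

This only works if Bronze keeps the payload exactly as received; an ingestion step that silently normalized or dropped the unexpected field would erase the evidence the query relies on.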