Search & Ranking SystemsQuery Parsing & OptimizationMedium⏱️ ~2 min

Predicate Pushdown and Projection Pruning: Minimizing Data Movement

Predicate Pushdown

Predicate pushdown moves filter conditions closer to the data source. Consider: SELECT o.id, c.name FROM orders o JOIN customers c ON o.customer_id = c.id WHERE c.created_at > '2024-01-01'. Without pushdown, we read all 10 million orders, join to 1 million customers, then filter. With pushdown, we filter customers first (1M to 200K rows), then join. The optimizer rewrites this as: SELECT o.id, c.name FROM orders o JOIN (SELECT * FROM customers WHERE created_at > '2024-01-01') c ON o.customer_id = c.id. Result: 80% fewer rows processed in the join.

Projection Pruning

Projection pruning eliminates unreferenced columns. For SELECT customer_id, total FROM orders on a table with 50 columns averaging 100 bytes each (5KB per row), we only need 2 columns (~20 bytes). In row storage, all columns still read from disk but fewer bytes transfer over network. In columnar storage like Parquet, only the customer_id and total column files are read: 96% less I/O. A 1 terabyte table with 50 columns queried for 2 columns reads only 40 gigabytes from disk.

Pushdown Across Systems

Federated queries push predicates to remote sources. For SELECT * FROM remote_db.orders WHERE created_at > '2024-01-01' AND status = 'pending', pushing both predicates to the remote database filters there, transferring only matching rows. Without pushdown: transfer 10 million rows, then filter locally. With pushdown: remote returns 50,000 matching rows. Network transfer drops from 5 gigabytes to 25 megabytes. Object storage queries similarly benefit: scanning 1TB of Parquet with WHERE region = 'US' and projection of 3 columns might read only 10 gigabytes when pushdown and partition pruning combine.

Partition Pruning

Partition pruning eliminates entire data partitions. For a table partitioned by date: SELECT * FROM events WHERE event_date = '2024-06-15' on 365 daily partitions containing 1TB total reads only that day's partition (2.7 gigabytes). But SELECT * FROM events WHERE event_timestamp BETWEEN '2024-06-15 23:00' AND '2024-06-16 01:00' spans two partitions because the predicate crosses midnight. Partition keys should align with common query patterns.

💡 Key Takeaways
Predicate pushdown rewrites SELECT ... FROM orders JOIN customers WHERE c.created_at > X to filter customers first (1M to 200K rows), reducing join input by 80%
Projection pruning on SELECT customer_id, total FROM orders (2 of 50 columns) reduces columnar storage reads by 96%; 1TB table reads only 40GB
Cross system pushdown critical: remote query with WHERE created_at > X AND status = 'pending' transfers 25MB instead of 5GB by filtering at source
Partition pruning on WHERE event_date = '2024-06-15' reads 2.7GB of 1TB (one partition of 365); predicates must align with partition boundaries
Predicates spanning partition boundaries like BETWEEN '23:00' AND '01:00' crossing midnight require scanning multiple partitions
📌 Interview Tips
1Show the rewrite: original SELECT o.id FROM orders o JOIN customers c ON ... WHERE c.region = 'US' becomes JOIN (SELECT * FROM customers WHERE region = 'US') c, filtering 1M to 200K before join.
2Explain columnar benefit: SELECT id, amount FROM transactions (2 of 50 columns) on Parquet reads only those column files. 1TB table becomes 40GB actual I/O.
3Discuss partition alignment: daily partitions work for WHERE date = X, but hourly queries crossing midnight (23:00 to 01:00) scan two partitions. Consider hourly partitioning for such workloads.
← Back to Query Parsing & Optimization Overview
Predicate Pushdown and Projection Pruning: Minimizing Data Movement | Query Parsing & Optimization - System Overflow