How Do Covering Indexes and Projections Optimize Secondary Index Performance?
A covering index, also called a projected index, stores a subset of base record attributes directly in the index entry, alongside the indexed term and the primary key. Instead of returning only primary keys that require a second fetch from the base table, a covering index returns every attribute needed to satisfy the query in a single lookup. For example, an index on country that projects user_id, country, and email can answer queries like "find the email addresses of users where country = UK" without touching the base table partitions at all. This eliminates an entire round trip and the cross-partition fetch, which cuts latency roughly in half when the index lookup and the base fetch cost about the same, and it reduces load on base storage.
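The difference between a covered and an uncovered query can be sketched with a small in-memory model. This is an illustration only: the sample records, the `build_index` and `query` helpers, and the use of plain dictionaries in place of partitioned storage are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical base table keyed by primary key (user_id).
# In a real system these records would be spread across partitions.
base_table = {
    1: {"user_id": 1, "country": "UK", "email": "a@example.com", "name": "Ann"},
    2: {"user_id": 2, "country": "US", "email": "b@example.com", "name": "Bob"},
    3: {"user_id": 3, "country": "UK", "email": "c@example.com", "name": "Cy"},
}

# Attributes copied into each index entry (the projection).
PROJECTED = ("user_id", "country", "email")


def build_index(table, key_attr, projected):
    """Map each value of key_attr to a list of projected index entries."""
    index = defaultdict(list)
    for record in table.values():
        index[record[key_attr]].append({a: record[a] for a in projected})
    return index


def query(index, table, term, wanted):
    """Return (rows, base_fetches) for the attributes in `wanted`."""
    rows, base_fetches = [], 0
    for entry in index.get(term, []):
        if all(a in entry for a in wanted):
            # Covered: answer straight from the index entry.
            rows.append({a: entry[a] for a in wanted})
        else:
            # Not covered: second lookup against the base table.
            base_fetches += 1
            record = table[entry["user_id"]]
            rows.append({a: record[a] for a in wanted})
    return rows, base_fetches


country_index = build_index(base_table, "country", PROJECTED)
covered_rows, covered_fetches = query(country_index, base_table, "UK", ["email"])
uncovered_rows, uncovered_fetches = query(country_index, base_table, "UK", ["name"])
```

In this model the email query is served with zero base fetches, while the name query needs one base fetch per matching entry because name is not in the projection.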
The tradeoff is increased storage overhead and write cost. Each projected attribute must be stored redundantly in the index. If the base record is 1 kilobyte and you project 300 bytes of attributes into an index, the index adds about 30 percent storage overhead. With multiple indexes, the overhead adds up: 3 indexes each projecting a different 300-byte subset add about 90 percent storage on top of the base data. Write amplification also increases, because every update to a projected attribute must propagate to the index. DynamoDB charges for both storage and write capacity for projected attributes in Global Secondary Indexes (GSIs), so maintaining 5 GSIs with projections can multiply storage costs by 2 to 3 times and write costs by 4 to 6 times compared to primary key access alone.
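The storage arithmetic above can be checked with a few lines. This is a rough sketch: the `index_storage_overhead` helper is hypothetical, 1 kilobyte is taken as 1000 bytes, and index keys and per-entry bookkeeping are ignored.

```python
def index_storage_overhead(base_bytes, projected_bytes_per_index):
    """Extra storage from projected attributes, as a fraction of base record size."""
    return sum(projected_bytes_per_index) / base_bytes


# One index projecting 300 bytes from a 1000-byte record.
one_index = index_storage_overhead(1000, [300])

# Three indexes, each projecting a different 300-byte subset.
three_indexes = index_storage_overhead(1000, [300, 300, 300])

print(f"1 index:   {one_index:.0%} overhead")
print(f"3 indexes: {three_indexes:.0%} overhead")
```

The same function shows how quickly the overhead grows when base records are small or when projections are wide.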
Covering indexes are most valuable for read-heavy workloads where queries consistently access the same small attribute set. For example, an API that returns user profiles with user_id, name, and avatar_url from an index keyed by email can serve a typical request with a single 10 millisecond index lookup, instead of a 10 millisecond index lookup plus a 10 millisecond base fetch plus coordinator overhead. At the tail, this can reduce p99 latency from 30 to 40 milliseconds down to 15 to 20 milliseconds. However, if queries vary widely in which attributes they need, projections become inefficient: either you project many attributes (high storage cost) or queries still require base fetches (no latency win). Amazon teams commonly project 3 to 5 high-value attributes per index, balancing coverage of the top query patterns against storage and write costs.
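One way to judge whether a projection is worth its cost is to measure how many observed queries it would cover. The sketch below assumes a hypothetical query log in which each entry is the set of attributes a query reads; the `coverage` helper and the sample data are illustrative, not taken from any real system.

```python
def coverage(projection, query_attr_sets):
    """Fraction of queries whose attributes are all present in the projection."""
    projected = set(projection)
    covered = sum(1 for attrs in query_attr_sets if set(attrs) <= projected)
    return covered / len(query_attr_sets)


# Hypothetical log: 8 profile-card queries and 2 queries that need extra fields.
query_log = (
    [{"user_id", "name", "avatar_url"}] * 8
    + [{"user_id", "name", "bio"}, {"user_id", "address", "phone"}]
)

narrow = coverage({"user_id", "name", "avatar_url"}, query_log)
wider = coverage({"user_id", "name", "avatar_url", "bio"}, query_log)

print(f"narrow projection covers {narrow:.0%} of queries")
print(f"adding bio covers {wider:.0%} of queries")
```

Here adding bio to the projection raises coverage only from 80 to 90 percent, which has to be weighed against storing and updating bio in the index for every record.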
💡 Key Takeaways
•Latency reduction is substantial for covered queries because one fetch round trip is eliminated. A single 10 millisecond index lookup replaces a 10 millisecond index lookup plus a 10 millisecond base fetch plus network overhead, often cutting p99 latency from 30 to 40 milliseconds down to 15 to 20 milliseconds.
•Storage overhead scales with the number of indexes and projected attribute size. Projecting 300 bytes per index across 4 indexes adds roughly 1.2 kilobytes per base record, potentially doubling total storage when base records are small.
•Write amplification increases because updates to projected attributes must propagate to every index projecting that attribute. Changing an email field that appears in 3 GSI projections generates 1 base write plus 3 index writes, quadrupling write traffic for that attribute.
•Query flexibility is limited. If a query needs an attribute not projected, it must fall back to fetching from the base table, losing the covering benefit. Overprojecting to cover all possible queries wastes storage and write capacity on rarely used attributes.
•DynamoDB GSI projections support three modes: KEYS_ONLY (just the index key and the table's primary key, minimal storage), INCLUDE (an explicit list of additional projected attributes), and ALL (every attribute is projected, with maximum storage and write cost).
•Best suited for workloads where 70 to 90 percent of queries access a small, stable attribute set. Analytics or administrative queries that need many attributes should fetch from base tables or use a separate analytics store.
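The write amplification described in the takeaways can be counted directly. This is a simplified sketch: the index names and projections are hypothetical, and it counts one write per affected index, whereas a real store may charge more (for example when an index key itself changes).

```python
def index_writes(changed_attrs, index_projections):
    """Return (total_writes, touched_indexes) for an update to changed_attrs.

    Counts 1 write for the base record plus 1 write for every index whose
    projection contains at least one changed attribute.
    """
    changed = set(changed_attrs)
    touched = sorted(
        name for name, attrs in index_projections.items() if changed & set(attrs)
    )
    return 1 + len(touched), touched


# Hypothetical indexes and the attributes each one projects.
projections = {
    "by_country": {"user_id", "country", "email"},
    "by_signup_date": {"user_id", "signup_date", "email"},
    "by_plan": {"user_id", "plan", "email"},
    "by_city": {"user_id", "city", "name"},
}

email_writes, email_indexes = index_writes({"email"}, projections)
name_writes, name_indexes = index_writes({"name"}, projections)
bio_writes, bio_indexes = index_writes({"bio"}, projections)
```

With these projections, an email change costs 4 writes (1 base plus 3 indexes), a name change costs 2, and a bio change costs only the base write because no index projects bio.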
📌 Examples
DynamoDB GSI example: An index on order_status projects order_id, status, created_at, and total_amount. Queries for recent orders by status fetch 200 records in 25 milliseconds from the GSI alone. Without projections, the same query needs the 25 millisecond index lookup plus 200 parallel base fetches at 5 to 10 milliseconds each, for a total of 50 to 80 milliseconds once the slowest fetches and fan-out overhead are included.
E-commerce example: A product search index keyed by category projects product_id, name, price, and thumbnail_url. Search results display these 4 attributes for 50 products per page, served entirely from the index in 15 milliseconds. Clicking a product fetches the full record, with description, reviews, and inventory, from the base table in another 10 milliseconds.
User profile service indexes by email and projects user_id, email, name, and avatar_url. The authentication flow queries by email and returns these 4 fields in 8 milliseconds at p50. Projecting the full profile (2 kilobytes) would increase GSI storage by 5 times and write cost by 3 times with no latency benefit for the auth flow.
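For the DynamoDB order example above, an INCLUDE projection might be declared roughly as follows. The dictionary follows the shape of the `GlobalSecondaryIndexes` entries accepted by DynamoDB's `CreateTable` API, but the `gsi_definition` helper, the index name, and the choice of status as partition key with created_at as sort key are assumptions for illustration. DynamoDB always projects the table's primary key and the index keys, so only total_amount is listed as a non-key attribute. Provisioned-capacity tables would also need a `ProvisionedThroughput` entry per index, which is omitted here.

```python
VALID_PROJECTION_TYPES = {"KEYS_ONLY", "INCLUDE", "ALL"}


def gsi_definition(index_name, partition_key, sort_key=None,
                   projection_type="KEYS_ONLY", non_key_attributes=()):
    """Build one GlobalSecondaryIndexes entry (hypothetical helper)."""
    if projection_type not in VALID_PROJECTION_TYPES:
        raise ValueError(f"unknown projection type: {projection_type}")
    # This sketch requires an attribute list for INCLUDE and rejects it otherwise.
    if (projection_type == "INCLUDE") != bool(non_key_attributes):
        raise ValueError("non_key_attributes must be given only with INCLUDE")

    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})

    projection = {"ProjectionType": projection_type}
    if non_key_attributes:
        projection["NonKeyAttributes"] = list(non_key_attributes)

    return {"IndexName": index_name, "KeySchema": key_schema, "Projection": projection}


# Assumed layout: query by status, ordered by creation time.
orders_by_status = gsi_definition(
    "orders_by_status",
    partition_key="status",
    sort_key="created_at",
    projection_type="INCLUDE",
    non_key_attributes=["total_amount"],
)
```

The resulting dictionary could be placed in the `GlobalSecondaryIndexes` list of a `create_table` call; switching `projection_type` to "ALL" or "KEYS_ONLY" moves the index to the other two projection modes.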