
Index Design for Predictable OLTP Performance

Index design determines whether a query executes in 5ms with a bounded index scan or in 2 seconds with a full collection scan. Document databases serve efficient queries through indexes, and a missing or poorly structured index forces the engine to read many documents and sort them in memory.

Compound index field order is critical. MongoDB's guideline is Equality, Sort, Range (ESR): equality filters first, then sort fields, then range filters. An index on (country, productId, createdAt) supports queries that filter by country and product and sort by date. Reversing the order to (createdAt, country, productId) breaks this efficiency, because the entries for a given country and product are no longer contiguous in the index.

A production example from a Dev.to analysis shows the impact: a query for the last 10 orders filtered by country and product, sorted by createdAt descending. In MongoDB with a compound index on (country, productId, createdAt), the engine scans in index order and returns 10 documents without touching the rest of the collection, achieving sub-10ms latency largely independent of collection size. In Firestore's MongoDB compatibility mode, the same query with a multi-key index could not use index order for sorting and had to read and sort many documents, taking approximately 2 seconds even on small collections.

Array fields amplify complexity. Indexing an array as multi-key creates one index entry per array element. A document with 1,000 tags generates 1,000 index entries, multiplying write cost and storage. Worse, MongoDB does not allow a single compound index to cover more than one array field of the same document, and queries that combine array membership with a sort may not be able to use index ordering. Keep indexed arrays bounded (for example, under 100 elements), or move the items to a separate child collection with one document per item.

Pagination adds another layer. Offset-based pagination with skip(N) is O(N): to skip 10,000 documents, the database must walk through 10,000 index entries, so latency grows linearly with page depth. Cursor-based pagination with the last seen sort key (and a stable tiebreaker such as the document ID) resumes from a specific index position, keeping scans bounded. Large feed products such as those at Google and Meta reportedly use cursor-based pagination over millions of items to keep p99 latency consistently under 50ms.
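The gap between the two pagination styles can be sketched with an in-memory stand-in for an index. This is a simplified model, not how any particular engine is implemented: a sorted array plays the role of the B-tree, a binary search plays the role of an index seek, and the names buildIndex, pageBySkip and pageByCursor are hypothetical. The createdAt values are unique here, so no tiebreaker is needed.

```javascript
// Simulated index: entries sorted by createdAt descending.
// An in-memory stand-in for a B-tree index, not a real database.
function buildIndex(n) {
  const entries = [];
  for (let i = 0; i < n; i++) {
    entries.push({ createdAt: n - i, docId: `doc${i}` });
  }
  return entries; // already in index order (createdAt descending)
}

// Offset pagination: walk and discard `skip` entries, then take `limit`.
function pageBySkip(index, skip, limit) {
  let examined = 0;
  const page = [];
  for (const entry of index) {
    examined++;
    if (examined <= skip) continue; // skipped entries are still walked
    page.push(entry);
    if (page.length === limit) break;
  }
  return { page, examined };
}

// Cursor pagination: seek to the first entry after the cursor,
// then read `limit` entries from there.
function pageByCursor(index, cursor, limit) {
  let lo = 0;
  let hi = index.length;
  let probes = 0;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    probes++;
    // Descending order: "after the cursor" means a smaller createdAt.
    if (index[mid].createdAt >= cursor.createdAt) lo = mid + 1;
    else hi = mid;
  }
  const page = index.slice(lo, lo + limit);
  return { page, examined: probes + page.length };
}

const index = buildIndex(100000);
const bySkip = pageBySkip(index, 10000, 10);
const byCursor = pageByCursor(index, index[9999], 10);
console.log(bySkip.examined, byCursor.examined);
```

Both calls return the same ten entries, but the skip version walks every entry before the page while the cursor version touches only the seek probes plus the page itself.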
💡 Key Takeaways
Compound index ordering (MongoDB's ESR guideline): equality filters first, then sort fields, then range filters. Index (country, productId, createdAt) serves a filter on country and productId with a sort on createdAt; (createdAt, country, productId) cannot bound the scan for the same query and must walk entries across all countries and products
Multi-key array indexes create one entry per array element: a document with 1,000 tags generates 1,000 index entries, so index write work and storage grow roughly in proportion to array size
Sorts the index cannot serve trigger in-memory sorting: if the sort field is missing from the index or does not directly follow the equality fields, the engine reads many documents and sorts them in RAM, so latency scales unpredictably with data volume
Offset pagination with skip(10000) walks 10,000 index entries every time, so latency is O(N) in the offset. Cursor-based pagination using the last seen key (e.g., createdAt + docId) seeks directly to that index position, so cost depends on page size rather than page depth
Index selectivity matters: a low-selectivity leading field like status with 3 values narrows the scan only to about a third of the collection (assuming evenly distributed values), while a high-selectivity leading field like userId narrows it to that user's documents
Missing indexes cause collection scans: a query without a matching index reads the entire collection, so a query that takes 10ms with an index can take seconds without one (for example, 5,000ms on 1M documents), which puts production SLA guarantees at risk
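As a rough illustration of the field-ordering takeaway, the sketch below arranges fields into an index key document following MongoDB's documented ESR (Equality, Sort, Range) guideline. The helper buildIndexSpec and the role labels are hypothetical conveniences, not part of any MongoDB API; only the resulting key document is something createIndex would accept.

```javascript
// Hypothetical helper: orders fields as equality, then sort, then range.
const ROLE_ORDER = { equality: 0, sort: 1, range: 2 };

function buildIndexSpec(fields) {
  // Array.prototype.sort is stable, so fields with the same role
  // keep the order in which the caller listed them.
  const ordered = [...fields].sort(
    (a, b) => ROLE_ORDER[a.role] - ROLE_ORDER[b.role]
  );
  const spec = {};
  for (const f of ordered) {
    // Default to ascending (1) when no direction is given.
    spec[f.name] = f.direction === undefined ? 1 : f.direction;
  }
  return spec;
}

const spec = buildIndexSpec([
  { name: "createdAt", role: "sort", direction: -1 },
  { name: "country", role: "equality" },
  { name: "productId", role: "equality" },
]);
console.log(JSON.stringify(spec));
// In mongosh (not run here): db.orders.createIndex(spec)
```

For the query used throughout this section, the result matches the (country, productId, createdAt) index: the two equality fields lead and the sort field follows.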
📌 Examples
MongoDB index db.orders.createIndex({ country: 1, productId: 1, createdAt: -1 }) supports the query db.orders.find({ country: "US", productId: "P123" }).sort({ createdAt: -1 }).limit(10) with a bounded scan
Array write amplification: a document { tags: ["a", "b", "c", ...1000 tags] } with an index on tags generates 1,000 index entries, so a single write can take far longer than the same write to a document with a handful of tags (illustratively, 500ms instead of 5ms)
Cursor pagination: sort by createdAt descending with docId ascending as the tiebreaker and return lastCreatedAt and lastDocId to the client; the next page queries createdAt < lastCreatedAt OR (createdAt = lastCreatedAt AND docId > lastDocId) to resume at the exact position
Dev.to analysis: in Firestore, a query filtering by country and product with a createdAt sort scanned and sorted many documents, taking ~2 seconds; MongoDB with the compound index returned in about 5ms
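The cursor condition from the pagination example can be written as a MongoDB filter document. This is a minimal sketch: nextPageFilter is a hypothetical helper, the field names createdAt and docId follow the example above, and the date and ID values are made up for illustration. It assumes the query sorts by createdAt descending, then docId ascending.

```javascript
// Hypothetical helper: builds the "resume after the last seen row" filter.
// Assumes the query sorts by { createdAt: -1, docId: 1 }.
function nextPageFilter(baseFilter, lastCreatedAt, lastDocId) {
  return {
    ...baseFilter,
    $or: [
      // Strictly older than the last row on the previous page...
      { createdAt: { $lt: lastCreatedAt } },
      // ...or same timestamp, broken by the document ID tiebreaker.
      { createdAt: lastCreatedAt, docId: { $gt: lastDocId } },
    ],
  };
}

const filter = nextPageFilter(
  { country: "US", productId: "P123" },
  new Date("2024-01-01T00:00:00Z"), // illustrative value
  "order-0042" // illustrative value
);
console.log(JSON.stringify(filter, null, 2));
// In mongosh (not run here):
// db.orders.find(filter).sort({ createdAt: -1, docId: 1 }).limit(10)
```

For the tiebreaker to be served from the index as well, the index would likely need docId as a trailing field, for example (country, productId, createdAt, docId); that extension is an assumption beyond the index shown in the examples.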