
Demand Paging and Page Fault Latency Impacts

Demand paging loads pages into RAM only when they are first accessed, not at allocation time. This lazy loading enables memory oversubscription and fast process startup, but it introduces page faults. A minor fault occurs when the page is already in the page cache but not yet mapped into the process (common for file-backed mappings and copy-on-write pages). A major fault requires disk I/O to fetch the page, costing orders of magnitude more time.

The latency numbers are stark. A DRAM hit costs 60 to 100 nanoseconds. A TLB miss and page table walk adds another 100 to 200 nanoseconds. A minor fault (no I/O) takes 1 to 10 microseconds of operating system overhead. A major fault on NVMe requires a 4 KB random read, costing 80 to 200 microseconds; on spinning disks it takes 5 to 10 milliseconds. Major faults are therefore 1,000x to 100,000x slower than DRAM hits.

Even a low page fault rate destroys effective memory access time: Effective Access Time = (1 − p) × t_mem + p × t_fault, where p is the fraction of accesses that fault. With memory at 100 nanoseconds and NVMe faults at 100 microseconds, a fault rate p of just 0.0001 (one fault per 10,000 accesses) adds 10 nanoseconds of overhead, a 10% hit. For spinning disks at 5 milliseconds per fault, p must stay below 0.000002 to keep the overhead under 10%.

In practice, production systems target zero major faults during steady state. Google's serving fleets explicitly warm caches on startup: services pre-touch heaps by writing to every page, sequentially scan hot datasets to populate the page cache, and exercise all code paths to fault in instructions and data. This drives major faults to effectively zero per process during normal operation, protecting tail latency. Amazon's latency-sensitive EC2 services similarly run with swap minimized or disabled, ensuring the working set stays resident and no surprise major faults occur under load.
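To make the minor/major distinction concrete, here is a minimal Linux/C sketch (an illustration added here, not from the source) that reads the process's fault counters via getrusage(2) while lazily allocating and then touching a buffer:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Print this process's cumulative minor/major page fault counters. */
static void print_faults(const char *label) {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    printf("%-14s minor=%ld major=%ld\n", label, ru.ru_minflt, ru.ru_majflt);
}

int main(void) {
    const size_t size = 64u * 1024 * 1024;  /* 64 MB */
    print_faults("baseline");

    char *buf = malloc(size);               /* reserves address space only */
    if (!buf) return 1;
    print_faults("after malloc");           /* counters barely move */

    memset(buf, 1, size);                   /* first write to every page */
    print_faults("after touch");            /* minor faults jump ~size/4096 */

    free(buf);
    return 0;
}
```

On a typical Linux system the malloc step adds almost nothing, while the memset produces roughly one minor fault per 4 KB page: anonymous memory is demand-zeroed on first write, exactly the lazy behavior described above.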
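The effective-access-time arithmetic can be checked directly. This small sketch (again illustrative) plugs in the numbers from the paragraph above:

```c
#include <stdio.h>

/* Effective Access Time in ns: EAT = (1 - p) * t_mem + p * t_fault. */
static double eat_ns(double p, double t_mem_ns, double t_fault_ns) {
    return (1.0 - p) * t_mem_ns + p * t_fault_ns;
}

int main(void) {
    const double t_mem = 100.0;                    /* DRAM hit: 100 ns */
    /* NVMe fault = 100 us = 100,000 ns at p = 1e-4 -> ~110 ns */
    printf("NVMe: %.2f ns\n", eat_ns(1e-4, t_mem, 100e3));
    /* HDD fault = 5 ms = 5,000,000 ns at p = 2e-6  -> ~110 ns */
    printf("HDD:  %.2f ns\n", eat_ns(2e-6, t_mem, 5e6));
    return 0;
}
```

Both cases land near 110 ns, a 10% penalty: as the fault cost rises from 100 microseconds to 5 milliseconds, the tolerable fault rate drops from 1e-4 to about 2e-6.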
💡 Key Takeaways
Minor faults occur when a page is in the page cache but not yet mapped into the process. Cost is 1 to 10 microseconds. Common for file-backed pages and copy-on-write (COW) pages before the first write.
Major faults require disk I/O. NVMe costs 80 to 200 microseconds for 4 KB random reads. Spinning disks cost 5 to 10 milliseconds. Major faults are 1,000x to 100,000x slower than DRAM.
Effective Access Time degrades rapidly with page fault rate p. With 100-nanosecond DRAM and 100-microsecond faults, p = 0.0001 adds 10% overhead. For faults costing a millisecond, p must stay under 0.00001 for the same bound.
Production systems target zero major faults during steady state. Google pre-faults hot code and data during warmup. Even a handful of major faults per second can cause p99 latency spikes.
Lazy allocation and demand paging enable oversubscription and fast startup, but surprise faults under load destroy tail latency. Pre-touch allocations and warm caches before serving traffic (see the pre-touch sketch after this list).
Amazon's latency-sensitive services disable swap or strictly limit it. Any major fault under peak load risks violating SLOs, so working sets must stay resident in RAM.
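A minimal sketch of the pre-touch-and-pin pattern, assuming Linux (mlockall needs CAP_IPC_LOCK or a sufficient RLIMIT_MEMLOCK; the buffer size here is arbitrary). The JVM's -XX:+AlwaysPreTouch flag applies the same pre-touch idea to Java heaps:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

/* Fault an allocation in up front by writing one byte per page. */
static void *alloc_prefaulted(size_t size) {
    char *buf = malloc(size);
    if (!buf) return NULL;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    for (size_t off = 0; off < size; off += page)
        buf[off] = 0;                    /* first write faults the page in */
    return buf;
}

int main(void) {
    void *heap = alloc_prefaulted(1UL << 30);   /* 1 GB; size to your RAM */
    if (!heap) return 1;

    /* Pin current and future mappings so the kernel never swaps them out. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");              /* fails without privilege */

    /* ... serve traffic: steady state should report zero major faults ... */
    return 0;
}
```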
📌 Examples
A Java service allocates a 10 GB heap but does not touch all pages. During a traffic spike, requests touch cold heap pages, each access triggering a ~150-microsecond major fault; with dozens of faults per request, p99 latency jumps from 5 ms to 15 ms, breaking SLO.
Google search serving processes pre-fault the entire working set at startup by sequentially reading all hot datasets and calling all hot functions. This ensures zero major faults during query handling, keeping p99 latency under 50 ms (see the page-cache warming sketch after these examples).
A database with a 50 GB buffer pool on an NVMe-backed EC2 instance has swap enabled. Under memory pressure, the kernel evicts 10 GB to swap. A query scan then triggers 10,000 major faults at roughly 100 microseconds each, adding 1 second to query time.
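As a sketch of that sequential warm-up (illustrative; the file path is hypothetical), posix_fadvise(POSIX_FADV_WILLNEED) asks the kernel to start readahead, and a sequential scan forces the file into the page cache before traffic arrives:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Pull a hot data file into the page cache so later reads avoid disk. */
static int warm_file(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) == 0)              /* hint: whole file needed soon */
        posix_fadvise(fd, 0, st.st_size, POSIX_FADV_WILLNEED);

    char buf[1 << 16];                    /* force the reads now */
    while (read(fd, buf, sizeof buf) > 0)
        ;                                 /* data discarded; caching is the point */

    close(fd);
    return 0;
}

int main(void) {
    return warm_file("/srv/hot/dataset.bin");   /* hypothetical path */
}
```

After warm-up, accesses to this file's pages should cost minor faults at worst rather than 80-to-200-microsecond NVMe reads.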