Demand Paging and Page Fault Latency Impacts
What Is Demand Paging
The OS does not load a process's entire address space into RAM when it starts. Pages begin marked not present, and the first access to each one triggers a page fault. The kernel loads the page from disk or zero-fills it, updates the page table, and resumes execution. This is demand paging: pages load on demand rather than up front.
The benefits are significant. A 1 GB executable does not need 1 GB of RAM to start; only accessed pages load. Many code paths never execute in a given run, so their pages never load. Memory usage reflects actual access patterns, not worst-case size.
Page Fault Latency
Minor fault: The page is already in physical memory but is not mapped in the faulting process's page table. The kernel just updates the page table. Cost is roughly 1 to 10 microseconds. Common for copy-on-write pages after fork() or for demand-zero pages.
Major fault: The page must be read from disk. SSD latency is on the order of 100 microseconds; HDD latency is 5 to 10 milliseconds. During this time, the faulting thread blocks. If critical path code triggers a major fault, user visible latency spikes by 1000x or more: a cached memory access costs roughly 100 nanoseconds, while an SSD fault costs roughly 100 microseconds.
Prefaulting and Mlock
Latency sensitive systems avoid runtime page faults. Use mlock() (or mlockall()) to pin pages in RAM and prefault them at startup. This guarantees the locked pages never take a major fault during operation. The trade-offs are longer startup time and RAM that stays committed even for rarely used pages.
Databases and trading systems typically prefault their working set. During maintenance windows, they touch every page to ensure it is resident before accepting traffic. Combined with mlock, this prevents surprise latency from page faults.