What is CPU Scheduling and Context Switching?
Why Scheduling Matters
A modern server runs hundreds of threads on perhaps 32 cores. Without scheduling, each thread would run to completion before the next could start, and long tasks would starve short ones. The scheduler time-slices execution so that all threads make progress, deciding how to divide CPU time fairly and efficiently.
Scheduling directly affects latency. If a user request thread waits in the run queue while batch jobs consume their time slices, response time suffers. Schedulers try to balance fairness with responsiveness, but trade-offs are unavoidable.
Context Switch Mechanics
A context switch saves the current thread's state: CPU registers, program counter, stack pointer. It then loads the next thread's state. This takes 1 to 10 microseconds depending on hardware and on whether the address space changes.
A thread switch within the same process is cheaper: only CPU registers need saving. A process switch must also change the address space, flushing the TLB (or tagging its entries) and updating the page table base register. This adds 10 to 50 microseconds and causes TLB misses on subsequent memory accesses.
Hidden Costs Beyond The Switch
The direct context switch cost is just the beginning. The bigger cost is cache pollution: a freshly switched-in thread runs with cold caches. An L1 data cache hit costs about 4 cycles, but a miss that falls through to L3 or main memory costs 50 to 100 nanoseconds, and a context switch can effectively invalidate megabytes of warmed cache.