Linux CFS Scheduler and Production Impact
CFS Core Concept
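CFS (Completely Fair Scheduler) tracks a virtual runtime for each thread. Virtual runtime is the CPU time the thread has consumed, scaled inversely by its priority weight, so a high-priority thread accumulates virtual runtime more slowly than a low-priority one. The scheduler always runs the thread with the lowest virtual runtime. Threads that have used less CPU run first, which ensures fairness over time.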
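The weighting can be illustrated with a small sketch. It assumes the conventional reference weight of 1024 for a nice 0 thread and uses plain integer division; the kernel uses a fixed-point equivalent, so treat this as a simplified model rather than the actual implementation. The function name and the example weights are hypothetical.

```python
NICE_0_WEIGHT = 1024  # assumed reference weight for a nice 0 thread


def vruntime_delta(delta_exec_ns, weight):
    """Virtual runtime charged for delta_exec_ns of real CPU time."""
    # A heavier (higher-priority) thread is charged less virtual time per
    # nanosecond of real execution, so it stays "behind" and runs more often.
    return delta_exec_ns * NICE_0_WEIGHT // weight


# Charge 1 ms of real CPU time to three hypothetical threads.
for label, weight in [("reference", 1024), ("heavier", 2048), ("lighter", 512)]:
    print(label, vruntime_delta(1_000_000, weight))
```

A thread with twice the reference weight is charged half as much virtual time for the same work, which is what lets it run about twice as often.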
CFS uses a red-black tree to order runnable threads by virtual runtime. Picking the next thread is O(1) because the kernel caches the leftmost node. Adding or removing a thread is O(log n). This scales well to thousands of runnable threads while maintaining fairness.
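The pick-lowest, run, reinsert loop can be sketched in a few lines. This sketch substitutes Python's heapq (a binary heap) for the red-black tree: it is not what the kernel uses, but it has the same O(1) access to the minimum and O(log n) insert and remove. The thread names, weights, and fixed 1 ms tick are assumptions made for illustration.

```python
import heapq

NICE_0_WEIGHT = 1024  # assumed reference weight for a nice 0 thread


def simulate(weights, ticks, tick_ns=1_000_000):
    """Make `ticks` scheduling decisions; return real CPU ns per thread."""
    # Entries are (vruntime, name). The smallest vruntime sits at index 0,
    # playing the role of the leftmost node of the tree.
    queue = [(0, name) for name in sorted(weights)]
    heapq.heapify(queue)
    cpu_ns = {name: 0 for name in weights}
    for _ in range(ticks):
        vruntime, name = heapq.heappop(queue)    # pick the lowest vruntime
        cpu_ns[name] += tick_ns                  # run it for one tick
        vruntime += tick_ns * NICE_0_WEIGHT // weights[name]
        heapq.heappush(queue, (vruntime, name))  # reinsert in O(log n)
    return cpu_ns


shares = simulate({"a": 1024, "b": 2048}, ticks=300)
print(shares)
```

In this model thread "b", with twice the weight, ends up with twice the CPU time of thread "a", even though the loop never looks at weights when choosing who runs next. Fairness falls out of the virtual runtime ordering alone.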
Time Slice and Latency
CFS does not use fixed time slices. Instead, it calculates a target latency: the period within which every runnable thread should run once. The default is 6 milliseconds for up to 8 runnable threads, and the period grows with the thread count beyond that. Each thread gets a share of the period proportional to its weight.
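The slice calculation can be sketched as follows. The 6 ms target and the threshold of 8 threads come from the text above; the 0.75 ms minimum granularity is an assumption (it is the classic kernel default, and 6 ms divided by 0.75 ms gives the threshold of 8). Actual values vary by kernel version and CPU count, and the function names here are hypothetical.

```python
SCHED_LATENCY_NS = 6_000_000    # target latency: 6 ms
MIN_GRANULARITY_NS = 750_000    # assumed minimum slice: 0.75 ms
NR_LATENCY = SCHED_LATENCY_NS // MIN_GRANULARITY_NS  # 8 threads


def sched_period_ns(nr_running):
    """Period in which every runnable thread should get one turn."""
    if nr_running > NR_LATENCY:
        # Too many threads to fit in the target latency: stretch the
        # period so that no slice falls below the minimum granularity.
        return nr_running * MIN_GRANULARITY_NS
    return SCHED_LATENCY_NS


def slice_ns(weight, total_weight, nr_running):
    """One thread's share of the period, proportional to its weight."""
    return sched_period_ns(nr_running) * weight // total_weight


# Equal-weight threads on one run queue.
for n in (4, 8, 16):
    print(n, sched_period_ns(n), slice_ns(1024, n * 1024, n))
```

With equal weights, the slice shrinks as threads are added until it reaches the minimum granularity; past that point the period grows instead, so each thread waits longer between turns.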
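More runnable threads means smaller slices per thread, down to a floor. If the 6 ms target were held fixed, 60 threads on one core would each get 100 microseconds before preemption. In practice a minimum granularity (0.75 ms by default) stretches the period to 60 × 0.75 = 45 ms, so each thread runs for about 0.75 ms and then waits roughly 44 ms for its next turn. Either way, the result is frequent context switches and long scheduling delays. If your service runs 100 threads competing for 4 cores, context switching and the cache misses that follow it can consume a significant share of CPU time.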
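One way to check whether a process is suffering from this is to count its context switches. The sketch below uses the standard library's resource.getrusage, which on Linux reports voluntary switches (the thread blocked or yielded) and involuntary switches (the scheduler preempted it). The helper name switches_during is hypothetical, and the counts cover the whole process, not a single thread.

```python
import resource
import time


def switches_during(work):
    """Return (voluntary, involuntary) context switches incurred by work()."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    work()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return (
        after.ru_nvcsw - before.ru_nvcsw,    # blocked or yielded the CPU
        after.ru_nivcsw - before.ru_nivcsw,  # preempted by the scheduler
    )


# Sleeping gives up the CPU, so it normally shows up as voluntary switches.
voluntary, involuntary = switches_during(lambda: time.sleep(0.01))
print("voluntary:", voluntary, "involuntary:", involuntary)
```

A high involuntary count for CPU-bound work suggests the thread is being preempted by competitors, which is the symptom of too many runnable threads per core.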
Priority and Nice Values
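Nice values range from -20 (highest priority) to +19 (lowest). Each nice level changes a thread's weight by a factor of about 1.25, so of two competing CPU-bound threads one nice level apart, one gets roughly 55 percent of the CPU and the other 45 percent. The effect compounds across levels: a nice -20 thread has a weight of 88761 against 15 for nice +19, a ratio of nearly 6,000 to 1. Weights are relative: if only one thread is runnable, it gets all the CPU regardless of its nice value.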
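The relationship between nice values and CPU share can be approximated with a short sketch. It assumes a weight of 1024 at nice 0 and a factor of 1.25 per level; the kernel uses a precomputed table whose entries are close to, but not exactly, these values. Both function names are hypothetical.

```python
NICE_0_WEIGHT = 1024  # assumed weight at nice 0


def approx_weight(nice):
    """Approximate scheduler weight for a nice value (1.25x per level)."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be between -20 and 19")
    return NICE_0_WEIGHT / (1.25 ** nice)


def cpu_share(nice_a, nice_b):
    """Fraction of one CPU that thread A gets when competing only with B."""
    weight_a, weight_b = approx_weight(nice_a), approx_weight(nice_b)
    return weight_a / (weight_a + weight_b)


print(round(cpu_share(0, 1), 3))    # one level apart
print(round(cpu_share(-5, 10), 3))  # latency-critical service vs. batch job
print(round(approx_weight(-20) / approx_weight(19)))  # extreme ends
```

One level apart yields a split of about 55 to 45, while the full range from -20 to +19 yields a weight ratio in the thousands.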
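In production, nice values help differentiate workloads. Set batch jobs to nice +10. Set latency-critical services to nice -5; negative values require root or CAP_SYS_NICE. The scheduler gives preference without absolute preemption: a lower-priority thread still runs, just less often. Under high load the weight ratios still hold, but every thread's absolute share shrinks, so a favorable nice value cannot compensate for an overloaded machine.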
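A batch job can lower its own priority at startup. The sketch below uses os.getpriority and os.setpriority from the Python standard library (Unix only). The helper name deprioritize and the default of +10 are illustrative choices; raising the nice value needs no privileges, while lowering it generally does.

```python
import os


def deprioritize(target_nice=10):
    """Raise the caller's nice value to at least target_nice; return the result."""
    # A `who` of 0 refers to the caller.
    current = os.getpriority(os.PRIO_PROCESS, 0)
    if current < target_nice:
        # Only ever move toward lower priority, which is always permitted.
        os.setpriority(os.PRIO_PROCESS, 0, target_nice)
    return os.getpriority(os.PRIO_PROCESS, 0)


print("running at nice", deprioritize(10))
```

The same effect is available without code changes by launching the job through the nice command, for example nice -n 10 followed by the job's command line.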