OS & Systems Fundamentals › Processes vs Threads (Hard, ⏱️ ~2 min)

Trade-offs: Isolation vs Performance at Scale

The process-versus-thread decision fundamentally trades isolation against performance, and the right choice depends on your latency budget and failure tolerance.

Processes provide strong fault and security isolation: a crash or memory corruption is contained within one process. Chrome's architecture proves this; when a renderer crashes, only that tab dies. The tradeoff is higher IPC overhead (10 to 100 microseconds per round trip versus sub-microsecond in-process), larger context-switch costs due to TLB and page-table changes, and a significantly higher memory footprint due to duplicated kernel structures and page tables.

Threads enable low-latency cooperation through shared memory and cheaper context switches: uncontended in-process handoffs cost tens to hundreds of nanoseconds. This matters enormously at scale. If your service demands sub-100-microsecond tail latencies, you cannot afford cross-process IPC on the hot path. However, threads couple failures tightly: any fatal bug, memory corruption, or runaway allocation can kill the entire process and every thread within it, and one thread with a memory-safety bug can silently corrupt another thread's data structures.

Latency budgets drive the architecture. Amazon, Meta, and Google all use process boundaries for service isolation and blast-radius control, while keeping hot paths multi-threaded within each service. For example, a Meta cache service runs as one process with per-core sharding across threads to achieve millions of operations per second. Cross-service communication uses RPC (hundreds of microseconds), but hot-path requests never leave the process. Microsoft SQL Server similarly uses cooperative scheduling and worker threads within a single server process for performance, while isolating different databases or tenants at the process boundary for safety and resource accounting.
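The core isolation difference can be seen directly in a short Python sketch (the names `bump` and `demo` are illustrative, not from any real system): a thread mutates the parent's memory in place, while a child process mutates only its own copy of that memory.

```python
import multiprocessing
import threading

counter = {"value": 0}

def bump():
    # Increments whichever copy of `counter` lives in the calling address space.
    counter["value"] += 1

def demo():
    # Thread: shares this process's memory, so the write is visible here.
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    after_thread = counter["value"]

    # Process: bump() runs against a separate address space; our copy is untouched.
    p = multiprocessing.Process(target=bump)
    p.start()
    p.join()
    after_process = counter["value"]
    return after_thread, after_process

if __name__ == "__main__":
    print(demo())  # (1, 1): the thread's mutation is visible, the process's is not
```

The same property that makes the thread's write visible is what lets one buggy thread corrupt another's data; the same property that hides the child process's write is what contains a renderer crash to one Chrome tab.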
💡 Key Takeaways
Latency budget determines architecture: Sub-100-microsecond tail latencies require in-process threading on hot paths. Cross-process IPC adds 10 to 100 microseconds per round trip, consuming most of your budget before any actual work.
Failure coupling differs fundamentally: A process crash affects only that process (a Chrome tab dies, the browser survives). A thread crash or memory corruption kills the entire process and all threads within it.
Memory footprint scales differently: Each process adds 50 to 150 MB for kernel structures, page tables, and duplicated libraries. Each thread adds 0.5 to 8 MB of reserved stack space but shares everything else.
Resource accounting favors processes: Operating systems provide per-process memory limits, CPU quotas, and I/O throttling. Threads within a process share these limits, so one hot thread can starve the others without explicit admission control.
NUMA effects hit multi-threaded programs hard: Allocating memory on one NUMA node and accessing it from another incurs a 1.3 to 2x latency penalty. Large multi-threaded applications require careful thread affinity and memory-locality tuning to avoid doubled tail latencies.
📌 Examples
Meta cache service: One process per server, state sharded across threads by core. In-process: <1µs handoff, millions of ops/sec. Cross-service RPC: 200-500µs when needed.
Google Chrome: Process isolation trades 100+ MB memory overhead for crash containment. Security boundary enforced by OS, not application code.
Kafka broker: Multi-GB/s throughput in one process with internal thread coordination. Process boundary used between brokers for fault isolation and rolling deploys, not hot path performance.
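The per-core sharding pattern from the Meta cache example might look like the following minimal sketch, where each shard owns its keys so hot-path operations never contend across shards; the class name, shard count, and per-shard locks are illustrative assumptions, not Meta's actual design.

```python
import threading

N_SHARDS = 4  # typically one shard per core

class ShardedCache:
    def __init__(self, n_shards=N_SHARDS):
        # One dict and one lock per shard; a key always maps to the same
        # shard, so contention is confined to keys sharing that shard.
        self._shards = [{} for _ in range(n_shards)]
        self._locks = [threading.Lock() for _ in range(n_shards)]

    def _idx(self, key):
        return hash(key) % len(self._shards)

    def put(self, key, value):
        i = self._idx(key)
        with self._locks[i]:
            self._shards[i][key] = value

    def get(self, key, default=None):
        i = self._idx(key)
        with self._locks[i]:
            return self._shards[i].get(key, default)
```

Because all shards live in one address space, a lookup is an in-process call (nanoseconds) rather than an RPC (hundreds of microseconds), which is exactly the hot-path win the example describes.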