
Communication Costs: Shared Memory vs IPC

The fundamental difference in communication mechanisms creates a 100 to 1000x performance gap. Threads communicate through shared memory and synchronization primitives, so the cost of passing data between threads is essentially the cost of memory access plus locking: tens to a few hundred nanoseconds per operation when uncontended, about as fast as reading and writing RAM. Processes must use inter-process communication (IPC) mechanisms such as shared memory segments, pipes, local sockets, or message queues. Even on the same physical machine, a round trip over a local Unix domain socket typically takes 10 to 100 microseconds. Shared-memory IPC between processes can approach sub-microsecond latencies with careful design, but it requires explicit synchronization and memory ordering that adds complexity.

This communication cost difference is central when deciding architectural boundaries, and Redis versus Memcached illustrates the tradeoff well. Redis uses a single-threaded execution core (with optional threaded I/O) and achieves 1 to 2 million operations per second with pipelining on modern hardware, avoiding all lock contention by design. Memcached is multi-threaded within one process, sharding its hash table across threads; by scaling across cores without cross-process IPC overhead, it often achieves 5 to 10+ million operations per second per server on 25 or 100 Gigabit Ethernet NICs.
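The gap described above can be observed directly. The following is a rough micro-benchmark sketch in Python (the function names are mine, and CPython interpreter overhead inflates the absolute numbers well beyond the raw hardware costs quoted above, but the relative gap between an in-process queue handoff and a kernel-mediated socket round trip still shows):

```python
import queue
import socket
import threading
import time

def thread_handoff_ns(n=100_000):
    """Average cost of handing an item to another thread through an
    in-process queue (shared memory plus a lock), in nanoseconds."""
    q = queue.Queue()

    def consumer():
        for _ in range(n):
            q.get()

    t = threading.Thread(target=consumer)
    start = time.perf_counter()
    t.start()
    for i in range(n):
        q.put(i)
    t.join()
    return (time.perf_counter() - start) / n * 1e9

def socket_roundtrip_us(n=1_000):
    """Average round-trip time over a local socket pair, in microseconds.
    Each message crosses the kernel twice, as cross-process IPC would."""
    a, b = socket.socketpair()

    def echo():
        for _ in range(n):
            b.sendall(b.recv(64))

    t = threading.Thread(target=echo)
    t.start()
    start = time.perf_counter()
    for _ in range(n):
        a.sendall(b"ping")
        a.recv(64)
    elapsed = time.perf_counter() - start
    t.join()
    a.close()
    b.close()
    return elapsed / n * 1e6

if __name__ == "__main__":
    print(f"queue handoff:     {thread_handoff_ns():8.0f} ns per item")
    print(f"socket round trip: {socket_roundtrip_us():8.1f} us per round trip")
```

A lower-level language would show handoffs in the tens-of-nanoseconds range quoted above; the socket path stays in the tens of microseconds in either case because the dominant cost is the kernel transition, not the language runtime.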
💡 Key Takeaways
In-process thread communication costs 50 to 200 nanoseconds for uncontended memory access and locking. This is just the speed of RAM access plus minimal CPU synchronization.
Cross-process IPC on the same machine costs 10 to 100 microseconds per round trip via Unix domain sockets. Shared-memory IPC can reach sub-microsecond speeds but requires complex synchronization and careful memory ordering.
The 100 to 1000x communication cost difference drives architectural decisions. Keep hot-path, high-frequency communication within a single process using threads. Use process boundaries for isolation, not tight coordination.
Redis achieves 1 to 2 million ops per second single-threaded by eliminating all lock contention. Memcached scales to 5 to 10+ million ops per second multi-threaded by sharding state across threads while avoiding cross-process overhead.
Kafka brokers demonstrate in-process efficiency: one broker process with multiple internal threads (network, I/O, replication) sustains multi-gigabyte-per-second throughput on NVMe and 25 to 100 GbE links by coordinating via in-process queues.
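The shared-memory IPC path mentioned in the takeaways can be sketched with Python's `multiprocessing.shared_memory` module. This is a minimal illustration, not a production pattern: the child process writes a value into a shared segment and signals readiness through a semaphore, which is exactly the "explicit synchronization" the takeaway warns about.

```python
import struct
from multiprocessing import Process, Semaphore, shared_memory

def producer(name, ready):
    # Attach to the existing shared segment by name and write one value.
    shm = shared_memory.SharedMemory(name=name)
    struct.pack_into("q", shm.buf, 0, 42)
    shm.close()
    # Explicit synchronization: signal the consumer that data is ready.
    ready.release()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=8)
    ready = Semaphore(0)
    p = Process(target=producer, args=(shm.name, ready))
    p.start()
    ready.acquire()  # block until the producer has written
    (value,) = struct.unpack_from("q", shm.buf, 0)
    print(value)  # 42
    p.join()
    shm.close()
    shm.unlink()
```

No data is copied through the kernel once the segment is mapped; the semaphore (or, in real systems, atomics with the right memory ordering) carries the coordination cost, which is why this path can reach sub-microsecond latencies while remaining harder to get right than a threaded design.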
📌 Examples
Thread handoff in high performance server: Lock acquisition + queue push = 50-200ns. Can handle millions of handoffs per second per core.
PostgreSQL backend process IPC: Client sends query over Unix socket to backend process. Round trip: 20-50 microseconds just for IPC, before query execution.
Memcached multi-threaded: Hash table sharded across 16 threads. Each thread owns its shard, no cross-thread locking on hot path. Result: 10M+ ops/sec on 100 GbE.
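The Memcached-style sharding in the last example can be sketched as follows. This is a simplified illustration (class and method names are mine): each shard owns its own dictionary and lock, so requests for keys on different shards never contend, which is how per-thread ownership keeps the hot path free of cross-thread locking.

```python
import threading

class ShardedCache:
    """Memcached-style sharding sketch: the key space is partitioned
    across shards, each with its own table and lock."""

    def __init__(self, num_shards=16):
        self.shards = [({}, threading.Lock()) for _ in range(num_shards)]

    def _shard(self, key):
        # Hash the key to pick a shard; all work stays in one process.
        return self.shards[hash(key) % len(self.shards)]

    def set(self, key, value):
        table, lock = self._shard(key)
        with lock:  # contention only with operations on this shard
            table[key] = value

    def get(self, key):
        table, lock = self._shard(key)
        with lock:
            return table.get(key)

cache = ShardedCache()
cache.set("user:1", "alice")
print(cache.get("user:1"))  # alice
```

Real Memcached goes further (per-thread event loops, slab allocation, striped LRU locks), but the core idea is the same: partition state so that threads scale across cores without ever paying cross-process IPC costs.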