What is Garbage Collection and How Does It Work?
Why GC Exists
Manual memory management is error prone. Free too early and you get use after free bugs. Free too late or never and you leak memory. GC eliminates both classes of bugs by tracking object reachability automatically.
The cost is runtime overhead. The GC must periodically scan memory to find garbage. During collection, application threads may pause. This trade-off defines GC design: maximize throughput while minimizing pause times.
Basic Collection Algorithms
Mark and sweep: Starting from roots (stack, globals), mark all reachable objects. Then sweep through heap, freeing unmarked objects. Simple but causes fragmentation and requires stop the world pause.
Copying collection: Divide heap into from space and to space. Copy live objects from from space to to space, compacting as you go. Swap spaces. Eliminates fragmentation but wastes half the heap.
Reference counting: Track reference count per object. Free when count reaches zero. No pause needed but cannot handle cycles and has overhead on every reference update.
Roots and Reachability
GC starts from roots: stack frames, CPU registers, global variables. Any object reachable from roots is live. Reachability is transitive: if A references B and A is live, B is live. Unreachable objects are garbage by definition, even if they reference each other (cycles).