OS & Systems FundamentalsGarbage Collection FundamentalsEasy⏱️ ~2 min

What is Garbage Collection and How Does It Work?

Definition
Garbage collection automatically reclaims memory that is no longer reachable by the program. The runtime identifies objects with no references and frees their memory without explicit programmer action.

Why GC Exists

Manual memory management is error prone. Free too early and you get use after free bugs. Free too late or never and you leak memory. GC eliminates both classes of bugs by tracking object reachability automatically.

The cost is runtime overhead. The GC must periodically scan memory to find garbage. During collection, application threads may pause. This trade-off defines GC design: maximize throughput while minimizing pause times.

Basic Collection Algorithms

Mark and sweep: Starting from roots (stack, globals), mark all reachable objects. Then sweep through heap, freeing unmarked objects. Simple but causes fragmentation and requires stop the world pause.

Copying collection: Divide heap into from space and to space. Copy live objects from from space to to space, compacting as you go. Swap spaces. Eliminates fragmentation but wastes half the heap.

Reference counting: Track reference count per object. Free when count reaches zero. No pause needed but cannot handle cycles and has overhead on every reference update.

Roots and Reachability

GC starts from roots: stack frames, CPU registers, global variables. Any object reachable from roots is live. Reachability is transitive: if A references B and A is live, B is live. Unreachable objects are garbage by definition, even if they reference each other (cycles).

💡 Key Insight: GC trades CPU overhead and pause times for memory safety. Modern GCs achieve sub-millisecond pauses for most applications. The key design dimensions are throughput, latency, and memory overhead.
💡 Key Takeaways
GC automatically frees unreachable objects, preventing leaks and use after free bugs
Mark and sweep: trace from roots, free unmarked objects, causes fragmentation
Copying collection: copy live objects to new space, compacts but uses 2x memory
Reference counting: immediate free on zero count but cannot handle cycles
Trade-off: GC exchanges CPU overhead and pauses for memory safety
📌 Interview Tips
1Explain why reference counting alone is insufficient: cyclic references never reach zero count even when unreachable
2When discussing GC overhead, mention the three dimensions: throughput (how much CPU GC uses), latency (pause length), memory (heap overhead)
3Describe roots as starting points: stack variables, global references, CPU registers. Everything reachable from roots is live
← Back to Garbage Collection Fundamentals Overview