CS164 Section Notes 11/26/2003
Original notes by Matt Harren

Garbage collection
  1. Incremental and/or generational collection
  2. GC in Cool


Incremental Collection:

Idea: have many short pauses instead of one long pause for each
collection.  Better for real-time programs.

We'll extend Stop & Copy (S&C) to an incremental approach in which we
do part of a collection, run the program, do a little more collecting,
etc.

You should be able to identify (at least) the two issues below when
deciding how we can allow the program to run in the middle of a
collection -- this is a good test of your understanding of S&C.

Two key invariants of S&C (they hold in the middle of a collection):
  1) Some objects in the old space have been replaced by forwarding
     pointers.
  2) The "copied and scanned" area and the root set contain pointers
     only to objects in the new space (i.e. "copied" plus "copied and
     scanned").

Solution: Compiler support
  1) Whenever the program dereferences a pointer, it should check for
     a forwarding pointer, and follow it.
  2) Whenever the program writes a pointer to a heap location, tell
     the GC.  The GC checks whether we're creating a pointer from the
     "copied and scanned" space or the root set to the "old" space.
     If so, it restores the invariant (probably by copying the
     destination of the pointer to "copied").


Generational Collection (Simplified):

Idea: many objects are live for very short amounts of time (e.g. one
function call).  So focus our collection efforts on the most
recently-created objects.

We'll use a setup with 3 separate regions.

+-------------------+-------------------+------------+
|       old1        |       old2        |  nursery   |
+-------------------+-------------------+------------+
           old generation                   new gen

The old generation uses normal stop & copy, with old1 and old2 taking
turns being the "old" and "new" (aka "from" and "to") spaces of S&C.

All new allocations are done in the nursery.  When the nursery fills
up, we do a "minor" collection.  Any live objects in the nursery are
moved to the old generation.
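
To make the nursery mechanics concrete, here is a minimal sketch of
allocation plus a minor collection.  It is a toy model, not how a real
collector lays out memory: regions are Python lists, "moving" an object
means moving it between lists, and the names Obj, Heap, alloc and
minor_collect are all made up for illustration.  It also assumes that
no object in the old generation points into the nursery, so tracing
can stop as soon as it leaves the nursery.

```python
class Obj:
    """A heap object: a name plus a list of pointers to other objects."""
    def __init__(self, name):
        self.name = name
        self.fields = []


class Heap:
    """Toy heap with a fixed-size nursery and an unbounded old generation."""
    def __init__(self, nursery_size):
        self.nursery_size = nursery_size
        self.nursery = []          # all new allocations land here
        self.old = []              # survivors of minor collections
        self.roots = []            # the root set (stack, globals, ...)
        self.minor_collections = 0

    def alloc(self, name):
        # When the nursery is full, run a minor collection first.
        if len(self.nursery) >= self.nursery_size:
            self.minor_collect()
        obj = Obj(name)
        self.nursery.append(obj)
        return obj

    def minor_collect(self):
        in_nursery = {id(o) for o in self.nursery}
        seen = set()
        live = []
        work = list(self.roots)
        while work:
            o = work.pop()
            # ASSUMPTION: old objects never point into the nursery,
            # so we do not trace through anything outside it.
            if id(o) not in in_nursery or id(o) in seen:
                continue
            seen.add(id(o))
            live.append(o)
            work.extend(o.fields)
        self.old.extend(live)      # live nursery objects are promoted
        self.nursery = []          # everything else is garbage
        self.minor_collections += 1
```

For example, with a nursery of size 2, allocating a rooted object, then
an unreachable one, then a third object triggers one minor collection
that promotes only the rooted object.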
After many minor collections, the old generation will fill up and
we'll need to do a "major" collection that cleans up the old generation
too.

To improve performance, require that there be no pointers from the old
generation into the nursery.  This means that when collecting the
nursery, we only have to scan the root set and the nursery itself,
rather than the whole heap.

(Motivation: this works very well for functional languages and for
object-oriented languages, where newer objects tend to point to older
ones, but not the other way around.  For example, in Java, the
arguments to a constructor must be objects that have already been
constructed, which leads to newer-to-older pointers.)

Key invariant:
  1) No pointers from the old generation to the nursery (unless the
     pointer is also in the "remembered set"; see below).

Solution: Compiler support
  1) Whenever you update an existing pointer on the heap, you may be
     creating a pointer from old to the nursery, so tell the GC about
     it.  If you are creating such a pointer, the GC will keep a copy
     of the pointer in a separate data structure, the "remembered
     set".  The remembered set will be part of the root set for the
     next minor collection.

Extensions:
  -- more than 2 generations.  SML/NJ uses 9.
  -- fancy heuristics for deciding how long to keep an object in one
     place before promoting it to the next generation.
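
The compiler support for generational collection described above (tell
the GC on every pointer update, and keep old-to-nursery pointers in a
remembered set) can be sketched roughly as follows.  This is a toy
model under stated assumptions, not a real collector: objects carry a
generation tag instead of living in real memory regions, and the names
Obj, GenHeap, write_field and minor_collect are hypothetical.

```python
class Obj:
    """A heap object tagged with the generation it currently lives in."""
    def __init__(self, name):
        self.name = name
        self.generation = "nursery"   # "nursery" or "old"
        self.fields = {}              # field name -> Obj or None


class GenHeap:
    def __init__(self):
        self.roots = []        # ordinary root set
        self.nursery = []
        self.old = []
        self.remembered = []   # copies of old -> nursery pointers

    def alloc(self, name):
        obj = Obj(name)
        self.nursery.append(obj)
        return obj

    def write_field(self, holder, field_name, target):
        """Write barrier: the compiler would emit this for every pointer store."""
        holder.fields[field_name] = target
        if (holder.generation == "old"
                and target is not None
                and target.generation == "nursery"):
            # Keep a copy of the pointer.  This is conservative: the
            # target stays alive for one minor collection even if the
            # field is overwritten before then.
            self.remembered.append(target)

    def minor_collect(self):
        # The remembered set is treated as part of the root set.
        work = list(self.roots) + list(self.remembered)
        seen = set()
        while work:
            o = work.pop()
            # Only the nursery is traced; old objects are never scanned.
            if o.generation != "nursery" or id(o) in seen:
                continue
            seen.add(id(o))
            work.extend(v for v in o.fields.values() if v is not None)
        survivors = [o for o in self.nursery if id(o) in seen]
        for o in survivors:
            o.generation = "old"       # promote
        self.old.extend(survivors)
        self.nursery = []
        # The nursery is now empty, so no old -> nursery pointers remain.
        self.remembered = []
```

Without the barrier, an object reachable only through a field of an
old object would be missed by a minor collection, since old objects
are never scanned.  Storing a copy of the pointer follows the notes; a
real collector might instead record the location that was written.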