Skip to content
playX edited this page Aug 8, 2021 · 1 revision

Comet

Comet is a general-purpose garbage collector library. It uses a simple mark-and-sweep technique to collect garbage. For identifying root pointers system of marking constraints is provided, they are used to tell GC where to find root objects.

Allocation

For allocating memory Comet uses Immix for small objects (<8KB) and regular libc::malloc for larger allocations. By using immix we achieve quite fast allocation times because every allocation is simply a bump pointer.

Normal allocator

A normal allocator is used to allocating objects smaller than a single line size (256B in Comet). Objects might be allocated on multiple lines if required. To allocate memory this allocator scans for a hole in a block by searching first unmarked and then first marked line which is a hole we can allocate into. If no holes were found then we try to recycle the block from the recyclable list and then we could try to allocate a new block.

Overflow allocator

Overflow allocator is used to allocating objects larger than line size. Instead of scanning for the holes in a block, it will use an entire block for allocating, and if the block is full the new block is requested for allocation.

Large allocator

Large allocator simply used libc::malloc for allocation and does not have anything hard to understand.

Conservatism

The collector scans the stack and registers conservatively, that is, checking each word to see if it is in the bounds of some object and then marking it if it is. This means that all of the Rust compiler-generated code can store heap pointers in local variables without any hassles.

Object header

Heap object header is just 4 bytes on all platforms: 2 bits for GC color, 14 bits for object size, 14 bits for GC info index.

object size

If the size of the object is zero then it is stored in the header of large allocation which is four words behind our object header otherwise we store size divided by allocation granularity.

GC color

GC color is used when marking. Right now it has no real use but in the future when we will implement concurrent marking GC colors might be very useful.

GC info index

For tracing & finalizing GC needs to know how to trace and finalize objects. To do so we have a global side table that stores trace and finalize callbacks for GC types.

GC Cycle

Pre-mark

Before marking GC visits all blocks & large allocations and clears their mark bits, then it sorts large allocations so it is possible to conservatively scan stack or another region of memory for potential root objects.

Marking

In this phase GC executes constraints visiting root objects and then simple processes worklist which is a Vec<*mut HeapObjectHeader>.

Sweeping

While sweeping GC walks all blocks and sweeps them invoking finalizers for objects (if they're provided). Empty blocks are returned to the block allocator and unmapped (using madvise on Unix) and non-empty blocks are returned to the normal allocator so it could try to reuse them. Note that the overflow allocator doesn't get recycled blocks, instead it always allocates new Immix blocks.

Work in progress features

Opportunistic evacuation

Blocks that are very fragmented could be evacuated and that's what we should try to do.