Garbage Collection in Ruby

I was recently reading the excellent article by Jean Boussier about how Shopify has optimised their garbage collection. It got me to thinking about how garbage collection works in ruby and how little I really know about it... In the article they say that ruby has a;

conservative, generational, incremental, mark-and-sweep tracing garbage collector.

I wanted to dive in to find out what that meant.

Brief History


Garbage collection was first introduced by John McCarthy in 1959 as part of Lisp, one of the earliest programming languages. The idea was to automate memory management so that programmers wouldn’t need to manually allocate and deallocate memory, which was error-prone and labor-intensive. Early Garbage Collection (GC) methods included reference counting and mark-and-sweep techniques, which laid the groundwork for modern garbage collection algorithms.

Ruby's Garbage Collector


When Ruby was first released in 1995 it had a conservative mark-and-sweep garbage collector.

The conservative part means that the Garbage collector rather than keeping a strict list of root references to all objects, the collector would scan the stack and registers, treating any value that resembled a pointer as a potential reference to an object. This avoids collecting objects that might still be in use, however it can retain objects that are no longer reachable.

The mark-and-sweep garbage collection process, as its name suggests involves two stages;
  1. Mark Stage: It begins at "root" objects (such as global variables, stack variables and constant objects), traversing all reachable objects and marking them as still "in-use".
  2. Sweep Stage: It then iterates through all objects, deallocating any unmarked (unreachable) objects, effectively clearing them from memory.

Mark and Sweep

Ruby's original GC mark-and-sweep algorithm was slow for several reasons but some of the main ones were;

Ruby's original mark-and-sweep algorithm was a straightforward mark-and-sweep which treated all objects the same regardless of age. This meant that every GC cycle was a full sweep scanning all objects, even older objects which were unlikely to be garbage. As programs ran, more long lived objects were accumulated elongating the time the GC took as it had to traverse all objects even those unlikely to be garbage.

When the original GC ran it was a full stop the world garbage collection process, this paused the the entire application while the GC completed.

The original implementation the GC strategy didn't handle memory fragmentation well. As the object space grew so did the fragmentation, which reduced available memory and increased the time spent searching for contiguous memory blocks for new objects.

Generational Garbage Collection
Ruby 2.1 introduced Restricted Generational GC (RGenGC). This divides objects into two categories, old and young. Most objects in programs have a short lifespan (Koichi when first implementing generational GC found 95% of objects are cleaned in first Garbage Collection), so this enables the garbage collector to focus on cleaning up young objects often (called a minor GC) and older objects less often (called a major GC). One assumption that has to be made is that older objects will only point at other old objects, so once we find an old object we ignore the references. This significantly reduces the time spent during the marking processes; However the sweep time remains unchanged.

Minor Mark and Sweep - Generational GC


One potential issue was as old objects are ignored during the marking process, what happens if an old object does point at a new object?

Potential Bug with Generational GC


To handle this case RGenGC introduced write barriers so it would track these objects called the RSet. Objects in the RSet are treated as root objects which means that it will traverse from the old object to the new object correctly during the next GC cycle.

However C extensions would break if they didn't implement write barriers. For this case, shady objects were introduced. If an object is marked as shady it will never be promoted to being an old object. These shady objects will also be traced and checked during every marking.

There is an excellent talk from Ruby Conf 2013 by Koichi Sasada on the introduction of the RGenGC to ruby with much more details.

Incremental Marking

Due to the Generational Garbage Collection the marking process was much quicker for minor GC. Major GC was happening less often than before, as we only need to run a major GC when we hit a memory limit. However, if that limit is hit we still had a stop the world GC run, which could cause a noticeable pause in execution of the ruby program.

To address this issue, a Restricted Incremental GC (RIncGC) algorithm was added in Ruby 2.2. An incremental GC inter-leavers Ruby's execution with the GC Process (This is only needed to major GC).

Incremental GC


This reduces the time that ruby execution is paused, it does not change the total time spent marking. However this is still a significant improvement as execution is not paused for the whole time, for example a request that happens to have a major GC run during it will no longer necessarily have a significant slow down.

There is an excellent talk from Ruby Conf 2014 by Koichi Sasada on the introduction of the RIncGC to ruby with more details. (The link starts the video 28 minutes in when he starts talking about the incremental GC)

Conclusion

That is how in Ruby we have a conservative, generational, incremental, mark-and-sweep tracing garbage collector. I think its a feature that generally works fantastically, some of the best things in software in my opinion work well without you having to think about it. I enjoyed finding out over the years how different features have been added to the Garbage Collector to make the lives of ruby developers better.

Further Reading

As I said, generally you don't have to think about it, but if you are handling the number of requests that Shopify have to, then you do have to sometimes make optimisations to the Garbage Collector. There was a recent article on the Rails at scale blog describing how they don't run the major GC during requests and call it out of band once a request has finished.

I also saw this issue on the ruby lang website talking about up-streaming the current MMTk implementation into ruby. This would enable people to experiment with different GC algorithms. This sounds like an interesting piece of research that I look forward to reading.