Memory leaks are among the most difficult problems to track down in the C/C++ world, and Ruby C extensions are no exception. A long-running Ruby service should not see its memory usage grow without bound. Fortunately, the community has developed great tools to help detect memory leaks. Two of the most popular are AddressSanitizer (ASAN) and Valgrind. I personally prefer ASAN because it does not require additional tools to be installed and it runs fast.

Memory leak detection is best run in a Linux environment. The tests in this post were run on Ubuntu 20.04 with GCC 9.4.0.

Configuration

Let's take Nokolexbor as an example. Enabling ASAN is as easy as adding -fsanitize=address to both CFLAGS and LDFLAGS. In extconf.rb, add:

if ENV['NOKOLEXBOR_DEBUG'] || ENV['NOKOLEXBOR_ASAN']
  CONFIG["optflags"] = "-O0"
  CONFIG["debugflags"] = "-ggdb3"
end

if ENV['NOKOLEXBOR_ASAN']
  $LDFLAGS << " -fsanitize=address"
  $CFLAGS << " -fsanitize=address -DNOKOLEXBOR_ASAN"
end

It is recommended to compile with -O0 -ggdb3 when enabling ASAN, so that the report reveals as much information as possible when a memory leak is detected.

Note that we should use the standard memory functions malloc, realloc, calloc and free instead of ruby_xmalloc, ruby_xrealloc, ruby_xcalloc and ruby_xfree. With the latter, the call stack is not shown correctly in the leak reports, which makes them useless for analysis. This is why we define NOKOLEXBOR_ASAN above: it lets us choose which memory functions to use at compile time:

#ifndef NOKOLEXBOR_ASAN
  lexbor_memory_setup(ruby_xmalloc, ruby_xrealloc, ruby_xcalloc, ruby_xfree);
#else
  lexbor_memory_setup(malloc, realloc, calloc, free);
#endif

Now, just compile with:

NOKOLEXBOR_ASAN=1 rake compile

Detection

ASAN checks for memory leaks when the target program shuts down, looking for allocated memory blocks that were never freed. Your test program should therefore cover as many code paths as possible. Here, we reuse our existing tests, which are run by rake test.

The ASAN runtime has to be loaded manually into the ruby process before our code runs. This can be done by setting the environment variable LD_PRELOAD=/path/to/libasan.so, where the path of libasan.so can be retrieved with gcc -print-file-name=libasan.so. The launch command looks like this:

LD_PRELOAD=/path/to/libasan.so /path/to/ruby -Ilib -rnokolexbor /path/to/some_spec.rb

Note that /path/to/ruby should refer to the actual ruby binary, not a wrapper script created by rvm or rbenv.

When running the above command, you will find that ASAN reports memory leaks even if you pass an empty Ruby file. This is because Ruby itself does not free all of its memory during shutdown, which results in false positive reports. You will probably see output like this:
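As a minimal sketch of how the launch command could be assembled from Ruby itself: RbConfig.ruby typically returns the path of the interpreter binary that is currently running, which avoids hard-coding /path/to/ruby. The helper name asan_launch and the spec path below are hypothetical, and the libasan.so path is passed in as a placeholder rather than looked up with gcc.

```ruby
require 'rbconfig'
require 'shellwords'

# Hypothetical helper: builds the environment and argv for running one spec
# file under ASAN. `asan_so` is passed in by the caller; in a real setup it
# would come from `gcc -print-file-name=libasan.so`.
def asan_launch(spec_path, asan_so:, ruby: RbConfig.ruby)
  env  = { 'LD_PRELOAD' => asan_so }
  argv = [ruby, '-Ilib', '-rnokolexbor', spec_path]
  [env, argv]
end

env, argv = asan_launch('spec/some_spec.rb', asan_so: '/path/to/libasan.so')

# Print the equivalent shell command for inspection.
puts "LD_PRELOAD=#{env['LD_PRELOAD']} #{Shellwords.join(argv)}"
```

To actually run the spec, the pair can be handed to system(env, *argv), which sets LD_PRELOAD only for the child process.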

==3930==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 956376 byte(s) in 7481 object(s) allocated from:
    #0 0x7fe00f23aa06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
    #1 0x7fe00ee54929 in calloc1 /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:1583
    #2 0x7fe00ee54929 in objspace_xcalloc /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:10113
    #3 0x7fe00ee54929 in ruby_xcalloc_body /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:10120
    #4 0x7fe00ee54929 in ruby_xcalloc /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:12004

Direct leak of 247663 byte(s) in 2055 object(s) allocated from:
    #0 0x7fe00f23a808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x7fe00ee54764 in objspace_xmalloc0 /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:9861
    #2 0x7fe00ee54764 in ruby_xmalloc2_body /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:10104
    #3 0x7fe00ee54764 in ruby_xmalloc2 /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:11994

Direct leak of 108437 byte(s) in 1150 object(s) allocated from:
    #0 0x7fe00f23ac3e in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:163
    #1 0x7fe00ee54ff1 in objspace_xrealloc /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:9932
    #2 0x7fe00ee54ff1 in ruby_sized_xrealloc2 /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:10149
    #3 0x7fe00ee54ff1 in ruby_xrealloc2_body /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:10155
    #4 0x7fe00ee54ff1 in ruby_xrealloc2 /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/gc.c:12024

Direct leak of 73920 byte(s) in 165 object(s) allocated from:
    #0 0x7fe00f23a808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x7fe00ef36d43 in onig_new_with_source /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/re.c:841
    #2 0x7fe00ef36d43 in make_regexp /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/re.c:871
    #3 0x7fe00ef36d43 in rb_reg_initialize /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/re.c:2836

To get rid of those false positives, we can create a rake task for testing that takes care of all the settings mentioned above and filters the final output, keeping only the blocks that contain paths from your project (in our case, lexbor):

require 'open3'          # Open3.capture3 is used below
require 'rake/testtask'  # provides Rake::TestTask

class ASanTestTask < Rake::TestTask
  def filter_leak_message(output)
    # Discard ruby only leaks (false positives)
    results = output.scan(/(?:Direct|Indirect).*?\n\n/m).select { |r| r.include? 'lexbor' }
    results.join
  end

  def ruby(*args, **options, &block)
    asan_so = `gcc -print-file-name=libasan.so`.strip
    env = {"LD_PRELOAD" => asan_so}
    if args.length > 1
      stdout, stderr, status = Open3.capture3(env, FileUtils::RUBY, *args, **options, &block)
    else
      stdout, stderr, status = Open3.capture3(env, "#{FileUtils::RUBY} #{args.first}", **options, &block)
    end

    puts stdout
    unless (leaks = filter_leak_message(stderr)).empty?
      puts
      puts leaks
      yield false, status
    end
  end
end

namespace :test do
  ASanTestTask.new('asan') do |t|
    t.libs << 'spec'
    t.pattern = 'spec/**/*_spec.rb'
  end
end
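The filtering step can also be tried on its own, without running the whole test suite. The sketch below pulls the same regular expression into a standalone function (the name filter_leak_blocks is hypothetical) and applies it to an abbreviated, made-up report in the LeakSanitizer format; the stack frames are shortened for illustration.

```ruby
# Standalone version of the filtering idea: split the report into
# "Direct leak ..." / "Indirect leak ..." blocks (each ends with a blank line)
# and keep only the blocks that mention `needle`.
def filter_leak_blocks(output, needle)
  output.scan(/(?:Direct|Indirect).*?\n\n/m)
        .select { |block| block.include?(needle) }
        .join
end

# Abbreviated sample report, for illustration only.
SAMPLE_REPORT = <<~'REPORT'
  ==3930==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 956376 byte(s) in 7481 object(s) allocated from:
      #0 0x7fe00f23aa06 in __interceptor_calloc
      #1 0x7fe00ee54929 in ruby_xcalloc gc.c:12004

  Direct leak of 3168 byte(s) in 132 object(s) allocated from:
      #0 0x7f5d01c86a06 in __interceptor_calloc
      #1 0x7f5cfa550955 in lexbor_calloc memory.c:29

  SUMMARY: AddressSanitizer: 959544 byte(s) leaked in 7613 allocation(s).
REPORT

filtered = filter_leak_blocks(SAMPLE_REPORT, 'lexbor')
puts filtered
```

Only the second block survives, because it is the only one whose stack mentions lexbor. Note that the regular expression relies on every block being followed by a blank line, so a block that ends the output without one would be dropped.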

Now, simply run rake test:asan. It will run all the tests just like rake test, and the output will only include the leak information related to your code.

# Running:

.........................................................................................................................................................

Finished in 0.124536s, 1228.5576 runs/s, 2505.2939 assertions/s.
153 runs, 312 assertions, 0 failures, 0 errors, 0 skips

Direct leak of 3168 byte(s) in 132 object(s) allocated from:
    #0 0x7f5d01c86a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
    #1 0x7f5cfa550955 in lexbor_calloc ../../../../vendor/lexbor/source/lexbor/ports/posix/lexbor/core/memory.c:29
    #2 0x7f5cfa54a1b8 in lexbor_array_create ../../../../vendor/lexbor/source/lexbor/core/array.c:13
    #3 0x7f5cfa4d53b8 in nl_node_at_css ../../../../ext/nokolexbor/nl_node.c:336
    #4 0x7f5d01a29ea0 in vm_call_cfunc_with_frame /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/vm_insnhelper.c:2514
    #5 0x7f5d01a29ea0 in vm_call_cfunc /tmp/ruby-build.20220826185459.18027.EBv81g/ruby-2.7.2/vm_insnhelper.c:2539

In this case, ext/nokolexbor/nl_node.c:336 is causing the memory leak.
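Reading the report by eye works for a single leak. For longer reports, the first frame that belongs to your own extension could be picked out programmatically. The following is a rough sketch, assuming frames follow the "#N 0xADDR in FUNCTION FILE:LINE" layout shown above; the helper name first_project_frame is hypothetical.

```ruby
# Hypothetical helper: returns the first stack frame whose file path contains
# `project_dir`, or nil if no such frame exists in the block.
def first_project_frame(leak_block, project_dir)
  leak_block.each_line do |line|
    # Frame layout assumed: "#3 0x7f5c... in nl_node_at_css path/to/file.c:336"
    match = line.match(/#\d+\s+0x\h+\s+in\s+(\S+)\s+(\S+):(\d+)/)
    next unless match && match[2].include?(project_dir)

    return { function: match[1], file: match[2], line: match[3].to_i }
  end
  nil
end

# Leak block copied from the report above (frames #0 to #3).
LEAK_BLOCK = <<~'REPORT'
  Direct leak of 3168 byte(s) in 132 object(s) allocated from:
      #0 0x7f5d01c86a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
      #1 0x7f5cfa550955 in lexbor_calloc ../../../../vendor/lexbor/source/lexbor/ports/posix/lexbor/core/memory.c:29
      #2 0x7f5cfa54a1b8 in lexbor_array_create ../../../../vendor/lexbor/source/lexbor/core/array.c:13
      #3 0x7f5cfa4d53b8 in nl_node_at_css ../../../../ext/nokolexbor/nl_node.c:336
REPORT

frame = first_project_frame(LEAK_BLOCK, 'ext/nokolexbor')
puts "#{frame[:file]}:#{frame[:line]} (#{frame[:function]})" if frame
```

Matching on ext/nokolexbor rather than lexbor skips the frames inside the vendored library and lands on the call site in the extension, which is usually where the missing free belongs.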

That's all for this tutorial. Thanks for reading.