memory leaks in ubuntu: episode I, detection

leaky plumbing?

An important piece of optimizing the Ubuntu core on the Nexus 7 is slimming down Ubuntu’s memory requirements. It turns out this focus area offers plenty of opportunities to contribute, and today I’ll talk about how to find memory leaks in an individual application using valgrind.

The best part? You don’t even have to be a developer to help. The second best part? You don’t even need a Nexus 7! What I describe below works on any Ubuntu machine. Let’s get started!

The first step is to find an application to profile. This is the easiest step. Maybe you have an app you use all the time and really care about making it perform as well as possible. Or maybe you’re experiencing a strange behavior problem in an app that takes a little while to show up. Or maybe you just pick a random application from the dash because you’re in a great mood. They’re all good.

In my case, I’ll use nm-applet as my example, since I’ve been struggling with LP: #780602 for a while, where the list of wifi access points would stop displaying after a day or two. Très annoying!

Next, install valgrind if it is not already installed.

sudo apt-get install valgrind
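If you aren't sure whether valgrind is already on your machine, a quick check along these lines will tell you. This is only a sketch: `have_cmd` is a made-up helper name for this example, while `command -v` is the standard shell way to look up a program.

```shell
# Return success if the named program can be found on the PATH.
# have_cmd is a hypothetical helper name, not part of any package.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd valgrind; then
    echo "valgrind is installed"
else
    echo "valgrind is missing; run: sudo apt-get install valgrind"
fi
```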

Pay attention to the next bit because it is important. In order for your valgrind report to be as helpful as possible for developers, you will also need to install debug packages related to your app. The debug packages contain information to help developers zero in on exactly where problems might be. “Great,” you say, “what do I need to install?”

UPDATE: 29 January 2013

After a bit more thinking and discussing with smart folks like infinity, xnox, and pitti, we realized that I was essentially reinventing a lot of code that already exists in apport-retrace, as that tool already knows how to go from a binary to a package and then solve dependencies.

I tossed the idea (and a really rough crappy version of a prototype) to Kyle Nitzsche who took the idea, ran with it, and fixed all my crap! Woo hoo! With a little bit of effort, we ended up with apport-valgrind which has already landed in raring (along with the required valgrind support patch). Even better, Kyle wrote a great apport-valgrind introduction explaining how it works.

So ignore the script below and use apport-valgrind instead (unfortunately only available in raring).

Today is your lucky day because I’ve written a small script to help you figure out which debug packages you’ll need. Go ahead and grab the python version of valgrind-ubuntu-dbg-packages. (Ignore the go version for now, that’s just something I’m playing with in my other spare time!)

Ok, now comes the tricky part. We have to do a quick valgrind run to see what libraries your app uses. Then we’ll use the helper script to see if there are debug packages for those libraries. Ready?

To run valgrind, use this command:

G_SLICE=always-malloc G_DEBUG=gc-friendly valgrind -v --tool=memcheck --leak-check=full --num-callers=40 --log-file=valgrind.log --track-origins=yes <your-app>

Replace <your-app> with the name of your app.
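That command is a mouthful, so if you expect to run it more than once, you could wrap it in a small shell function. The sketch below assumes nothing beyond the options already shown; `run_memcheck` is a name invented for this example, not a real tool.

```shell
# Hypothetical wrapper: run any program under valgrind with the same
# options used in this post. run_memcheck is a made-up name.
#
# G_SLICE=always-malloc makes GLib allocate through plain malloc so
# valgrind can see each allocation; G_DEBUG=gc-friendly makes GLib
# clear memory it releases, which helps leak checkers.
run_memcheck() {
    G_SLICE=always-malloc G_DEBUG=gc-friendly \
        valgrind -v --tool=memcheck --leak-check=full --num-callers=40 \
        --log-file=valgrind.log --track-origins=yes "$@"
}
```

With that defined in your shell, `run_memcheck nm-applet` would be equivalent to typing the full command with nm-applet as the app. Note that each run overwrites valgrind.log, so move the old log aside first if you want to keep it.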

Let this run long enough for your app to launch (which may take a while under valgrind), then play with your app just a bit: exercise the area where your bug shows up, but stop short of actually reproducing it. In the case of nm-applet, I did the following sequence:

killall nm-applet	# stop earlier instances of nm-applet

G_SLICE=always-malloc G_DEBUG=gc-friendly valgrind -v --tool=memcheck --leak-check=full --num-callers=40 --log-file=valgrind.log --track-origins=yes nm-applet

Then I clicked the “More networks” menu item in the applet just to get it to display the other wifi access points, since this is the thing that was breaking for me. After doing that just once, I stopped my valgrind run completely by pressing control-c in the terminal where I launched it.

A valgrind log file should now exist, and you can run the helper script on the log (use whatever filename you saved the script under):

./valgrind-ubuntu-dbg-packages valgrind.log

You will see quite a bit of output, but at the end, you will get a list of recommended extra packages to install.

It is recommended to install the following packages:
libnss3-dbg libdbus-glib-1-2-dbg libdconf-dbg gvfs-dbg libcanberra-gtk3-module-dbg libatk1.0-dbg librsvg2-dbg libfontconfig1-dbg
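If you are curious how one could get from a log to a package list by hand, here is a rough sketch of the idea. It is not the helper script itself, and it makes two assumptions: the log was produced with `-v`, so it contains "Reading syms from" lines, and `dpkg -S` can map each file to the package that owns it. The function names are invented for this example.

```shell
# Print the unique files valgrind loaded symbols from.
# Assumes a log made with -v, with lines such as:
#   --1234-- Reading syms from /usr/bin/nm-applet
list_loaded_objects() {
    sed -n 's|^--[0-9]*-- Reading syms from \(/.*\)$|\1|p' "$1" | LC_ALL=C sort -u
}

# Print the package that owns a file, e.g. a library path from above.
# dpkg -S prints "package[:arch]: /path"; keep only the package name.
owning_package() {
    dpkg -S "$1" 2>/dev/null | cut -d: -f1 | LC_ALL=C sort -u
}
```

Running `list_loaded_objects valgrind.log` gives one path per line, and feeding each path to `owning_package` gives a package name that you can then check for a matching -dbg package. On some systems `dpkg -S` may not find a path exactly as the log spells it, so treat an empty result as "unknown" rather than "no package".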

Go ahead and install the packages.

Now we are finally ready to collect our real logs.

Update: 29 January 2013

Instead of doing all that janky stuff above, just:

  1. sudo apt-get install apport-valgrind
  2. run: apport-valgrind <executable>
  3. Do step 2 for as long as it takes to reproduce the bug. There is no step 3!

Re-run valgrind exactly as above, but this time, let the app run as long as it needs to reproduce the bug. In the case of nm-applet, I had to let it just sit there and run normally for 24 hours before I saw the bug again. Hopefully your bug reproduces faster! Patience is key. I recommend eating a delicious sandwich if you can’t think of anything better to do.

After your bug has reproduced itself, kill the valgrind run. File a bug — you can use the Ubuntu Nexus7 project — and be sure to attach the valgrind log. It would also be great if you could describe how you reproduced the bug. Be sure to read the bug filing guidelines for more detail.
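Before attaching the log, it can be worth a quick sanity check that valgrind actually got to write its leak summary, since a run that was killed too abruptly may not have one. The sketch below assumes memcheck's usual "LEAK SUMMARY" layout, where a line reads "definitely lost: N bytes in M blocks"; `definitely_lost` is a name made up for this example.

```shell
# Print the "definitely lost" figure(s) from a memcheck log, if any.
# Assumes summary lines of the form:
#   ==1234==    definitely lost: 48 bytes in 3 blocks
# Prints nothing if the log has no leak summary.
definitely_lost() {
    sed -n 's/^==[0-9]*== *definitely lost: \(.*\)$/\1/p' "$1"
}
```

If `definitely_lost valgrind.log` prints nothing, the summary is probably missing, and it may be worth repeating the run and stopping it with control-c as described earlier.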

Huzzah, you’ve contributed something extremely valuable to making Ubuntu leaner and meaner — a great log file. With any luck, a developer will be able to pick up your bug and fix the problem.

And… if we’re even luckier, maybe that developer will be you! Next time I’ll show you how to actually analyze the valgrind log. Stay tuned.