Memory leaks


It can be difficult to track down memory leaks in Python without using the debug build of Python. The code below shows a hack that works in many circumstances without resorting to a debug build.

This is some code from Chris Siebenmann. It is an imperfect trick: in certain situations it can miss objects involved in a leak, and the function is slow. If you call it in a loop around a significant leak, it can appear to lock up. It's best to single-step through the loop, watching a reference count.
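One way to watch the reference count on a specific object while stepping through a loop is sys.getrefcount, which works in a normal (non-debug) build. A minimal sketch; the variable names here are only illustrative:

```python
import sys

suspect = object()
baseline = sys.getrefcount(suspect)  # includes the temporary reference made by the call itself

alias = suspect                      # take one more reference
assert sys.getrefcount(suspect) == baseline + 1

del alias                            # release it again
assert sys.getrefcount(suspect) == baseline
```

(For a whole-interpreter total, sys.gettotalrefcount exists only in the debug build mentioned above.)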

I added a little bit to this: I call gc.collect() manually to force a garbage collection before each count; otherwise it can be hard to tell the valid objects from the garbage ones that have not been collected yet.
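A minimal sketch of why the gc.collect() matters, using gc.get_objects() as a stand-in for get_all_objects() so the snippet runs on its own: an unreachable reference cycle still shows up in the count until a collection runs.

```python
import gc

def make_cycle():
    a, b = [], []
    a.append(b)   # a and b reference each other,
    b.append(a)   # so neither is freed by reference counting alone

gc.collect()                    # start from a clean slate
make_cycle()                    # the cycle is now unreachable garbage...
dirty = len(gc.get_objects())   # ...but it is still counted
gc.collect()                    # force collection, as described above
clean = len(gc.get_objects())
assert clean < dirty            # the collect removed the cyclic garbage
```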

Usually I call this in an isolated loop around code I suspect of leaking. Typically I assert that the number of objects does not grow after the call to the suspect function. If the count does grow, I know I have a leak.

while True:
    print("Number of objects before:", len(get_all_objects()))
    suspect_function()  # placeholder for the call being tested for a leak
    print("Number of objects after:", len(get_all_objects()))

Here is Chris' slightly modified code:

import gc
# Recursively expand slist's objects
# into olist, using seen to track
# already processed objects.
def _getr(slist, olist, seen):
    for e in slist:
        if id(e) in seen:
            continue
        seen[id(e)] = None
        olist.append(e)
        tl = gc.get_referents(e)
        if tl:
            _getr(tl, olist, seen)
# The public function.
def get_all_objects():
    """Return a list of all live Python objects, not including the list itself."""
    gcl = gc.get_objects()
    olist = []
    seen = {}
    # Just in case:
    seen[id(gcl)] = None
    seen[id(olist)] = None
    seen[id(seen)] = None
    # _getr does the real work.
    _getr(gcl, olist, seen)
    return olist
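To see what the traversal actually collects, here is a sketch that seeds a self-contained copy of _getr with a single root object instead of the whole interpreter graph. Seeding a single root is an illustrative use, not part of the original code, but it shows why the code follows gc.get_referents(): the traversal reaches atomic objects such as strings and ints that gc.get_objects() alone does not report.

```python
import gc

# Self-contained copy of _getr for this demo.
def _getr(slist, olist, seen):
    for e in slist:
        if id(e) in seen:
            continue
        seen[id(e)] = None
        olist.append(e)
        tl = gc.get_referents(e)
        if tl:
            _getr(tl, olist, seen)

root = {"name": "leak-hunt", "items": [1, 2, 3]}
olist, seen = [], {}
_getr([root], olist, seen)
# The traversal reaches the dict, its keys and values, the inner
# list, and the ints it contains.
assert any(e is root["items"] for e in olist)
assert "leak-hunt" in [e for e in olist if isinstance(e, str)]
```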