Tracking Down Memory Leaks in Python


This Page Mostly Obsolete

Most of the stuff on this page is obsolete nowadays, since Python (since version 2.0, I think) now includes a cyclical-reference garbage collector. Leaks like this are still possible, but usually only because an older Python extension module (implementing a container type) is used, one that doesn't adhere to the new GC API.


Long-running processes have a nasty habit of exposing Python's Achilles' Heel: Memory Leaks created by cycles. (objects that point at each other, either directly or circuitously). Reference counting cannot collect cycles. Here's one way to create a cycle:

class thing:
    pass

                      refcount(a)        refcount(b)
a = thing()               1
b = thing()               1                  1
a.other = b               1                  2
b.other = a               2                  2

del a                     1                  2
del b                     1                  1

    

Objects a and b have become immortal.

Large and complex systems may create non-obvious cycles. Here are a few quick hints to avoid various ones that I've run into:

Notes

An interface to malloc_stats() [Linux]
Samual M. Rushing
Last modified: Tue May 2 15:45:00 PDT 2006