Memory Leaks
G3ck0G33k writes: "Is there any free software version/clone of Rational's programs PureCoverage and/or Purify? I have worked with both of them on fairly large projects (>150,000 lines of code) and they were great to work with. When the first runs of Purify found nearly fifty instances of minor memory leaks, I was deeply frustrated/impressed. A free (perhaps GPLd) clone would be so interesting; Rational's licensing is killing my current budget. Of course, the more kinds of leaks it may detect, the better. GeckoGeek" We had a similar question last year but there's no harm in seeing what the current answers are.
Bounded pointers, etc. (Score:5, Informative)
Bad news: You'll have to build your own gcc (Greg's changes haven't yet been accepted into the gcc trunk), and all your libraries (just as Purify re-writes all your libraries).
Good news: The resulting code is much faster than Purify'ed code, and finds some problems Purify doesn't. I know of a major software development effort (hundreds of developers, millions of lines of code; sorry, can't give details) that uses bounded pointers to great advantage.
Other tools: GNU Checker, dbmalloc, Bruce Perens' Electric Fence, MemProf, mpatrol, and Mprof; Google searches will turn them all up.
Re:Bounded pointers, etc. (Score:1, Interesting)
Re:Bounded pointers, etc. (Score:1)
http://public.support.unisys.com/aseries/docs/HMPN X05_SSR461_SSP4/PDF/70126610.PDF
Unfortunately, that document is quite dense (and you're going to have to remove the lameness filter modifications). The A-Series actually uses a structure called an ASD (actual segment descriptor) to store information about the base address, length and type of data in the array, among other things. Of course, the processor can take a look at that data in parallel with accessing data in the array (and throw an exception before committing any changed data), so it has almost no performance cost (aside from reading the ASD, which is probably on par with the cost of loading the array length into a register at the beginning of a loop).
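For readers who have not met descriptors before, here is a rough software analogue of the check described above. It is only a sketch: the names int_descriptor, checked_store and checked_load are made up for illustration, and it does not claim to show how the A-Series hardware or the gcc bounded-pointer work actually does it. The point is just that every access carries the base and length along, and an out-of-range access is refused before anything is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Software stand-in for a descriptor: where the array lives and how long it is. */
struct int_descriptor {
    int *base;
    size_t length;   /* number of elements, not bytes */
};

/* Store through the descriptor; refuse, without writing, if index is out of range. */
static bool checked_store(struct int_descriptor d, size_t index, int value)
{
    if (index >= d.length)
        return false;
    d.base[index] = value;
    return true;
}

/* Load through the descriptor; *out is written only for a valid index. */
static bool checked_load(struct int_descriptor d, size_t index, int *out)
{
    assert(out != NULL);
    if (index >= d.length)
        return false;
    *out = d.base[index];
    return true;
}
```

In software the comparison costs an instruction or two per access; the post's point is that hardware can do the same comparison in parallel with the access itself.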
More food for thought: the architecture also has additional "tag" bits on every data word. These give some primitive type information (e.g. code, single-precision real, array element, ASD, etc.). The processor will not allow a program to arbitrarily change data in a code segment, or things such as return addresses on your stack. I don't know if there are any other machines around today that still have this attribute (if anyone knows of some, please post!). For example, it makes a lot of the recent buffer overflow attacks that we see a moot point, since a string transfer operator would not be allowed to touch the stack frame!
Memory leak detection (Score:5, Informative)
The Boehm-Weiser garbage-collecting malloc() can be built in a leak-detection mode. Every time an object is leaked, it prints out the address of the memory in question. Do that. Then it's 15 lines of python to correlate that back with the malloc() calls; I wrapped malloc/realloc to print out the line number and filename, e.g.
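#include <stdio.h>
#include <gc.h>

/* Allocate through the collector and log where the request came from. */
void *our_malloc(size_t howbig, int line, const char *file)
{
    void *p = GC_malloc(howbig);
    /* One conversion per argument; the format matches the output shown below. */
    fprintf(stderr, "Line %d of %s/(): %p\n", line, file, p);
    return p;
}
#define malloc(x) our_malloc(x, __LINE__, __FILE__)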
with similar for realloc (and make free do GC_free).
Then run the proggy, redirecting stderr through a simple python script:
import sys

allocs = {}  # address -> most recent allocation log line
for line in sys.stdin:
    line = line.strip()
    start = line.find("0x")
    if start < 0:
        continue  # no address on this line
    num = line[start:].split(" ")[0]
    if line.startswith("Line "):
        allocs[num] = line  # logged by our_malloc
    elif num in allocs:
        print("Leaked object: " + allocs[num])
When I run my program this way I get the following output:
Leaked object: Line 43 of leak_stuff.c/(): 0x806efe0
Leaked object: Line 43 of leak_stuff.c/(): 0x806eff0
Leaked object: Line 55 of leak_stuff.c/(): 0x806dfd8
That tells me which lines made the initial allocations of the leaked objects.
The garbage-collecting malloc is really cool; it's at:
http://www.hpl.hp.com/personal/Hans_Boehm/gc/
for now, but rumor has it that gcc will become the official source for it at some point (it's needed for the Java compiler).
Sumner
Re:Memory leak detection (Score:1)
This is all fine and dandy, but what about closed source, stripped 3rd party libraries?
I'm using closed source libraries in a multi-million line project, and I think they have a memory leak.
I can't wrap malloc in their code, 'cos I don't have it. I call functions like FMLAdd() and it all happens magically.
Re:Memory leak detection (Score:1)
You can use LD_PRELOAD to wrap malloc, assuming they're dynamically linked against libc (almost definitely). If they use GNU libc and don't dynamically link, they're required by the LGPL to distribute object files so you can relink against your own libc.
If you don't have the source, fixing a leak is tough, but you can rebuild the garbage-collecting malloc in a redirect mode so their app uses it instead of libc's malloc, then LD_PRELOAD it. I used to do this with netscape-communicator back when it leaked like mad; it worked great, though as I mentioned there is a chance that gcc's optimizations could confuse the gc. In practice it seemed to work okay. For any app where a very rare crash isn't the end of the world (netscape crashed all the time anyway) and which is already leaking, it's worth a try.
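To make the LD_PRELOAD idea concrete, here is a minimal sketch of a malloc/free logging shim, assuming Linux with glibc. It is not the garbage-collecting malloc described above, just plain interposition: the names shim_resolve, shim_log and shim_malloc_calls are invented for the example, calloc and realloc are deliberately left alone, and the counter is not thread-safe.

```c
#define _GNU_SOURCE             /* exposes RTLD_NEXT on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* dlsym() returns void *, so it must be able to carry a function address. */
static_assert(sizeof(void *) == sizeof(void (*)(void)),
              "dlsym result must fit in a function pointer");

static void *(*real_malloc)(size_t);
static void (*real_free)(void *);
static unsigned long shim_malloc_calls;   /* hypothetical counter, not atomic */

/* Look up the next malloc/free in link order, normally libc's. */
static void shim_resolve(void)
{
    real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    real_free = (void (*)(void *))dlsym(RTLD_NEXT, "free");
    if (!real_malloc || !real_free)
        abort();                /* stdio might allocate, so just stop */
}

/* Format into a stack buffer and use write(2), avoiding stdio streams. */
static void shim_log(const char *what, void *p, size_t n)
{
    char buf[96];
    int len = snprintf(buf, sizeof buf, "%s %p %zu\n", what, p, n);
    if (len > 0) {
        size_t out = (size_t)len < sizeof buf ? (size_t)len : sizeof buf - 1;
        if (write(STDERR_FILENO, buf, out) < 0) {
            /* nothing sensible to do if stderr is closed */
        }
    }
}

/* Replacement malloc: forward to the real one, then log address and size. */
void *malloc(size_t n)
{
    if (!real_malloc)
        shim_resolve();
    void *p = real_malloc(n);
    shim_malloc_calls++;
    shim_log("malloc", p, n);
    return p;
}

/* Replacement free: log non-null pointers, then forward to the real free. */
void free(void *p)
{
    if (!real_free)
        shim_resolve();
    if (p)
        shim_log("free", p, 0);
    real_free(p);
}
```

Built as a shared object (something like gcc -shared -fPIC -o libshim.so shim.c -ldl, file names hypothetical) and run with LD_PRELOAD=./libshim.so, it should log every malloc and free the dynamically linked app makes, third-party libraries included. Because calloc and realloc are not wrapped, the log will show frees for addresses it never saw allocated; a fuller shim would cover those too.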
Sumner
Re:Memory leak detection (Score:2)
Re:Memory leak detection (Score:3, Informative)
C compilers may not hide pointers in the generated object code. In our experience, standard commercial compilers obey this restriction in unoptimized code. Most aggressive optimizing compilers do not obey this restriction for all optimized code. For details and examples see papers/pldi96.ps.gz. However, it is difficult to construct examples for which they violate it, especially for single-threaded code. In our experience, the only examples we have found of a failure with the current collector, even in multi-threaded code, were contrived.
However, the gcc developers claim that gcc does in fact violate this constraint. So using Boehm gc with gcc may not be safe in production code. The gcc mailing list has had a couple of threads on how to make gcc garbage-collector friendly in the future (once again, Java is one impetus for this). Until then, I'd stick to manual mm and use the gc only to help find leaks.
Sumner
Re:Memory leak detection (Score:1)
(I finish things at work to my manager's satisfaction. At home, I finish things to my satisfaction, and I'm never satisfied.)
Re:Memory leak detection (Score:4, Informative)
license.
Compiles and runs out of the box on an Alpha running Linux.
GUI? uh no. It has a nifty command line utility to control logging etc...
ccmalloc (Score:1, Informative)
Here's a quick and easy solution (Score:2, Informative)
dmalloc for memory debugging (Score:3, Informative)
I like dmalloc [dmalloc.com] for memory debugging. It even found a memory bug in a program that Purify choked on. It doesn't have a GUI.
use C++ (Score:2, Interesting)
I switched from C to C++ basically because I couldn't get Purify for Linux. C++ has allowed me to adopt clear, well-defined memory management strategies and automate various pointer checks. I hardly ever get memory leaks or pointer errors in my C++ code anymore.
But no matter what you do in your own code, if you are using C or C++, you will always be exposed to numerous pointer bugs and leaks in library code. Most real-world C++ code commits the same memory allocation sins and has the same pointer bugs as real-world C code--people aren't taking sufficient advantage of C++'s smart pointer facilities (even STL is flawed in that way). Therefore, for multiprogrammer projects, I wouldn't use anything but Java or another safe language anymore.
Probably a bit late for existing projects but... (Score:2, Interesting)
get_mem(ptr, size, "widget hash table")
When debugging, get_mem keeps track of all allocations. Just before the program shuts down, the heap dump routine is called; it lists all outstanding memory blocks along with their debug strings so you can see where they were allocated.
It's also often practical to call the dump routine at various points within the program and give the output a quick look-over or diff - it's amusing how often you can nip these problems in the bud this way.
Also, if you get really desperate, change the get_mem routine to increment a global counter and tag its value onto the end of each allocation info block. If you keep a program debug log and log each allocation, it is easy to see where a loose block was allocated: grab the unique ID from the dump and search the log file for it.
A handy feature of this trick is that get_mem is a #define, so when you go to production you simply define it as malloc and throw the debug string away: no speed or size cost in the running program. In addition, it basically costs nothing except an hour or so to set it up in the first place. The catch is you have to use it religiously from the start of your project.
A really simple trick, but it has saved me so much work!
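For anyone who wants to try it, here is one way the trick might look. This is a sketch, not the poster's actual code: only get_mem comes from the post, while free_mem, mem_dump, mem_debug_alloc, mem_debug_free and the MEM_RELEASE switch are names I made up. It keeps a side list of live blocks with their debug string and a running ID, and collapses to plain malloc/free when MEM_RELEASE is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef MEM_RELEASE
/* Production: the debug string is thrown away, no tracking cost. */
#define get_mem(ptr, size, tag) ((ptr) = malloc(size))
#define free_mem(ptr)           free(ptr)
#define mem_dump(out)           ((size_t)0)
#else
/* One record per live block, kept on a singly linked list. */
struct mem_record {
    void *ptr;
    size_t size;
    unsigned long id;           /* unique, from the global counter */
    const char *tag;            /* the debug string passed to get_mem */
    struct mem_record *next;
};

static struct mem_record *mem_live;
static unsigned long mem_next_id;

static void *mem_debug_alloc(size_t size, const char *tag)
{
    void *p = malloc(size);
    struct mem_record *r = p ? malloc(sizeof *r) : NULL;
    if (!r) {                   /* either allocation failed */
        free(p);
        return NULL;
    }
    r->ptr = p;
    r->size = size;
    r->id = ++mem_next_id;
    r->tag = tag;
    r->next = mem_live;
    mem_live = r;
    return p;
}

static void mem_debug_free(void *p)
{
    struct mem_record **link;
    if (!p)
        return;
    /* Linear search is slow but fine for a debug build. */
    for (link = &mem_live; *link; link = &(*link)->next) {
        if ((*link)->ptr == p) {
            struct mem_record *dead = *link;
            *link = dead->next;
            free(dead);
            free(p);
            return;
        }
    }
    assert(0 && "free_mem: pointer did not come from get_mem");
}

/* List every outstanding block; returns how many there are. */
static size_t mem_dump(FILE *out)
{
    size_t n = 0;
    const struct mem_record *r;
    for (r = mem_live; r; r = r->next, n++)
        fprintf(out, "leak #%lu: %zu bytes, \"%s\"\n", r->id, r->size, r->tag);
    return n;
}

#define get_mem(ptr, size, tag) ((ptr) = mem_debug_alloc((size), (tag)))
#define free_mem(ptr)           mem_debug_free(ptr)
#endif
```

Call mem_dump(stderr) just before exit, or at checkpoints inside the program, and diff the output between runs. The ID in each line is the global counter from the post, so it can be matched against a debug log if you also log each allocation.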
Roll your own (Score:2, Interesting)
Re:Roll your own (Score:2, Funny)
Your kung fu is no good, Anonymous Coward.
Free Beer? (Score:2, Interesting)
A free (perhaps GPLd) clone would be so interesting; Rational's licensing is killing my current budget.
Maybe you should put a developer or two on that project and see how long it takes them to build something similar. I think Purify runs about $1,500 now (could be wrong). That's what, two Aeron chairs? That shouldn't kill any real company's budget. NuMega's BoundsChecker is a viable cheaper alternative though. Or just rip off the free trial versions.
When I've seen Purify bought, a developer downloaded the trial and built a list of all the problems he found and fixed using it. When he showed his manager how much pain and suffering the product could save it was an easy sell. (The hardest part was countering the "so everything's fixed already?" mentality.)
MEMPROF (Score:3, Informative)
It's not Purify (it really aims for leak detection, not all the other errors Purify finds), but the efence + memprof combination gets you about 85% of Purify's functionality.
It seems to handle threaded apps reasonably well, and C++ doesn't faze it. The only downside is that it's hard to get running on non-x86 platforms.
mpatrol works great.... (Score:1)
I asked REdhat once.. (Score:1, Redundant)
But to answer the question are there any out there? NO, not with pretty GUIs and all.
Re:I asked REdhat once.. (Score:2, Informative)
Checker (Score:2, Informative)
mpatrol (Score:2, Informative)
It can:
- log your memory usage
- report on improper memory usage
- profile your memory usage
- work with your applications *without* re-linking (assuming your OS allows this)
The web page is at:
http://www.cbmamiga.demon.co.uk/mpatrol/
In addition, the author has excellent documentation. The pdf manual actually has a section that lists competing products and what they do.
http://www.cbmamiga.demon.co.uk/mpatrol/files/m