memory leaking

Sat Jun 17 14:47:57 UTC 2006

On Friday 16 June 2006 8:23 pm, Erik Hovland wrote:
> The config item 'FEATURE_CLEAN_UP' says :
> ---------------------
> As a size optimization, busybox normally exits without explicitly
> freeing dynamically allocated memory or closing files.  This saves
> space since the OS will clean up for us, but it can confuse debuggers
> like valgrind, which report tons of memory and resource leaks.
>
> Don't enable this unless you have a really good reason to clean
> things up manually.
> ----------------------
>
> That is fine and I generally agree. But there is another trade off
> associated to this and that is not freeing memory could cause failure of
> applications in corner cases because it runs out of memory. I don't have
> specific examples or test cases of this. So I am trying to begin some
> sort of discussion.

That's why we only make certain frees optional.

Some frees aren't optional.  Each line of sed input has to be freed because 
you don't know how long the file you're sedding is.  Daemons have to free 
each network request they handle because they expect to run for a long time.  
For ls -lR, you'd better free directory contents because the filesystem could 
be way larger than available memory.

But for something like chroot, echo, pwd...  It's not running long enough to 
care.  Its memory usage profile is that it hits a peak early on and then uses 
whatever it has until it's ready to exit.  At which point the OS can free it.

And things like netcat, which allocate a scratch buffer and need that buffer 
as long as they're running; they can free it on the way out but the OS will 
do it for them if they don't.

> There are at least two cases where I can think this is important. One
> example is a daemon, which might be long lived.

We honestly have done this before, you know.  We've been doing it for a while.

> In this situation if an 
> exit is not in the near future a function should free any memory it
> allocates and is finished using. The other is where something reports
> over input like less or patch.

What you're looking for is unbounded input size.

Sed is an easy corner case because the command list it works on is finite in 
size but the input data isn't.  We don't know how big the sed command list 
will be, but we can't free it until we're about to exit anyway.  So that's an 
atexit() guarded by FEATURE_CLEAN_UP.  But the input data could be gigabytes, 
or sed running in a pipe on a log file that lasts weeks.  So that's 
unconditionally freed.

> Should all resource leak fixing be wrapped in
> some variation of FEATURE_CLEAN_UP? Or is there a rule of thumb
> regarding when to use this option and when not to?

The rule of thumb is "Understand what you're doing".  It always is.  Is it ok 
to not free this and let the OS do it for us, or would that potentially cause 
a problem?

> If so, do Bernhard and/or Rob prefer the #ifdef
> CONFIG_FEATURE_CLEAN_UP/#endif style or the if (ENABLE_FEATURE_CLEAN_UP)
> style? Or does it matter? Examples of both are all over the code.

The if (ENABLE_FEATURE_CLEAN_UP) style is newer, and much preferred.  Since 
ENABLE_FEATURE_CLEAN_UP is a #defined constant, that becomes if (0) and the 
compiler snips out the code for us via dead code elimination, so it doesn't 
wind up in the resulting binary.

The problem with #ifdef, in addition to being unreadable (just figuring out 
what the indentation level should be is a nightmare), also means you have 
config-specific build breaks because some code just doesn't get compiled 
unless certain features are enabled or disabled.  With the if(ENABLE) style, 
the code is always compiled and syntax checked, so if they're something wrong 
with it it should show up in any configuration.

If you need a small #ifdef, we have the USE() macros, like 
USE_FEATURE_CLEAN_UP("1234") that evaluate either to nothing if it's 
disabled, or to their argument if it's enabled.  (Those are also 
automatically generated.)

In theory the only #ifdefs we should still need are around entire functions.  
In practice, there's still some trickiness because right now the code has 
#ifdefs around the declaration of static variables, and that doesn't mix with 
the if(ENABLE) style.  I have a plan to fix that, but as maintainer fielding 
other people's patches has to take priority over writing my own, which means 
my personal TODO items get delayed a lot...

Rob
-- 
Never bet against the cheap plastic solution.