Significant performance problem with modprobe

Timo Teräs timo.teras at iki.fi
Tue Jun 14 12:25:41 UTC 2011


On 06/14/2011 03:12 PM, Ed W wrote:
> Hi
> 
>>> I do see that module-init-tools appears to do some seeks and so perhaps
>>> doesn't read in the entire file, but my instinct is still to wonder
>>> whether it's the single character read() calls which might be the
>>> majority of the problem?
>>
>> Did you have a strace on these?
> 
> Please see the email from me a few up in this thread (Denys also quoted
> the strace bits in his email, which is part of this thread).  I can repost
> the strace if it's not easy to find?

Oh, right. Found it. Denys basically said in the beginning what I just said.

We need to implement the binary file format support for busybox. Is that
too hard to believe?

>> Busybox modprobe uses the busybox config_read, which eventually uses
>> fgetc(). FILE * is buffered, and it will do underlying read() syscalls
>> in blocks of some kilobytes.
> 
> Hmm, true.  However, in libbb/get_line_from_file.c I see some notes
> about alternative implementations being perhaps twice as fast (no
> attribution, would need to check source control to see who wrote it?).
> 
> I should have tried to benchmark some changes here, but I got as far as
> trying to figure out where else bb_get_chunk_with_continuation was used
> (not many) and then got somewhat tied up in knots trying to reconcile
> that with the commented out suggestions in that file.  That was about the
> point when I first posted, and I concede I haven't looked at it since.
> 
> For *my* purposes I would always be using glibc or uclibc, and so my
> first thought was to switch to getline(). However, I am not sure of the
> polite way to submit a patch with a conditional on getline being available?

That won't help too much.

>> But busybox implementation *absolutely has* to read all of the file and
>> parse all of it.
>>
>> module-init-tools file format is such, that it only reads few bytes,
>> seeks once, and reads some more bytes which is small fraction of the
>> data it would have to process if it were handling the text files. That's
>> the whole idea of the large binary file: the program knows only to
>> read/parse the part it really needs which is usually under a kilobyte.
> 
> Understood - I am happy to be wrong here, but my device has:
> - Raw CF read speeds of 11MB/s+
> - 500MHz processor (Alix board with Geode processor)
> - Cached file
> And yet it's looping through that config file at an implied rate of
> 2MB/s ish?  I checked the code and it's processing while it reads, which
> may well explain the extra time, but intuitively feels a touch slow for
> dealing with a 2677 line file?
> 
> What do you think?

There is a lot of other stuff involved too. Syscalls are expensive as
well. The busybox implementation ends up doing a lot more syscalls than
module-init-tools because it reads the whole file.
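
To illustrate the difference in the amount of work, here is a rough
sketch. Note that this is NOT the real module-init-tools format: the
fixed-size records, REC_SIZE, and both function names are made up for
illustration only. The text lookup has to examine every line up to the
match, while the indexed lookup does one seek and one small read.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define REC_SIZE 32  /* hypothetical fixed record size, not the real format */

/* Text approach: scan line by line until a line starting with "key:" is
 * found. Returns the number of lines examined, or -1 if not found. */
long find_in_text(FILE *f, const char *key)
{
    char line[256];
    long examined = 0;
    size_t klen;

    assert(f != NULL && key != NULL);
    klen = strlen(key);

    rewind(f);
    while (fgets(line, sizeof(line), f) != NULL) {
        examined++;
        if (strncmp(line, key, klen) == 0 && line[klen] == ':')
            return examined;
    }
    return -1;
}

/* Indexed approach: jump straight to record number idx and read only
 * that record. out must hold REC_SIZE bytes.
 * Returns 0 on success, -1 on failure. */
int read_record(FILE *f, long idx, char *out)
{
    assert(f != NULL && out != NULL);

    if (fseek(f, idx * REC_SIZE, SEEK_SET) != 0)
        return -1;
    if (fread(out, 1, REC_SIZE, f) != REC_SIZE)
        return -1;
    return 0;
}
```

With a few thousand lines in the text file, find_in_text() touches most
of the file for an entry near the end, whereas read_record() costs the
same no matter where the entry lives.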

There might also be a performance hit from using fgetc() instead of
fgets(). This is mostly a libc implementation detail, but if
multithreading support is enabled, fgetc() might lock the stream on every
call (even if pthreads is not linked in).

You could try swapping the getc() call for getc_unlocked(), which should
be safe here (bb modprobe is single threaded). If locking was being done,
this will probably give a considerable performance gain. However, it is
still unlikely to match the module-init-tools version.
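
As a rough sketch of what the swap looks like (the function names are
made up for illustration, this is not the actual busybox code), here are
two otherwise identical read loops. The second takes the stream lock once
with flockfile() and then uses getc_unlocked() inside the loop, which is
the pattern POSIX describes for these functions:

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getc_unlocked(), flockfile() */
#include <assert.h>
#include <stdio.h>

/* Counts newlines using getc(), which may lock the stream on each call. */
long count_newlines_locked(FILE *f)
{
    long n = 0;
    int c;

    assert(f != NULL);
    while ((c = getc(f)) != EOF)
        if (c == '\n')
            n++;
    return n;
}

/* Same loop, but the stream is locked once around the whole loop and
 * each character is fetched with getc_unlocked(). */
long count_newlines_unlocked(FILE *f)
{
    long n = 0;
    int c;

    assert(f != NULL);
    flockfile(f);
    while ((c = getc_unlocked(f)) != EOF)
        if (c == '\n')
            n++;
    funlockfile(f);
    return n;
}
```

Both functions return the same count. Whether the second one is
measurably faster depends on the libc, so it is worth timing both on the
target before changing anything.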

> I'm still hoping there is some low hanging fruit here by looking at a
> more efficient line slurping algorithm?  Do others have better tools
> which can measure the time spent here?

Use:
 valgrind --tool=callgrind <your-command>
 callgrind_annotate callgrind.out.<pid>

It'll give you an instruction-count-accurate analysis of where the time
was spent.

- Timo

