md5sum

Cathey, Jim jcathey at ciena.com
Fri Jun 19 17:19:03 UTC 2009


>IMHO the right way is to start thinking about a parallel algorithm
>using the MP option of gcc. The example above has two potential bottlenecks
>1. the read-from-media  2. calc md5
>depending on your system you may have the one or other or both.

Running "md5sum -cs" the first time takes 14 seconds, the second
time 3.5 seconds.  That's a huge I/O difference that the cache
obviously alleviates, though the cache is completely useless
for a startup integrity check.  (The filesystems are JFFS2 over MTD.)
Running "md5sum -csj2" the first time takes 9 seconds, the second
time 2.2 seconds.  The -j3 numbers are almost exactly the same, not
faster, and with -j4 the times start to go up again.  From that I
deduce that the implementation is already pretty efficient at pegging
the CPUs.  The -j option just bought me 5 seconds, which is nice,
but it still represents a net loss of 9 seconds to me since we
currently aren't checking any files for integrity!
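
For reference, the arithmetic behind those timings can be sketched as
below.  The two-core count is an assumption (the post only implies it),
and speedup() and efficiency() are illustrative helper names, not part
of any tool.

```python
# Timings quoted above, in seconds: (plain "md5sum -cs", "md5sum -csj2").
TIMINGS = {"cold cache": (14.0, 9.0), "warm cache": (3.5, 2.2)}
CORES = 2  # assumption: the core count is not stated outright in the post


def speedup(serial, parallel):
    """How many times faster the parallel run finished."""
    return serial / parallel


def efficiency(serial, parallel, cores=CORES):
    """Fraction of the ideal (cores-times) speedup actually achieved."""
    return speedup(serial, parallel) / cores


for label, (serial, parallel) in TIMINGS.items():
    print(f"{label}: {speedup(serial, parallel):.2f}x, "
          f"{efficiency(serial, parallel):.0%} of ideal")
```

Both cases come out at roughly 1.6x, which is consistent with the CPUs
being kept busy whether or not the data is already cached.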

It seems to me that even if the compiler were somehow able to magically
parallelize the read/calculate functions, with -jN already keeping the
cores busy, at _best_ it could only cut the execution time (for me) by
half (1/#cores) of the duration of the last file's check, measured from
the point where it became the lone checking process.  That's not likely
to be much help unless the file manifest happened to save a mongo file
until near the end.
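
A toy model of that argument, assuming files are handed to whichever
checker frees up first (an assumption about how a -jN style dispatcher
behaves, not something confirmed here).  schedule(), best_case_saving()
and the per-file durations are made up for illustration.

```python
import heapq


def schedule(durations, workers):
    """Greedy dispatch: each file goes to the worker that frees up first.

    Returns (makespan, lone_tail): the total wall time, and how long the
    last worker kept running after every other worker had finished.
    """
    finish = [0.0] * workers          # time at which each worker is free
    heapq.heapify(finish)
    for d in durations:
        start = heapq.heappop(finish)  # earliest available worker
        heapq.heappush(finish, start + d)
    ends = sorted(finish)
    makespan = ends[-1]
    lone_tail = makespan - ends[-2] if workers > 1 else makespan
    return makespan, lone_tail


def best_case_saving(lone_tail, cores):
    """Time saved if the lone tail were split perfectly across all cores."""
    return lone_tail * (1.0 - 1.0 / cores)


# Same four hypothetical files, big one last versus big one first.
for manifest in ([1.0, 1.0, 1.0, 4.0], [4.0, 1.0, 1.0, 1.0]):
    makespan, tail = schedule(manifest, workers=2)
    print(manifest, makespan, tail, best_case_saving(tail, cores=2))
```

With the big file last, the lone tail is 3 of the 5 time units, so a
perfectly parallel checker could save 1.5.  With it first, the tail is
only 1 of 4 units and the possible saving drops to 0.5.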

Or do I misread the situation?  I think the -jN option is generally
more useful, for the following reasons:

1) It's easy to implement, and easy to understand.  Don't discount!
2) The code space impact is small, whereas I'd assume that a massively
   parallelized checker would be quite a bit larger.  (Though -jN
   probably does use a lot more RAM at run-time, due to the forks.
   That's not an issue for how/where I'm using it: consuming more
   flash would be unwelcome, but the RAM is barely warm yet.)
3) It seems unlikely to me that magic gcc could find all that much
   to parallelize at once, whereas -j8 on a suitable system will
   run in 1/8 the time (not really, but you get the point).
4) I get some control over how much of the system I wish to
   dedicate to the md5sum check, maybe by using -j4 on that mythical
   octo-core system of the prior case.  (Though maybe using nice would
   compensate enough the other way.)
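
To show how little there is to the per-file approach, here is a minimal
sketch of a manifest check spread over N workers.  It is not the busybox
code: it is Python rather than C, and it uses a thread pool as a
stand-in for the forked processes discussed above.  check_manifest(),
md5_of() and the jobs parameter are hypothetical names.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_of(path, chunk=64 * 1024):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def check_manifest(lines, jobs=2):
    """Verify md5sum-style lines; return the paths that failed, in order."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # md5sum line: 32 hex digits, a space, a mode char (' ' or '*'), name.
        entries.append((line[:32].lower(), line[34:]))

    def check(entry):
        digest, path = entry
        try:
            return None if md5_of(path) == digest else path
        except OSError:
            return path  # an unreadable or missing file counts as a failure

    # "jobs" caps how many files are checked at once, much like -jN.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return [p for p in pool.map(check, entries) if p is not None]
```

Something like check_manifest(open("manifest.md5"), jobs=2) would return
an empty list when every file matches (manifest.md5 being a placeholder
name).  Capping jobs below the core count is the knob point 4 refers to.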

>"Premature optimization is the root of all evil." (Donald Knuth)

Indeed.  And never optimizing (the MS way) isn't far behind!

>just for my curiosity did you try to build your busybox -mtune=native ?

Whazzat?  Why wouldn't that be the default in the build?

-- Jim