[PATCH] diff: rewrite V4. -1633 bytes

Denys Vlasenko vda.linux at googlemail.com
Mon Jan 18 04:02:49 UTC 2010


On Monday 18 January 2010 03:44, Matheus Izvekov wrote:
> From e862a7c8d14dd511478c3a1dbbe2d48118200501 Mon Sep 17 00:00:00 2001
> From: Matheus Izvekov <mizvekov at gmail.com>
> Date: Mon, 18 Jan 2010 00:32:28 -0200
> Subject: [PATCH] diff: rewrite V4. -1633 bytes
> 
> This is an almost complete rewrite of diff, and fixes various bugs, including:
> * How -b and -w flags work.
> * The exit status when -q flag is set.
> * Context is always taken from the old file, instead of a mix of both.
> * The -S flag now behaves exactly as it does in GNU diff: it does not error
>   out when it would make the list empty.
> * Fixes for a large number of memory leaks. No leaks that I am aware of remain.
> * Diffing directories ignores trailing slashes, exactly as GNU diff does.
> 
> FEATURE_DIFF_MINIMAL and FEATURE_DIFF_BINARY are removed, because
> disabling them was not giving enough savings to be worth it.
> 
> Printing of the modification time after the filename is removed, as this feature was deemed frivolous.
> There shouldn't be any compatibility issues with GNU diff, as the date was displayed in a different format anyway.
> Please address me with any concerns about this.
> 
> Performance is still worse than the original diff, but only by ~40% in the worst case now.
> A fix for excessive and redundant seeking was implemented, in a more economical and less invasive way than
> in Denys' last patch.

I don't know, after re-adding my lseek elimination code I still
see 6 sec -> 2.1 sec improvement in system time:

# time /usr/srcdevel/bbox/fix/busybox.t1/busybox diff -aurpN linux-2.6.32.3 linux-2.6.33-rc4 >z1.diff
real    0m28.434s
user    0m22.450s
sys     0m5.960s

# time /usr/srcdevel/bbox/fix/busybox.t2/busybox diff -aurpN linux-2.6.32.3 linux-2.6.33-rc4 >z2.diff
real    0m23.285s
user    0m21.158s
sys     0m2.124s

> Fixes were also committed for various problems with the directory handling code.
> Now a diff between Linux sources 2.6.32.3 and 2.6.33-rc4 produces exactly the same output as the old diff code.

Applied, thanks!

> Getting more performance is going to cost code size, so I need a decision on whether this is good enough.

This would be desirable. We may put it under a CONFIG_FEATURE_DIFF_FAST option if we want.
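
A rough sketch of what such gating could look like, assuming the usual
BusyBox convention that a CONFIG_FEATURE_xxx option is visible in C as an
ENABLE_FEATURE_xxx macro that is 0 or 1. The option does not exist yet, and
both function bodies below are placeholders rather than real diff code.

```c
#include <stddef.h>
#include <string.h>

/* In a BusyBox build this macro would come from the generated config.
 * It is defined here only so the sketch compiles on its own; the option
 * itself is hypothetical. */
#ifndef ENABLE_FEATURE_DIFF_FAST
# define ENABLE_FEATURE_DIFF_FAST 0
#endif

/* Placeholder for the size-optimized path. */
static int lines_equal_small(const char *a, const char *b, size_t len)
{
	while (len--) {
		if (*a++ != *b++)
			return 0;
	}
	return 1;
}

/* Placeholder for the speed-optimized path. */
static int lines_equal_fast(const char *a, const char *b, size_t len)
{
	return memcmp(a, b, len) == 0;
}

/* The condition is a compile-time constant, so the compiler can drop
 * the branch that is not selected and the code behind it. */
static int lines_equal(const char *a, const char *b, size_t len)
{
	if (ENABLE_FEATURE_DIFF_FAST)
		return lines_equal_fast(a, b, len);
	return lines_equal_small(a, b, len);
}
```

With the option off, only the small path should end up in the binary, so
size-constrained builds would pay nothing for the faster variant.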
--
vda

