[PATCH 1/2] find: use sysconf(_SC_ARG_MAX) to determine the command-line size limit

Bartosz Gołaszewski bartekgola at gmail.com
Mon Jun 23 13:17:59 UTC 2014


2014-06-22 13:49 GMT+02:00 Denys Vlasenko <vda.linux at googlemail.com>:
> On Thursday 19 June 2014 22:42, Bartosz Golaszewski wrote:
>> The find utility uses a hardcoded value of 32 * 1024 as the limit of
>> the command-line length when calling 'find -exec ... {} +'. This results
>> in over 4 times more execve() calls than in coreutils' find.
>>
>> This patch proposes to use the limit defined in system headers.
>
>
>> +     IF_FEATURE_FIND_EXEC_PLUS(G.max_argv_len = BB_ARG_MAX;) \
>
> The "- 2048" part should be here, not inside BB_ARG_MAX
> (not all users will want that subtraction).

But then the user would substract these bytes even when using ARG_MAX
or 32*1024 - when it's not needed as ARG_MAX already includes space
for the environment.

>> +
>> +     /* TODO Introduce a growable buffer and use BB_ARG_MAX macro to
>> +      * determine a safe value for n_max_chars. Remove this check afterwards.
>> +      */
>
> I don't understand this comment.

Before commit f92f1d0 there was a comment in findutils/xargs.c:

/* -s NUM default. fileutils-4.4.2 uses 128k, but I heasitate
 * to use such a big value - first need to change code to use
 * growable buffer instead of fixed one.

The small value of 32*1024 is used here because the command-line
buffer is allocated in a single kzalloc() call at line 559. This is
why we do ' n_max_chars = 32 * 1024;'  at line 536 if the value
returned by bb_arg_max() is greater than 32*1024, and - for now -
'n_max_chars -= 2048;' at line 541 is not needed.

If we were to use bb_arg_max() we'd need to use a growable vector just
like 'find -exec {} +' does right now.

>> +/* Include this for ARG_MAX on linux. */
>> +#ifdef __linux__
>> +# include <linux/limits.h>
>> +#endif
>
> libbb.h already inludes limits.h
> "man sysconf" says that you need unistd.h, not limits.h,
> but it is included too.

If I understand correctly, what "man sysconf" is saying is that you
need <unistd.h> for _SC_ARG_MAX and ARG_MAX is just the POSIX
equivalent. I've checked on three different systems and it's always
defined in <linux/limits.h> (not <limits.h>!).

In the following code in include/libbb.h (current git):
734 #if defined ARG_MAX
735 # define bb_arg_max() ((unsigned)ARG_MAX)
736 #elif defined _SC_ARG_MAX
737 unsigned bb_arg_max(void) FAST_FUNC;
738 #else
739 # define bb_arg_max() ((unsigned)(32 * 1024))
740 #endif
741 unsigned bb_clk_tck(void) FAST_FUNC;

Line 735 will never be compiled unless we include <linux/limits.h>.
Also: I would make sysconf(_SC_ARG_MAX) the default expansion of
bb_arg_max() as it's usually greater than ARG_MAX -> less execve()
calls.

Please consider merging the patches in attachment.

Bartosz Gołaszewski
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-xargs-don-t-substract-2048-bytes-from-the-return-val.patch
Type: application/octet-stream
Size: 1373 bytes
Desc: not available
URL: <http://lists.busybox.net/pipermail/busybox/attachments/20140623/09dc689b/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-bb_arg_max-include-linux-limits.h-for-ARG_MAX-defini.patch
Type: application/octet-stream
Size: 1683 bytes
Desc: not available
URL: <http://lists.busybox.net/pipermail/busybox/attachments/20140623/09dc689b/attachment-0001.obj>


More information about the busybox mailing list