fast getline for busybox (proof of concept, untested)
Rich Felker
dalias at aerifal.cx
Wed May 31 20:55:46 UTC 2006
here is some quick code i whipped up in an attempt to replace
bb_get_chomped_line_from_file et. al. with code that is both portable
and extremely fast. the bb_get_chomped_line_from_file function is
currently the main performance bottleneck for many bb applets due to
repeatedly calling getc rather than using fgets.
the attached proof-of-concept code shows how fgets may be used
repeatedly with realloc to fetch arbitrary-length lines from the
stream, and how it may be used even in the presence of embedded nul
bytes (if anyone cares to support that) with mostly-negligable
performance penalty (assuming the implementation's memset and memchr
are fast).
hopefully the code actually works; it's untested. :)
-rich
p.s. yes the fgets_embedded_nul stuff looks hackish but according to
my reading of iso c it does not depend on any behavior outside of what
iso c specifies for the behavior of fgets. whether there are buggy
implementations that don't work with this, i don't know, but it seems
doubtful since any deviation would almost have to be intentional.
-------------- next part --------------
#include <stdio.h>
#include <string.h>
#include <libbb.h>
size_t fgets_embedded_nul(char *s, size_t n, FILE *f)
{
char *p, *z;
memset(s, 1, n);
if (!fgets(s, n, f)) return 0;
for (p=s; z = memchr(p, 0, n); n -= ++z-p, p=z);
return p-s-1;
}
char *bb_getline_embedded_nul(FILE *f, size_t *n)
{
size_t len = 0, size = 127, l;
char *buf = bb_xmalloc(size);
for (;;) {
if (!(l = fgets_embedded_nul(buf+len, size-len, f))) break;
if (l < size-len-1 || buf[size-2] == '\n') {
len += l;
break;
} else len = size - 1;
// safe: SIZE_MAX allocation will fail before overflow
size = 2*size+1;
buf = bb_xrealloc(buf, size);
}
if (!(*n = len)) {
free(buf);
return NULL;
}
return buf;
}
char *bb_getline(FILE *f)
{
size_t len = 0, size = 127;
char *buf = bb_xmalloc(size);
for (;;) {
buf[size-1] = 1;
if (!fgets(buf+len, size-len, f)) break;
if (buf[size-1] || buf[size-2] == '\n') {
len += strlen(buf+len);
break;
} else len = size - 1;
// safe: SIZE_MAX allocation will fail before overflow
size = 2*size+1;
buf = bb_xrealloc(buf, size);
}
if (!len) {
free(buf);
return NULL;
}
return buf;
}
More information about the busybox
mailing list