[BusyBox 0000984]: Split missing from busybox
bugs at busybox.net
bugs at busybox.net
Mon Mar 26 18:33:18 UTC 2007
A NOTE has been added to this issue.
======================================================================
http://busybox.net/bugs/view.php?id=984
======================================================================
Reported By: Sadara
Assigned To: BusyBox
======================================================================
Project: BusyBox
Issue ID: 984
Category: New Features
Reproducibility: N/A
Severity: feature
Priority: normal
Status: assigned
======================================================================
Date Submitted: 08-05-2006 22:53 PDT
Last Modified: 03-26-2007 11:33 PDT
======================================================================
Summary: Split missing from busybox
Description:
Coreutils split is currently not included in busybox.
Would it be possible to provide a cutdown applet with the most commonly
used features?
======================================================================
----------------------------------------------------------------------
bernhardf - 08-17-06 02:28
----------------------------------------------------------------------
It certainly can be added.
What are the most commonly used features of split?
----------------------------------------------------------------------
Sadara - 08-28-06 22:21
----------------------------------------------------------------------
IMO:
-b/--bytes=SIZE
-l/--lines=LINES
Would also be nice if SIZE could have a multiplier suffix(b,k,m,g) as
several scripts floating around use it, but needed scripts could be
modified.
----------------------------------------------------------------------
sega32x - 02-07-07 14:26
----------------------------------------------------------------------
Just curious, has there been any progress with adding this in the past few
months? Havent seen it on SVN, and this would make busybox complete
(atleast for me!)
----------------------------------------------------------------------
jkolb - 03-23-07 11:06
----------------------------------------------------------------------
Patch against 1.4.1 implementing -a, -b, -d, and -l options. Fancy
configuration supports k, b, m suffixes on sizes. It's not too pretty,
but it Works For Me(tm).
----------------------------------------------------------------------
vda - 03-24-07 08:17
----------------------------------------------------------------------
Code review:
FILE *open_next_file(char **filename)
{
...
return fopen(*filename, "wb");
}
void bytes_split(FILE *infile)
{
...
do {
if (bytes_remaining <= 0) {
if (outfile) fclose(outfile);
outfile = open_next_file(&filename);
bytes_remaining = split_size;
}
to_write = ...
bytes_written = fwrite(src, 1, to_write, outfile);
What will happen if fopen() failed (returned NULL)?
opt_complementary = "b--l:l--b";
opt = getopt32(argc, argv, "a:b:dl:", &new_suffix_size,
&new_split_size, &new_split_size);
if(opt & BB_GETOPT_ERROR)
bb_show_usage();
Just add "?:" to opt_complementary and you can get rid of BB_GETOPT_ERROR
check then.
Also:
* outfile is only used for fwrite -> maybe use 'simple' I/O
(non-stdio-based), open(), not fopen()?
* use xatoi instead of atoi
* many variables and functions can be made static
* space/tab indentation is mixed. Use tabs only
* bb_error_msg("should start with small letter")
----------------------------------------------------------------------
jkolb - 03-24-07 16:32
----------------------------------------------------------------------
I think I've addressed your issues, and also added 'g' suffix support, as
well as making the applet return an error when a write error occurs.
----------------------------------------------------------------------
bernhardf - 03-25-07 14:48
----------------------------------------------------------------------
Please post your
$ size coreutils/split.o*
output.
Mine (on trunk now) is:
text data bss dec hex filename
602 4 0 606 25e coreutils/split.o
and is still bloated as hell..
----------------------------------------------------------------------
vda - 03-25-07 14:55
----------------------------------------------------------------------
2.patch is mine.
w/fancy prefixes
# size split.o
text data bss dec hex filename
820 20 4 844 34c split.o
w/o prefixes
# size split.o
text data bss dec hex filename
765 20 4 789 315 split.o
bernhard's smaller :(
----------------------------------------------------------------------
vda - 03-25-07 14:59
----------------------------------------------------------------------
Bernhard, on most CPUs
char *count_p = NULL;
getopt32(argc, argv, "l:b:a:", &count_p, &count_p, &sfx_len);
if (count_p) ...
is bigger than
char *count_p;
getopt32(argc, argv, "l:b:a:", &count_p, &count_p, &sfx_len);
if (opt & (1|2)) ...
because (one store + one check to NULL) > (one '&')
----------------------------------------------------------------------
rockeychu - 03-25-07 18:48
----------------------------------------------------------------------
"split" could create a 0 byte file when INPUT_FILE_SIZE = N *
OUTPUT_FILE_SIZE.
Patch as following:
Index: coreutils/split.c
===================================================================
--- coreutils/split.c (revision 18237)
+++ coreutils/split.c (working copy)
@@ -77,16 +77,17 @@
int inp = xopen(input_file, O_RDONLY);
int flags = O_WRONLY | O_CREAT | O_TRUNC;
do {
- int out = xopen(pfx, flags);
buf = xzalloc(cnt);
lseek(inp, bytes, SEEK_SET);
bytes += i = full_read(inp, buf, cnt);
+ if (i <= 0 ) break;
+ int out = xopen(pfx, flags);
xwrite(out, buf, i);
close(out);
free(buf);
if (next_file(&pfx))
flags = O_WRONLY | O_APPEND;
- } while(i > 0);
+ } while(1);
} else { /* -l */
FILE *fp = fopen_or_warn_stdin(input_file);
char *buf;
----------------------------------------------------------------------
bernhardf - 03-26-07 07:29
----------------------------------------------------------------------
rockeychu, please update your checkout of trunk and retest (and descibe a
brief testcase if it still fails).
TIA
----------------------------------------------------------------------
jkolb - 03-26-07 11:15
----------------------------------------------------------------------
The way the byte split is now implemented will not work if the chunk sizes
are large. One of the main reasons I want split is to break large files
into 1-2G chunks for storage on fat32 drives.
----------------------------------------------------------------------
bernhardf - 03-26-07 11:33
----------------------------------------------------------------------
I see. We should unify -l and -b anyway. Could use memchr with
common_bufiz1 for the l case and iterate over common_bufsiz1 for b. That
would avoid allocating the read buffer. Sounds like using both the
toplevel Makefile and TODO for testing was not a good idea of mine ;)
Feel like implementing that?
Issue History
Date Modified Username Field Change
======================================================================
08-05-06 22:53 Sadara New Issue
08-05-06 22:53 Sadara Status new => assigned
08-05-06 22:53 Sadara Assigned To => BusyBox
08-17-06 02:28 bernhardf Note Added: 0001568
08-28-06 22:21 Sadara Note Added: 0001598
02-07-07 14:26 sega32x Note Added: 0002137
03-23-07 11:05 jkolb File Added: busybox-1.4.1-split.patch
03-23-07 11:06 jkolb Note Added: 0002273
03-24-07 08:17 vda Note Added: 0002274
03-24-07 16:29 jkolb File Added: busybox-1.4.1-split.2.patch
03-24-07 16:32 jkolb Note Added: 0002278
03-25-07 14:48 bernhardf Note Added: 0002280
03-25-07 14:48 vda File Added: 2.patch
03-25-07 14:55 vda Note Added: 0002281
03-25-07 14:59 vda Note Added: 0002282
03-25-07 18:48 rockeychu Note Added: 0002283
03-26-07 07:29 bernhardf Note Added: 0002284
03-26-07 11:15 jkolb Note Added: 0002285
03-26-07 11:25 jkolb File Added: fancy_d.patch
03-26-07 11:33 bernhardf Note Added: 0002286
======================================================================
More information about the busybox-cvs
mailing list