No subject
Wed Apr 23 14:39:25 UTC 2008
It's failing in a function which starts at 0x0000000008064ed3 (directly
after put()). Which function is that? Do "make procps/nmeter.s" and
"objdump -dr procps/nmeter.o" and post both results please, the function
will be visible there.
----------------------------------------------------------------------
nuclearcat - 06-28-08 16:04
----------------------------------------------------------------------
I am using 1.11.0 now (it crashes too):
[ 218.330465] nmeter[1733]: segfault at 0 ip 0806488d sp bf9d8680 error 4 in busybox[8048000+76000]
0806486e <collect_info>:
0806489a <collect_time>:
It looks like the crash is here:
Disassembly of section .text.collect_info:
00000000 <collect_info>:
0: 53 push %ebx
1: 89 c3 mov %eax,%ebx
3: 83 ec 08 sub $0x8,%esp
6: a1 00 00 00 00 mov 0x0,%eax
7: R_386_32 ptr_to_globals
b: 80 30 01 xorb $0x1,(%eax)
e: eb 14 jmp 24 <collect_info+0x24>
10: 8b 43 08 mov 0x8(%ebx),%eax
13: e8 fc ff ff ff call 14 <collect_info+0x14>
14: R_386_PC32 .text.put
18: 83 ec 0c sub $0xc,%esp
1b: 53 push %ebx
1c: ff 53 04 call *0x4(%ebx)
----> 1f: 8b 1b mov (%ebx),%ebx
21: 83 c4 10 add $0x10,%esp
24: 85 db test %ebx,%ebx
26: 75 e8 jne 10 <collect_info+0x10>
28: 59 pop %ecx
29: 5b pop %ebx
2a: 5b pop %ebx
2b: c3 ret
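The marked instruction can be found by subtracting the function's start address in the linked binary from the faulting ip in the kernel log. A minimal sketch of that arithmetic follows; `fault_offset` is a hypothetical helper name used only for illustration.

```c
#include <stdint.h>

/* Offset of a faulting instruction inside its function:
 * the ip reported in the kernel log minus the start address
 * of the function in the linked binary. */
static uint32_t fault_offset(uint32_t ip, uint32_t func_start)
{
	return ip - func_start;
}
```

With the values above, fault_offset(0x0806488d, 0x0806486e) is 0x1f, which is the `mov (%ebx),%ebx` line marked with the arrow.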
I also managed to run gdb there.
With compiler optimizations disabled everything is fine, it does not crash.
If I enable compiler optimization:
(gdb) run nmeter "CPU %c IO %b MEM %[mf]"
Starting program: /home/root/busybox_unstripped nmeter "CPU %c IO %b MEM %[mf]"
Program received signal SIGSEGV, Segmentation fault.
collect_info (s=0x0) at procps/nmeter.c:753
753 procps/nmeter.c: No such file or directory.
in procps/nmeter.c
while (s) {
put(s->label);
s->collect(s);
s = s->next; <<<--- here
}
}
(gdb) up
#1 0x0806543c in nmeter_main (argc=2, argv=0xbfd6b368) at procps/nmeter.c:861
861 in procps/nmeter.c
// Generate first samples but do not print them, they're bogus
collect_info(first); <--- here 861
reset_outbuf();
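For readers matching the C loop to the disassembly, here is a minimal sketch of the list walk. The field order is an assumption inferred from the offsets in the disassembly (`(%ebx)` is next, `0x4(%ebx)` is collect, `0x8(%ebx)` is label, with 4-byte pointers on 32-bit x86); it is not copied from nmeter.c. `walk` and `collect_nothing` are hypothetical names, and put() is left out.

```c
#include <stddef.h>

/* Assumed layout, inferred from the offsets 0x0, 0x4 and 0x8
 * seen in the disassembly on 32-bit x86. */
typedef struct s_stat {
	struct s_stat *next;
	void (*collect)(struct s_stat *s);
	const char *label;
} s_stat;

/* Placeholder collector that does nothing. */
static void collect_nothing(s_stat *s)
{
	(void)s;
}

/* Walk the list the way the loop above does and return the
 * number of nodes visited. */
static int walk(s_stat *s)
{
	int n = 0;

	while (s) {            /* s is known to be non-NULL here */
		s->collect(s);
		s = s->next;   /* the line gdb reports, with s=0x0 */
		n++;
	}
	return n;
}
```

Since the loop tests s before each iteration, a NULL s at the `s = s->next` line would mean that s changed between the test and the load, that is, during put() or collect().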
----------------------------------------------------------------------
vda - 06-28-08 16:13
----------------------------------------------------------------------
What does it print when you add this?
while (s) {
put(s->label);
s->collect(s);
bb_error_msg("s:%p s->next:%p", s, s->next);
s = s->next;
}
----------------------------------------------------------------------
nuclearcat - 06-28-08 16:31
----------------------------------------------------------------------
I am not able to trigger the bug with the added line.
Output:
Router-Dora ~ # ./busybox_unstripped nmeter "CPU %c IO %b MEM %[mf]"
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: s:0x80c20d8 s->next:0x80c2100
nmeter: s:0x80c2100 s->next:(nil)
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: s:0x80c20d8 s->next:0x80c2100
nmeter: s:0x80c2100 s->next:(nil)
CPU ii........ IO 0 0 MEM 1.9g
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: s:0x80c20d8 s->next:0x80c2100
nmeter: s:0x80c2100 s->next:(nil)
CPU ii........ IO 0 0 MEM 1.9g
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: s:0x80c20d8 s->next:0x80c2100
nmeter: s:0x80c2100 s->next:(nil)
CPU ii........ IO 0 0 MEM 1.9g
----------------------------------------------------------------------
vda - 06-28-08 16:38
----------------------------------------------------------------------
We might have uninitialized ->next. I replaced xmalloc's with xzalloc's,
please try attached 3.patch
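The idea behind the change, as a minimal sketch: a zeroing allocator leaves ->next as NULL even if the init code forgets to set it. This is not the content of 3.patch; `node` and `new_node_zeroed` are hypothetical names, and plain calloc() stands in for xzalloc().

```c
#include <stdlib.h>

typedef struct node {
	struct node *next;
	int value;
} node;

/* Allocate a zero-filled node. calloc() returns memory with all
 * bits zero; on x86 Linux an all-zero pointer is NULL, so ->next
 * terminates the list without an explicit assignment. */
static node *new_node_zeroed(int value)
{
	node *n = calloc(1, sizeof(*n));

	if (!n)
		abort();   /* die on allocation failure */
	n->value = value;
	return n;
}
```

With plain malloc() the ->next field would hold whatever bytes were previously in that memory, and a list walk could follow it to an arbitrary address.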
----------------------------------------------------------------------
nuclearcat - 06-28-08 16:55
----------------------------------------------------------------------
Maybe some gcc optimization is causing this?
Diff of the nmeter assembly, with the added line (VAR1) versus default (VAR2):
--- VAR1 2008-06-29 02:36:05.000000000 +0300
+++ VAR2 2008-06-29 02:37:24.000000000 +0300
@@ -5,26 +5,19 @@
6: a1 00 00 00 00 mov 0x0,%eax
7: R_386_32 ptr_to_globals
b: 80 30 01 xorb $0x1,(%eax)
- e: eb 24 jmp 34 <collect_info+0x34>
+ e: eb 14 jmp 24 <collect_info+0x24>
10: 8b 43 08 mov 0x8(%ebx),%eax
13: e8 fc ff ff ff call 14 <collect_info+0x14>
14: R_386_PC32 .text.put
18: 83 ec 0c sub $0xc,%esp
1b: 53 push %ebx
1c: ff 53 04 call *0x4(%ebx)
- 1f: 83 c4 0c add $0xc,%esp
- 22: ff 33 pushl (%ebx)
- 24: 53 push %ebx
- 25: 68 0a 00 00 00 push $0xa
- 26: R_386_32 .rodata.str1.1
- 2a: e8 fc ff ff ff call 2b <collect_info+0x2b>
- 2b: R_386_PC32 bb_error_msg
- 2f: 8b 1b mov (%ebx),%ebx
- 31: 83 c4 10 add $0x10,%esp
- 34: 85 db test %ebx,%ebx
- 36: 75 d8 jne 10 <collect_info+0x10>
- 38: 59 pop %ecx
- 39: 5b pop %ebx
- 3a: 5b pop %ebx
- 3b: c3 ret
+ 1f: 8b 1b mov (%ebx),%ebx
+ 21: 83 c4 10 add $0x10,%esp
+ 24: 85 db test %ebx,%ebx
+ 26: 75 e8 jne 10 <collect_info+0x10>
+ 28: 59 pop %ecx
+ 29: 5b pop %ebx
+ 2a: 5b pop %ebx
+ 2b: c3 ret
Disassembly of section .text.collect_time:
If I change -Os to -O0 in Makefile.flags it works fine:
ifneq ($(CONFIG_DEBUG),y)
CFLAGS += $(call cc-option,-Os,) <<--- here
else
----------------------------------------------------------------------
nuclearcat - 06-28-08 16:53
----------------------------------------------------------------------
I tested the patch; it doesn't help.
It crashes in the same place.
LINK busybox_unstripped
Trying libraries: crypt m
Library crypt is needed
Library m is needed
Final link with: crypt m
sunfire-1 busybox-1.11.0 # scp busybox_unstripped root@XXX.XXX.XXX.XXX:
root@XXX.XXX.XXX.XXX's password:
busybox_unstripped 100% 565KB 565.3KB/s 00:00
sunfire-1 busybox-1.11.0 # cat procps/nmeter.c|grep xzalloc
SET_PTR_TO_GLOBALS(xzalloc(sizeof(G))); \
s_stat *s = xzalloc(sizeof(s_stat));
cpu_stat *s = xzalloc(sizeof(*s));
s->bar = xzalloc(sz+1);
/*s->bar[sz] = '\0'; - xzalloc did it */
int_stat *s = xzalloc(sizeof(*s));
ctx_stat *s = xzalloc(sizeof(*s));
blk_stat *s = xzalloc(sizeof(*s));
fork_stat *s = xzalloc(sizeof(*s));
if_stat *s = xzalloc(sizeof(*s));
mem_stat *s = xzalloc(sizeof(*s));
swp_stat *s = xzalloc(sizeof(*s));
fd_stat *s = xzalloc(sizeof(*s));
time_stat *s = xzalloc(sizeof(*s));
/*s->next = NULL; - all initXXX funcs use xzalloc */
/*s->next = NULL; - all initXXX funcs use xzalloc */
So it is applied; I made sure by rebuilding from scratch.
----------------------------------------------------------------------
nuclearcat - 06-28-08 17:08
----------------------------------------------------------------------
Also, if I use gcc-3.4.5-gentoo it is OK too (even with -Os).
The bug triggers when I use 4.1.2-gentoo. I will try rebuilding gcc 4.1.2
with the vanilla flag, but I have a feeling it is something new added in the
gcc 4 optimizations.
(Sorry, going to sleep; I will be online again in about 8 hours if required.)
----------------------------------------------------------------------
vda - 06-28-08 17:20
----------------------------------------------------------------------
Please produce nmeter.s with gcc 3.x.x and 4.x.x and attach both to the bug.
----------------------------------------------------------------------
nuclearcat - 06-29-08 06:11
----------------------------------------------------------------------
Files attached.
The binary produced by gcc 4.3.1 crashes too.
----------------------------------------------------------------------
vda - 06-29-08 07:31
----------------------------------------------------------------------
Please double check - does this line really make the bug disappear?
while (s) {
put(s->label);
s->collect(s);
bb_error_msg("s:%p s->next:%p", s, s->next);
s = s->next;
}
If so, try simpler modifications, like:
while (s) {
put(s->label);
s->collect(s);
write(2, "before\n", 7);
s = s->next;
write(2, "after\n", 6);
}
What I'm trying to verify with the above fragment: does it _really_ crash
on the s = s->next line? For now, I am not 100% sure.
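One plausible reason to use write() for such markers is that it bypasses stdio buffering and format parsing, so a marker reaches stderr even if the very next statement crashes, and the call adds little to the generated code. A minimal sketch of a reusable marker helper follows; `trace` is a hypothetical name, not a BusyBox function.

```c
#include <string.h>
#include <unistd.h>

/* Emit a marker straight to stderr (fd 2) with write(2).
 * Returns the number of bytes written, or -1 on error. */
static ssize_t trace(const char *msg)
{
	return write(2, msg, strlen(msg));
}
```

Calling trace("before\n") and trace("after\n") around `s = s->next` is equivalent to the two write() calls in the fragment above.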
----------------------------------------------------------------------
nuclearcat - 06-29-08 10:02
----------------------------------------------------------------------
With gcc 4.3.1 it crashes with the first variant too.
Router-Dora ~ # ./busybox_unstripped nmeter "CPU %c MEM %[mf] IO %b"
nmeter: s:0x80c2078 s->next:0x80c20d8
Segmentation fault
[1198888.680599] busybox_unstrip[6483]: segfault at 0 ip 0806500f sp bf839c8c error 4 in busybox_unstripped[8048000+76000]
In busybox_unstripped:
08064fed <collect_info>:
which means the fault is at collect_info + offset 0x22 (0x0806500f - 0x08064fed).
Then in nmeter.o:
00000000 <collect_info>:
0: 53 push %ebx
1: 89 c3 mov %eax,%ebx
3: 83 ec 08 sub $0x8,%esp
6: a1 00 00 00 00 mov 0x0,%eax
7: R_386_32 ptr_to_globals
b: 80 30 01 xorb $0x1,(%eax)
e: eb 24 jmp 34 <collect_info+0x34>
10: 8b 43 08 mov 0x8(%ebx),%eax
13: e8 fc ff ff ff call 14 <collect_info+0x14>
14: R_386_PC32 .text.put
18: 83 ec 0c sub $0xc,%esp
1b: 53 push %ebx
1c: ff 53 04 call *0x4(%ebx)
1f: 83 c4 0c add $0xc,%esp
22: ff 33 pushl (%ebx) THIS?
24: 53 push %ebx
25: 68 83 00 00 00 push $0x83
26: R_386_32 .rodata.str1.1
2a: e8 fc ff ff ff call 2b <collect_info+0x2b>
2b: R_386_PC32 bb_error_msg
2f: 8b 1b mov (%ebx),%ebx
31: 83 c4 10 add $0x10,%esp
34: 85 db test %ebx,%ebx
36: 75 d8 jne 10 <collect_info+0x10>
38: 5b pop %ebx
39: 58 pop %eax
3a: 5b pop %ebx
3b: c3 ret
Disassembly of section .text.nmeter_main:
What I found by modifying collect_info() as follows:
static void collect_info(s_stat *s)
{
gen ^= 1;
while (s) {
put(s->label);
bb_error_msg("msg1 label %s",s->label);
bb_error_msg("s:%p s->next:%p", s, s->next);
s->collect(s);
bb_error_msg("msg2");
//bb_error_msg("s:%p s->next:%p", s, s->next);
bb_error_msg("s:%p", s);
s = s->next;
}
}
Router-Dora ~ # ./busybox_unstripped nmeter "CPU %c MEM %[mf] IO %b"
nmeter: msg1 label CPU
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: msg2
nmeter: s:0x80c2078
nmeter: msg1 label MEM
nmeter: s:0x80c20d8 s->next:0x80c20f0
nmeter: msg2
nmeter: s:(nil)
Segmentation fault
So s became NULL while collecting meminfo? Of course s->next cannot be
retrieved from it.
My meminfo now:
Router-Dora ~ # cat /proc/meminfo
MemTotal: 2076508 kB
MemFree: 1900396 kB
Buffers: 620 kB
Cached: 51900 kB
SwapCached: 0 kB
Active: 12264 kB
Inactive: 43976 kB
HighTotal: 1179584 kB
HighFree: 1121704 kB
LowTotal: 896924 kB
LowFree: 778692 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 3720 kB
Mapped: 3068 kB
Slab: 112572 kB
SReclaimable: 3668 kB
SUnreclaim: 108904 kB
PageTables: 172 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1038252 kB
Committed_AS: 8836 kB
VmallocTotal: 114680 kB
VmallocUsed: 2232 kB
VmallocChunk: 112356 kB
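The %[mf] field appears to report free memory: the "MEM 1.9g" in the earlier output matches MemFree: 1900396 kB here. As an illustration of what a collector has to do with this file, here is a minimal sketch that pulls one value out of a /proc/meminfo-style buffer. It is not the parsing code from nmeter.c; `meminfo_kb` is a hypothetical helper.

```c
#include <stdio.h>
#include <string.h>

/* Look for "key:" at the start of a line in a meminfo-style
 * buffer and parse the number that follows it.
 * Returns 0 on success, -1 if the key is missing or malformed. */
static int meminfo_kb(const char *buf, const char *key, unsigned long *kb)
{
	size_t klen = strlen(key);
	const char *p = buf;

	while (p && *p) {
		if (strncmp(p, key, klen) == 0 && p[klen] == ':')
			return sscanf(p + klen + 1, "%lu", kb) == 1 ? 0 : -1;
		p = strchr(p, '\n');   /* advance to the next line */
		if (p)
			p++;
	}
	return -1;
}
```

The result is written through a caller-supplied pointer and the input buffer is only read, so the helper itself never writes into a fixed-size local buffer.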
----------------------------------------------------------------------
nuclearcat - 06-29-08 10:21
----------------------------------------------------------------------
I think it is not related to memory, because as I remember it sometimes
happens with other nmeter parameters.
----------------------------------------------------------------------
nuclearcat - 06-29-08 10:47
----------------------------------------------------------------------
I was wrong.
It is related: looking back over my previous tests, the format string always
contained %[mf].
----------------------------------------------------------------------
vda - 06-30-08 00:40
----------------------------------------------------------------------
s is a local variable and cannot be accessed by called functions.
In fact, in your case s is in the %ebx register.
1b: 53 push %ebx
1c: ff 53 04 call *0x4(%ebx) s->collect(s)
1f: 83 c4 0c add $0xc,%esp
22: ff 33 pushl (%ebx) <= s->next
24: 53 push %ebx <= s (is in %ebx)
25: 68 83 00 00 00 push $0x83
26: R_386_32 .rodata.str1.1
2a: e8 fc ff ff ff call 2b <collect_info+0x2b>
2b: R_386_PC32 bb_error_msg
More information about the busybox-cvs
mailing list