No subject


Wed Apr 23 14:39:25 UTC 2008


It's failing in a function which starts at 0x0000000008064ed3 (directly
after put()). Which function is that? Do "make procps/nmeter.s" and
"objdump -dr procps/nmeter.o" and post both results please, the function
will be visible there. 

---------------------------------------------------------------------- 
 nuclearcat - 06-28-08 16:04  
---------------------------------------------------------------------- 
I am using 1.11.0 now, and it crashes there too:
[  218.330465] nmeter[1733]: segfault at 0 ip 0806488d sp bf9d8680 error 4 in busybox[8048000+76000]

0806486e <collect_info>:
0806489a <collect_time>:

Looks like here:
Disassembly of section .text.collect_info:

00000000 <collect_info>:
   0:   53                      push   %ebx
   1:   89 c3                   mov    %eax,%ebx
   3:   83 ec 08                sub    $0x8,%esp
   6:   a1 00 00 00 00          mov    0x0,%eax
                        7: R_386_32     ptr_to_globals
   b:   80 30 01                xorb   $0x1,(%eax)
   e:   eb 14                   jmp    24 <collect_info+0x24>
  10:   8b 43 08                mov    0x8(%ebx),%eax
  13:   e8 fc ff ff ff          call   14 <collect_info+0x14>
                        14: R_386_PC32  .text.put
  18:   83 ec 0c                sub    $0xc,%esp
  1b:   53                      push   %ebx
  1c:   ff 53 04                call   *0x4(%ebx)
---->  1f:   8b 1b                   mov    (%ebx),%ebx
  21:   83 c4 10                add    $0x10,%esp
  24:   85 db                   test   %ebx,%ebx
  26:   75 e8                   jne    10 <collect_info+0x10>
  28:   59                      pop    %ecx
  29:   5b                      pop    %ebx
  2a:   5b                      pop    %ebx
  2b:   c3                      ret

I also managed to run gdb there.
With compiler optimizations disabled everything is fine; it does not crash.

If I enable compiler optimization:

(gdb) run nmeter "CPU %c IO %b MEM %[mf]"
Starting program: /home/root/busybox_unstripped nmeter "CPU %c IO %b MEM %[mf]"

Program received signal SIGSEGV, Segmentation fault.
collect_info (s=0x0) at procps/nmeter.c:753
753     procps/nmeter.c: No such file or directory.
        in procps/nmeter.c

        while (s) {
                put(s->label);
                s->collect(s);
                s = s->next; <<<--- here
        }
}


(gdb) up
#1  0x0806543c in nmeter_main (argc=2, argv=0xbfd6b368) at procps/nmeter.c:861
861     in procps/nmeter.c

        // Generate first samples but do not print them, they're bogus
        collect_info(first); <--- here 861
        reset_outbuf(); 

---------------------------------------------------------------------- 
 vda - 06-28-08 16:13  
---------------------------------------------------------------------- 
What does it print when you add this?

        while (s) {
                put(s->label);
                s->collect(s);
bb_error_msg("s:%p s->next:%p", s, s->next);
                s = s->next;
        } 

---------------------------------------------------------------------- 
 nuclearcat - 06-28-08 16:31  
---------------------------------------------------------------------- 
I am not able to trigger the bug with the added line.

Output:
Router-Dora ~ # ./busybox_unstripped nmeter "CPU %c IO %b MEM %[mf]"
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: s:0x80c20d8 s->next:0x80c2100
nmeter: s:0x80c2100 s->next:(nil)
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: s:0x80c20d8 s->next:0x80c2100
nmeter: s:0x80c2100 s->next:(nil)
CPU ii........ IO    0    0 MEM 1.9g
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: s:0x80c20d8 s->next:0x80c2100
nmeter: s:0x80c2100 s->next:(nil)
CPU ii........ IO    0    0 MEM 1.9g
nmeter: s:0x80c2078 s->next:0x80c20d8
nmeter: s:0x80c20d8 s->next:0x80c2100
nmeter: s:0x80c2100 s->next:(nil)
CPU ii........ IO    0    0 MEM 1.9g 

---------------------------------------------------------------------- 
 vda - 06-28-08 16:38  
---------------------------------------------------------------------- 
We might have an uninitialized ->next. I replaced the xmalloc calls with
xzalloc; please try the attached 3.patch.

---------------------------------------------------------------------- 
 nuclearcat - 06-28-08 16:55  
---------------------------------------------------------------------- 
Maybe some gcc optimization is causing this?

Diff of the nmeter disassembly, with the added line (-) and without it (+):
--- VAR1        2008-06-29 02:36:05.000000000 +0300
+++ VAR2        2008-06-29 02:37:24.000000000 +0300
@@ -5,26 +5,19 @@
    6:   a1 00 00 00 00          mov    0x0,%eax
                         7: R_386_32     ptr_to_globals
    b:   80 30 01                xorb   $0x1,(%eax)
-   e:   eb 24                   jmp    34 <collect_info+0x34>
+   e:   eb 14                   jmp    24 <collect_info+0x24>
   10:   8b 43 08                mov    0x8(%ebx),%eax
   13:   e8 fc ff ff ff          call   14 <collect_info+0x14>
                         14: R_386_PC32  .text.put
   18:   83 ec 0c                sub    $0xc,%esp
   1b:   53                      push   %ebx
   1c:   ff 53 04                call   *0x4(%ebx)
-  1f:   83 c4 0c                add    $0xc,%esp
-  22:   ff 33                   pushl  (%ebx)
-  24:   53                      push   %ebx
-  25:   68 0a 00 00 00          push   $0xa
-                        26: R_386_32    .rodata.str1.1
-  2a:   e8 fc ff ff ff          call   2b <collect_info+0x2b>
-                        2b: R_386_PC32  bb_error_msg
-  2f:   8b 1b                   mov    (%ebx),%ebx
-  31:   83 c4 10                add    $0x10,%esp
-  34:   85 db                   test   %ebx,%ebx
-  36:   75 d8                   jne    10 <collect_info+0x10>
-  38:   59                      pop    %ecx
-  39:   5b                      pop    %ebx
-  3a:   5b                      pop    %ebx
-  3b:   c3                      ret
+  1f:   8b 1b                   mov    (%ebx),%ebx
+  21:   83 c4 10                add    $0x10,%esp
+  24:   85 db                   test   %ebx,%ebx
+  26:   75 e8                   jne    10 <collect_info+0x10>
+  28:   59                      pop    %ecx
+  29:   5b                      pop    %ebx
+  2a:   5b                      pop    %ebx
+  2b:   c3                      ret
 Disassembly of section .text.collect_time:

If I change -Os to -O0 in Makefile.flags, it works fine:

ifneq ($(CONFIG_DEBUG),y)
CFLAGS += $(call cc-option,-Os,) <<--- here
else

 

---------------------------------------------------------------------- 
 nuclearcat - 06-28-08 16:53  
---------------------------------------------------------------------- 
I tested the patch; it doesn't help.
It crashes in the same place.

  LINK    busybox_unstripped
Trying libraries: crypt m
 Library crypt is needed
 Library m is needed
Final link with: crypt m
sunfire-1 busybox-1.11.0 # scp busybox_unstripped root@XXX.XXX.XXX.XXX:
root@XXX.XXX.XXX.XXX's password:
busybox_unstripped                            100%  565KB 565.3KB/s   00:00
sunfire-1 busybox-1.11.0 # cat procps/nmeter.c|grep xzalloc
        SET_PTR_TO_GLOBALS(xzalloc(sizeof(G))); \
        s_stat *s = xzalloc(sizeof(s_stat));
        cpu_stat *s = xzalloc(sizeof(*s));
        s->bar = xzalloc(sz+1);
        /*s->bar[sz] = '\0'; - xzalloc did it */
        int_stat *s = xzalloc(sizeof(*s));
        ctx_stat *s = xzalloc(sizeof(*s));
        blk_stat *s = xzalloc(sizeof(*s));
        fork_stat *s = xzalloc(sizeof(*s));
        if_stat *s = xzalloc(sizeof(*s));
        mem_stat *s = xzalloc(sizeof(*s));
        swp_stat *s = xzalloc(sizeof(*s));
        fd_stat *s = xzalloc(sizeof(*s));
        time_stat *s = xzalloc(sizeof(*s));
                        /*s->next = NULL; - all initXXX funcs use xzalloc */
                /*s->next = NULL; - all initXXX funcs use xzalloc */

So it is applied; I made sure by rebuilding from scratch.

 

---------------------------------------------------------------------- 
 nuclearcat - 06-28-08 17:06  
---------------------------------------------------------------------- 
Also, if I use gcc-3.4.5-gentoo it works too (even with -Os).
The bug is triggered when I use 4.1.2-gentoo. I will try rebuilding gcc 4.1.2
with vanilla flags, but I suspect something new in the gcc 4
optimizations.

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
05-28-08 09:22  nuclearcat     New Issue                                    
05-28-08 09:22  nuclearcat     Status                   new => assigned     
05-28-08 09:22  nuclearcat     Assigned To               => BusyBox         
05-28-08 11:13  vda            Note Added: 0007884                          
05-28-08 12:27  nuclearcat     Note Added: 0007894                          
05-30-08 14:55  vda            Note Added: 0007904                          
06-28-08 10:11  nuclearcat     Note Added: 0008654                          
06-28-08 15:00  vda            Note Added: 0008724                          
06-28-08 16:04  nuclearcat     Note Added: 0008734                          
06-28-08 16:13  vda            Note Added: 0008744                          
06-28-08 16:31  nuclearcat     Note Added: 0008754                          
06-28-08 16:37  vda            File Added: 3.patch                          
06-28-08 16:38  vda            Note Added: 0008764                          
06-28-08 16:44  nuclearcat     Note Added: 0008774                          
06-28-08 16:50  nuclearcat     Note Added: 0008784                          
06-28-08 16:53  nuclearcat     Note Edited: 0008784                         
06-28-08 16:55  nuclearcat     Note Edited: 0008774                         
06-28-08 17:06  nuclearcat     Note Added: 0008794                          
======================================================================


