[BusyBox 0000650]: ash won't process non-ascii characters correctly, starting with v1.1.0

bugs at busybox.net
Fri Jan 20 09:25:29 UTC 2006


A NOTE has been added to this issue. 
====================================================================== 
http://busybox.net/bugs/view.php?id=650 
====================================================================== 
Reported By:                jemelja
Assigned To:                BusyBox
====================================================================== 
Project:                    BusyBox
Issue ID:                   650
Category:                   Other
Reproducibility:            always
Severity:                   minor
Priority:                   normal
Status:                     assigned
====================================================================== 
Date Submitted:             01-19-2006 15:41 PST
Last Modified:              01-20-2006 01:25 PST
====================================================================== 
Summary:                    ash won't process non-ascii characters correctly,
starting with v1.1.0
Description: 
Hello,

`ash' behaves differently in busybox v1.1.0 than in v1.01. It can no
longer process non-ASCII characters (such as ISO-8859-1), for example

    # echo "Ä"  # `A' umlaut

gives

    -sh: Syntax error: Unterminated quoted string

instead of just `Ä'. The same example, but unquoted, makes the shell
crash.

After looking into it, I found that the reason for this new behaviour
is the new -funsigned-char switch to gcc (since 2005-12-01), and the
way character values are used as array indexes.

In shell/ash.c, characters are read from a shell script through
`parsenextc', a pointer to `char'. Each character value is added to
SYNBASE (130), and the sum is used as an index into a 258-byte
array. With a signed `char', non-ASCII characters have negative
values and therefore yield indexes below 130.

Because `char' is now unsigned by default, non-ASCII characters have
values greater than 127, resulting in indexes beyond the end of the array.
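
The arithmetic can be checked with a small stand-alone sketch. It is
not code from shell/ash.c: only SYNBASE (130) and the table size (258)
come from the description above, and every function name here is
hypothetical. The byte 0xC4 is `Ä' in ISO-8859-1.

```c
#include <assert.h>

/* Values taken from the report; everything else is illustrative. */
enum { SYNBASE = 130, TABLE_SIZE = 258 };

/* Index obtained when the byte is seen as a signed char (-128..127).
 * The conversion is written out by hand to keep it well defined. */
int index_if_signed(unsigned char byte)
{
    int c = (byte >= 128) ? (int)byte - 256 : (int)byte;
    return SYNBASE + c;
}

/* Index obtained when the byte is seen as an unsigned char (0..255),
 * as happens when building with -funsigned-char. */
int index_if_unsigned(unsigned char byte)
{
    return SYNBASE + (int)byte;
}

/* True if idx is a valid position in a TABLE_SIZE-entry array. */
int in_table(int idx)
{
    return idx >= 0 && idx < TABLE_SIZE;
}

/* States the claims from the report as assertions. */
void check_a_umlaut(void)
{
    unsigned char a_umlaut = 0xC4;   /* 196 unsigned, -60 signed */

    assert(index_if_signed(a_umlaut) == 70);     /* 130 - 60  */
    assert(in_table(index_if_signed(a_umlaut)));

    assert(index_if_unsigned(a_umlaut) == 326);  /* 130 + 196 */
    assert(!in_table(index_if_unsigned(a_umlaut)));
}
```

With signed values the index always stays within 2..257, while with
unsigned values every byte of 128 or above lands at 258 or higher,
past the end of the table.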

The attached patch is a simple fix: it sets the signedness of the
characters being parsed near the point where their values are
interpreted. Perhaps it would be better, though, to declare
`parsenextc' and `buf' (and the corresponding struct parsefile
entries) as `signed char' rather than `char'.

Regards,
Michael

====================================================================== 

---------------------------------------------------------------------- 
 vodz - 01-20-06 01:25  
---------------------------------------------------------------------- 
This problem was already fixed in SVN 13421.
And your patch fixes it only partially.

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
01-19-06 15:41  jemelja        New Issue                                    
01-19-06 15:41  jemelja        Status                   new => assigned     
01-19-06 15:41  jemelja        Assigned To               => BusyBox         
01-19-06 15:41  jemelja        File Added: shell_ash_signedchar.patch           
        
01-20-06 01:25  vodz           Note Added: 0000958                          
======================================================================



