your mail

Fri Jan 16 03:41:55 UTC 2009

On Wednesday 14 January 2009 18:12:32 Khem Raj wrote:
> On (14/01/09 17:07), Natanael Copa wrote:
> > Hi,
> >
> > This test program segfaults for me on x86, hardened gcc-4.3.2 and
> > uclibc-0.9.30:
> >
> > #include <stdio.h>
> > #include <ctype.h>
> > int main(void) {
> > 	printf("%i\n", isalnum(0x10000));
> > 	return 0;
> > }
> >
> >
> > If the ctype.h include is commented out it works as expected.
>
> This sounds like a bug to me.
> With ctype.h included, it uses the isalnum macro, which
> then does a lookup in an array of 256 elements with the argument serving
> as the array index. In your case that index is 0x10000, which is way
> beyond the size of the array.
>
> Currently, if you stay within the valid range, i.e. up to 255, it will work
> as expected, but beyond that it will be accessing outside the array and you
> will get random values.
>
> If you do not include ctype.h, it falls back to the normal libc function
> implementation, which works and handles the cases beyond the ASCII range.
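
For what it's worth, ISO C only requires the ctype functions to work for
arguments that fit in an unsigned char or equal EOF, so a caller that can't
guarantee that has to range-check on its own side. A minimal sketch of such a
guard follows. checked_isalnum is a made-up name for illustration, not
anything uClibc or glibc provides, and returning 0 for out-of-range input is
my assumption about what the caller wants:

```c
#include <ctype.h>
#include <limits.h>  /* UCHAR_MAX */
#include <stdio.h>   /* EOF */

/*
 * Hypothetical helper: only forwards to isalnum() when the argument is
 * inside the range ISO C defines for it (0..UCHAR_MAX, or EOF).
 * Anything else is reported as "not alphanumeric" instead of being
 * used as a table index.
 */
static int checked_isalnum(int c)
{
    if (c != EOF && (c < 0 || c > UCHAR_MAX))
        return 0;
    return isalnum(c);
}
```

With that, checked_isalnum(0x10000) returns 0 whether isalnum is the macro or
the function.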

Another fun corner case is that if char is signed, your range could go from
-128 through +127.  The way glibc handled this was to make the array 384
entries long. Here's a cut and paste from /usr/include/ctype.h from glibc
explaining why they did that:

/*
   These point into arrays of 384, so they can be indexed by any `unsigned
   char' value [0,255]; by EOF (-1); or by any `signed char' value
   [-128,-1).  ISO C requires that the ctype functions work for `unsigned
   char' values and for EOF; we also support negative `signed char' values
   for broken old programs.  The case conversion arrays are of `int's
   rather than `unsigned char's because tolower (EOF) must be EOF, which
   doesn't fit into an `unsigned char'.  But today more important is that
   the arrays are also used for multi-byte character sets.  */
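
To make the 384-entry idea concrete, here is a rough sketch of the general
technique: allocate 128 extra slots in front of the 256 regular ones and hand
out a pointer to the middle, so negative indices land inside the array. This
is not glibc's actual code. All names are invented for illustration, the
classification is plain ASCII only, and it assumes EOF is -1 as in the
comment above:

```c
#include <assert.h>
#include <stdio.h>   /* EOF */

/* Hypothetical names; not taken from glibc or uClibc. */
#define TABLE_OFFSET 128                    /* slots for indices -128..-1 */
#define TABLE_SIZE   (TABLE_OFFSET + 256)   /* 384 entries in total */

static unsigned char alnum_storage[TABLE_SIZE];

/* Points at the slot for character 0, so indices -128..255 are in bounds. */
static const unsigned char *const alnum_table = alnum_storage + TABLE_OFFSET;

/* ASCII-only classification used to fill the table. */
static int is_ascii_alnum(int c)
{
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

static void init_alnum_table(void)
{
    for (int c = -TABLE_OFFSET; c < 256; c++) {
        /* A negative signed char value aliases the unsigned value c + 256. */
        int u = (c < 0) ? c + 256 : c;
        /* The EOF slot is always "not alphanumeric". */
        alnum_storage[c + TABLE_OFFSET] = (c != EOF) && is_ascii_alnum(u);
    }
}

/* Table lookup; the argument must lie in -128..255. */
static int table_isalnum(int c)
{
    assert(c >= -TABLE_OFFSET && c <= 255);
    return alnum_table[c];
}
```

After calling init_alnum_table() once, table_isalnum() can be indexed by a
signed char, an unsigned char or EOF without leaving the array. Note that it
still does nothing for 0x10000, which is outside any of those ranges.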

Whether or not it's a good solution I'll stay out of. :)

Rob