'case' UTF-8 bug

Martijn Dekker martijn at inlv.org
Wed Jul 5 17:05:26 UTC 2017


On 05-07-17 at 18:15, Denys Vlasenko wrote:
> On Wed, Jul 5, 2017 at 1:50 AM, Martijn Dekker <martijn at inlv.org> wrote:
>> Slackware 14.1 (with multilib added):
>>
>> $ /lib64/libc.so.6
>> GNU C Library (GNU libc) stable release version 2.17, by Roland McGrath et al.
> 
> So, bug triggers with this one.
> 
> And Slackware 14.2 has which libc version?

glibc 2.23.

> I reproduced it on another machine, with this libc:
> 
> $ /lib/libc-2.22.so
> GNU C Library (Gentoo 2.22-r4 p13) stable release version 2.22, by
> Roland McGrath et al.
[...]
> Thus, ash ends up calling fnmatch('cf \ 81', 'cf 81', 0).
> This normally works - superfluous backslash-escapes are simply ignored,
> and this returns a match.
> 
> I guess what happens is that in unicode locale, some versions of glibc
> do not allow backslash-escape _inside_ a unicode character.
> It probably freaks out seeing invalid unicode sequence.

That makes sense. What puzzles me is that the newer glibc version does
allow it, even though a backslash in the middle of a multi-byte character
is clearly invalid. Backslash-escaping in patterns is supposed to apply
to characters, not bytes.
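
For anyone who wants to try this against their own libc, here is a minimal
C sketch of the call discussed above. It assumes the bytes in question are
0xCF 0x81 (which is how U+03C1, Greek small rho, is encoded in UTF-8) and
that the pattern has a backslash wedged between those two bytes. The helper
names are mine and purely illustrative; only fnmatch() and setlocale() are
real library calls. An ASCII control case is included, since a superfluous
backslash before an ordinary character should always match.

```c
#include <assert.h>
#include <fnmatch.h>
#include <locale.h>
#include <stddef.h>

/* Assumption: the character is U+03C1, i.e. the bytes 0xCF 0x81 in UTF-8. */
static const char STRING_RHO[] = "\xcf\x81";

/* The same two bytes with a backslash between them: 0xCF '\' 0x81. */
static const char PATTERN_SPLIT[] = "\xcf\\\x81";

/*
 * Hypothetical helper: switch to the named locale and run the suspicious
 * match. Returns the raw fnmatch() result (0 means "matched"), or -1 if
 * the locale could not be selected (for example, it is not installed).
 */
int split_escape_result(const char *locale_name)
{
    assert(locale_name != NULL);
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;
    return fnmatch(PATTERN_SPLIT, STRING_RHO, 0);
}

/*
 * Control case: a superfluous backslash before a plain ASCII character.
 * The escaped 'b' simply matches a literal 'b'.
 */
int ascii_escape_result(void)
{
    return fnmatch("a\\b", "ab", 0);
}
```

Calling split_escape_result() once with "C" and once with a UTF-8 locale
name such as "en_US.UTF-8" (assuming it is installed) and comparing the two
return values should show whether a given libc treats the pattern
differently under UTF-8. I have deliberately not listed expected values for
that case, since the whole point is that they differ between glibc versions.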

- M.

