bb wget typo and problem with @ in URL

Raphaël HUCK raphael.huck at efixo.com
Fri Oct 6 15:53:03 UTC 2006


I've just checked the latest revision of wget.c in the svn.
There are 2 problems in function "parse_url":

* on line 510, "tfp" should be "tcp" instead

* when it searches for @ in the URL, it doesn't stop at the end of the 
host if the host is not followed by a "/", so if an @ is not url-encoded 
in the GET parameters of the URL, it will treat everything preceding it 
as a username (this works fine in original wget).


Example in bb wget:

$ wget 'http://www.example.com?login=john@doe'
wget: doe: Unknown host

Example in original wget:

$ wget 'http://www.example.com?login=john@doe'
--17:04:01--  http://www.example.com/?login=john@doe
            => `index.html?login=john@doe'


According to RFC1738: An empty abs_path is equivalent to an abs_path of "/".

So the following URL should be allowed: 
http://www.example.com?login=john@doe

It also says: within the user and password field, any ":", "@", or "/" 
must be encoded.

But how are we supposed to find the end of the host, if "?", ";" and "#" 
are allowed to appear unencoded in the username and password?
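
One possible answer, as a rough sketch only: assume the host part ends at 
the first "/", "?" or "#" (this is the rule RFC 3986 gives for the 
authority, not something RFC1738 states), and only look for @ before that 
point. The function name find_userinfo_end is made up for illustration and 
is not in wget.c; it expects the part of the URL after "http://". Note that 
under this assumption an unencoded "?" or "#" in the username or password 
would cut the host short, and ";" is left alone.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the busybox code: given the part of a URL
 * that follows "scheme://", return a pointer to the '@' that ends the
 * user:password part, or NULL if there is none.
 * Assumption: the host part ends at the first '/', '?' or '#'.
 */
static const char *find_userinfo_end(const char *after_scheme)
{
	/* Length of the host part, up to the first delimiter or end of string */
	size_t authority_len = strcspn(after_scheme, "/?#");
	const char *p = after_scheme + authority_len;

	/* Scan backwards for '@', never looking past the host part */
	while (p > after_scheme) {
		p--;
		if (*p == '@')
			return p;
	}
	return NULL;
}
```

With this, "www.example.com?login=john@doe" yields NULL (no username), 
while "john:pw@www.example.com/x@y" yields the first @.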


Little sidenote: it should not make any difference if you do a:

	strrchr(h->host, '@');

or a:

	strchr(h->host, '@');
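
A small check of that claim, as a sketch: the two calls return the same 
pointer as long as the searched string holds at most one unencoded @, and 
differ only when there are several. The helper name at_search_agrees is 
hypothetical and only serves the illustration.

```c
#include <string.h>

/*
 * Hypothetical check: returns 1 when searching for '@' from the left
 * and from the right lands on the same character (or both find none),
 * 0 when the string contains more than one '@'.
 */
static int at_search_agrees(const char *host)
{
	return strchr(host, '@') == strrchr(host, '@');
}
```

So the choice only matters for input such as "a@b@c", which should not 
occur if every @ in the username and password is encoded.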


And another small detail:

line 517, the check is:

	if (sp) {

and line 524:

	if (up != NULL) {

Do you prefer writing "!= NULL" or leaving it out?


--Raphael HUCK


