ftpget triggering TCP RST at end of download

Sat Sep 24 06:10:05 UTC 2005

On Fri, 23 Sep 2005 22:01:45 -0600
Stephen Warren <swarren at wwwdotorg.org> wrote:

> I have discovered a difference between busybox's ftpget and other FTP 
> clients, such as wget and /usr/bin/ftp, in their implementation of FTP 
> "get".
> 
> Most clients seem to do this:
> 
> 1) Send RETR command
> 2) Open data connection
> 3) Read from data connection until EOF, saving data
> 4) Close data connection
> 5) Read status response from command connection
> 
> However, busybox's ftpget (1.00 by testing, latest subversion by code 
> reading) seems to do this:
> 
> 1) Send SIZE command to determine how much to download
> 2) Send RETR command
> 3) Open data connection
> 4) Read from data connection exactly $SIZE bytes, saving data
> 5) Close data connection; note EOF won't be seen
> 6) Read status response from command connection
> 
> The problem is that sometimes the kernel seems to send a TCP RST packet 
> when ftpget calls close(). I believe this is something to do with the 
> kernel thinking that ftpget hasn't read all the data from the socket, 
> and is thus sending an error message to the server that all data was not 
> reliably delivered.
> 
> This causes the FTP server (MS IIS 5.0 at least) to send back an error 
> status message, and ftpget hence exits with a non-zero status.
> 
> If I modify ftpget to use bb_copyfd_eof instead of bb_copyfd_size, then 
> everything works just fine (because the last thing ftpget does is read 
> from the socket, which returns zero indicating EOF, satisfying the 
> kernel that all data has been read).
> 
> Now, I know the server is sending exactly the number of bytes SIZE 
> returns, since if I insert an extra read call after bb_copyfd_size, then 
> no data is ever returned; zero is returned by read, indicating EOF. This 
> change also causes the TCP RST to never get sent.
> 
> So, my question is - is this an application issue; is one supposed to 
> always read until EOF is returned, or is reading the actual exact 
> number of bytes sent enough?
> 
> Without this fix (bb_copyfd_eof), ftpget fails 10-50% of the time.
> 
> In the packet traces I took with tcpdump (1 OK, 1 failure), I note that 
> in the failure case, all packets were delivered to the ftpget machine in 
> order without retries (at least the last few packets). However, in the 
> OK case, the last but one packet was dropped, the last packet got 
> through, and then the last but one packet was then resent and got 
> through. Is this triggering the kernel to know earlier than usual that 
> the socket was closed, and the exact byte count, hence causing things to 
> work where they otherwise wouldn't?
> 
> We're running stock kernel.org kernel 2.6.10, with a few non-networking 
> patches built for x86 (athlon optimizations) and running on what is 
> practically a standard PC.
> 
> -- 
> Stephen Warren, Software Engineer, NVIDIA, Fort Collins, CO
> swarren at wwwdotorg.org     http://www.wwwdotorg.org/pgp.html
> 

Interesting problem. It sounds like it could be fixed in ftpget, but
I also think the current behaviour should work.
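
For illustration, here is a minimal sketch of what an ftpget-side fix
along those lines might look like: copy exactly the number of bytes that
SIZE reported, then keep reading the data connection until read() returns
zero before calling close(). It assumes plain POSIX read()/write()/close().
The helper names copy_exact, drain_to_eof and fetch_then_close are
hypothetical and are not busybox functions; the real code uses
bb_copyfd_size and bb_copyfd_eof.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not busybox code): copy up to `size` bytes from
 * src_fd to dst_fd. Returns the number of bytes copied (less than `size`
 * if EOF arrives early), or -1 on a read/write error. */
off_t copy_exact(int src_fd, int dst_fd, off_t size)
{
    char buf[4096];
    off_t total = 0;

    assert(size >= 0);
    while (total < size) {
        size_t want = sizeof(buf);
        ssize_t n, off;

        if ((off_t)want > size - total)
            want = (size_t)(size - total);
        n = read(src_fd, buf, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* premature EOF */
        for (off = 0; off < n; ) {
            ssize_t w = write(dst_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}

/* Hypothetical helper: read and discard until read() reports EOF.
 * Returns the number of bytes discarded, or -1 on error. */
off_t drain_to_eof(int fd)
{
    char buf[512];
    off_t discarded = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return discarded;   /* EOF seen: peer closed its side */
        discarded += n;
    }
}

/* Size-limited copy, then drain to EOF, then close the data connection.
 * Returns 0 only if exactly `size` bytes arrived and nothing followed. */
int fetch_then_close(int data_fd, int out_fd, off_t size)
{
    off_t copied = copy_exact(data_fd, out_fd, size);
    off_t extra = (copied < 0) ? -1 : drain_to_eof(data_fd);
    int rc = close(data_fd);

    return (copied == size && extra == 0 && rc == 0) ? 0 : -1;
}
```

The point of the extra drain step is only that the last call on the data
socket before close() is a read() that returns zero, as in the
bb_copyfd_eof variant described above. Whether that should be necessary
at all is the open question.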

If the kernel (or could it be libc?) is behaving erratically with this
code, maybe other applications could have this same problem.

Have you tried a few different kernels, say a 2.4.x kernel, or a
different libc?

Glenn