You can't spell "evil" without "vi".

Rob Landley rob at landley.net
Fri Oct 17 11:04:35 UTC 2008


On Thursday 16 October 2008 10:16:19 Ralf Friedl wrote:
> I had the impression that original problem was when an ESC-sequence
> crossed the input buffer, not that it had something to do with the
> timeout after the ESC.
> So is this whole discussion now about a different, but related problem?

The problem was that escape sequences were being broken up and treated as 
individual characters.  My original theory as to _why_ was that the buffer 
was filling up and that overflow was causing the sequences to become 
decoupled, but when I rewrote readit() so that it only read one escape 
sequence at a time (and thus the buffer wrapping wasn't an issue), that 
didn't fix the problem.  (It seems to have made it occur slightly less often, 
so that might have been _another_ way to trigger the problem, but it clearly 
wasn't the only one.)

The second problem turned out to be that transmission through a serial port 
(even the virtual serial port in an emulator) was potentially inserting a 
delay between each character, which decoupled the escape sequences.  (The 
sequences get written as a chunk, and if they go through a pipe or a network 
socket that means they get _read_ as a chunk.  We were depending on that: if 
we did a nonblocking read and _didn't_ get more data, we considered the 
sequence to be over.  The problem is that over a serial port, a 3-byte 
write() A) gets broken down into individual characters, and B) real hardware 
inserts a delay between characters.  So if you loop doing blocking reads 
they'll mostly return length 1 unless you pause long enough for more data to 
arrive, and if you loop doing non-blocking reads many of them will return 
before the next character can come in.)
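
To make that concrete, here is a minimal sketch of the approach described 
above: block for the first byte, and if it is ESC, keep collecting bytes 
until poll() times out.  This is not the actual readit() code from busybox; 
the function name read_key_sketch(), its parameters, and the choice to read 
one byte at a time are all assumptions made for illustration.  It also stops 
only on timeout, EOF, or a full buffer, whereas a real parser would recognize 
the end of a sequence from its final byte.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define ESC 0x1b

/* Hypothetical helper (not busybox's readit()): read one plain key or one
 * escape sequence from fd into buf.
 * Returns the number of bytes stored, 0 on EOF, or -1 on error. */
static int read_key_sketch(int fd, char *buf, size_t max, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    size_t n;
    ssize_t r;

    assert(buf != NULL && max > 0);

    /* Block until the first byte arrives. */
    r = read(fd, buf, 1);
    if (r <= 0)
        return (int)r;
    n = 1;

    /* Anything other than ESC is an ordinary single-byte key. */
    if ((unsigned char)buf[0] != ESC)
        return 1;

    /* After ESC, wait up to timeout_ms for each following byte.  Over a
     * serial line the bytes trickle in one at a time, so a zero timeout
     * (a pure nonblocking read) would often give up too early. */
    while (n < max) {
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready < 0)
            return -1;      /* poll error (EINTR is not retried here) */
        if (ready == 0)
            break;          /* timeout: treat the sequence as complete */
        r = read(fd, buf + n, 1);
        if (r < 0)
            return -1;
        if (r == 0)
            break;          /* EOF */
        n++;
    }
    return (int)n;
}
```

With timeout_ms set to 0 this degenerates into the old nonblocking 
behavior, which is exactly the case that splits sequences apart when the 
bytes arrive with gaps between them.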

We never noticed because it doesn't behave like that over a local TTY or a 
network connection with a 1500-byte MTU.  There, a 3-byte write() turns into 
a 3-byte read() at the far end.  Serial ports don't work that way.

> Also, I don't understand the problem with poll potentially waiting
> longer that specified under heavy load. If poll really waits longer, it
> is more likely that additional data has arrived in this time, not less
> likely.

Yeah, as far as I can tell Denys is worrying about a problem that can't 
actually happen.  The process scheduler won't delay the in-kernel response to 
new data arriving via a serial interrupt, so effectively all the scheduler 
could do is _extend_ the poll timeout by not scheduling the process promptly.  
Denys is making the delay longer because the scheduler might... _also_ make 
it longer.  Seems a bit unnecessary, somehow.
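
A small sketch of why a late wake-up is harmless: data that arrived while 
the process wasn't running is still queued in the kernel, so poll() reports 
it no matter how short the timeout is.  The helper name data_ready_sketch() 
is made up for this example, and a pipe stands in for the serial port in the 
usage note.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is readable within timeout_ms,
 * 0 if the timeout expired first, or -1 on error. */
static int data_ready_sketch(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    assert(fd >= 0);

    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    /* Data queued before (or during) the call is reported even when
     * timeout_ms is 0; the timeout only bounds how long we wait for
     * data that has not arrived yet. */
    return ready > 0 && (pfd.revents & POLLIN) != 0;
}
```

For example, if a byte is written to a pipe and the reader then sleeps for a 
while before calling data_ready_sketch(fd, 0), the call still returns 1.  A 
delayed reader only ever gives the data more time to show up.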

That said, it doesn't actually hurt anything.  It's merely useless (and the 
comment he added is inaccurate), but it's still better than the 300ms I 
started with (and proved I didn't need once I worked out the right tests to 
understand the actual system behavior).

Rob


