[PATCH] implement futimes

Rob Landley rob at landley.net
Mon Nov 23 10:12:29 UTC 2009


On Sunday 22 November 2009 21:19:47 Austin Foxley wrote:
> On 11/22/2009 02:21 PM, Rob Landley wrote:
> > Instead, like the proverbial cutting wedge whose edge splits into two
> > points, forward progress goes to zero.
>
> As a relative newcomer to this project, thanks for the history lesson.

Computer history's a hobby of mine even for projects I _haven't_ been 
following for a decade...

And thanks for your work on the Sparc target.  (I need to get back to testing 
that, it's just wandered a half-dozen items down on my todo list.  Possibly 
todo stack?)

> First step towards sanity: Let me commit the nptl_merge branch to
> master. If you haven't looked at it, I've got all the relevant changes
> for nptl grouped into related commits, and it's up to date with master
> as of today.

Yay!

When this is available as a patch series, I'd love to read it.  (But don't let 
that stop it from going _in_.  It can always be reviewed after the fact.  
We've got the complete history, getting it in will mean there's exactly one 
tree we expect people to _test_.  Splitting your tester base is disastrous, 
splitting what you expect people to test makes the effectiveness of the test 
effort go down exponentially.  There've been studies on this, it's one of them 
network effects things...)

> Second step: I'm fairly sure Bernhard is planning to release a 0.9.30.2
> soon. There are a lot of important fixes that have already made it in to
> that.

I'm all for bugfix-only releases, but they're mostly orthogonal to development 
stabilization releases.

I say "mostly" because without bugfix-only releases you have the "no release is 
ever stable" problem the Linux kernel had for the longest time.  They're 
important because they take patch pressure off the -dev branch, and ironically 
allow it to cycle faster.

--- You can stop reading here (the rest is all history and release management 
theory).

The current Linux "stable series" four level dot-release stuff started in March 
2005, with Linus proposing an even/odd numbering scheme that was just totally 
unworkable:

  http://lkml.indiana.edu/hypermail/linux/kernel/0503.0/0512.html

And then an ENORMOUS thread ensued (which is well worth reading, if you have a 
spare weekend).

Eventually, Linus proposed something workable:

  http://lkml.indiana.edu/hypermail/linux/kernel/0503.0/0910.html

And he got a volunteer:

  http://lkml.indiana.edu/hypermail/linux/kernel/0503.0/0918.html

Luckily, back then kernel-traffic was still in operation and did a _heroic_ job 
summarizing the thread.  Here is why the kernel guys did what they did:

  http://kerneltraffic.osmirror.nl/kernel-traffic/kt20050403_303.html#2

And some follow up:

  http://kerneltraffic.osmirror.nl/kernel-traffic/kt20050402_302.html#8
  http://kerneltraffic.osmirror.nl/kernel-traffic/kt20050402_302.html#11
  http://lkml.indiana.edu/hypermail/linux/kernel/0512.0/0634.html
  http://kerneltraffic.osmirror.nl/kernel-traffic/kt20050403_303.html#9
  http://kerneltraffic.osmirror.nl/kernel-traffic/kt20050612_315.html#5

One little subtlety: people who are fiddling with older versions and want to 
backport fixes actually need one _more_ bugfix release than you'd think, because 
bugs fixed between the last dot-release and the new -dev stabilization were 
getting dropped on the floor.  There was a hole in the kernel -stable series 
for the first few months, which got closed here:
  http://lkml.indiana.edu/hypermail/linux/kernel/0512.0/1327.html

It occurs to me that I've learned a huge amount about release management over 
the years from reading this kind of stuff... and it's probably never been 
written up coherently anywhere.  The open source community's just _evolved_ a 
set of best practices which you pick up via osmosis.

For example, the actual effect of distributed source control on open source 
development is to make short-lived branches cheaper, by allowing patch series 
to be automatically migrated from tree to tree in large batches and in an 
arbitrary order, rather than one at a time in a specific order.  This means you 
can let your local tree diverge for a short development cycle and then flush 
the whole mess back into the original tree fairly easily, without worrying 
about non-structural conflicts with whatever _else_ they've done to the tree in 
the meantime.  (It sets up a sort of funky quantum entanglement between the 
two that allows patches to jump back and forth easily.)
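The diverge-and-flush cycle described above can be sketched with git (a hypothetical two-repository example; the repo names, file contents, and commit messages are all made up for illustration):

```shell
#!/bin/sh
# Sketch of the "let the local tree diverge, then flush it back" cycle.
set -e
work=$(mktemp -d)
cd "$work"

# The shared tree.
git init -q upstream
cd upstream
git config user.email you@example.com
git config user.name "Upstream"
echo base > file.txt
git add file.txt
git commit -q -m "base"

# A developer clones it and diverges for a short cycle.
cd "$work"
git clone -q upstream local
cd local
git config user.email dev@example.com
git config user.name "Local"
git config pull.rebase false
echo feature > feature.txt
git add feature.txt
git commit -q -m "add feature"

# Meanwhile the shared tree keeps moving independently.
cd "$work/upstream"
echo fix >> file.txt
git commit -q -am "bugfix"

# Flushing the trees back together migrates the whole patch series
# in one batch; structurally independent changes merge without any
# manual conflict resolution.
cd "$work/local"
git pull -q --no-edit
git rev-list --count HEAD   # base + feature + bugfix + merge commit
```

Because both trees share the "base" commit as a common ancestor, git can reconcile them automatically no matter which direction the patches flow; that shared history is the "entanglement" that makes short-lived branches cheap.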

This allows better delegation, and _nested_ delegation (a la the Linux kernel's 
"one architect, a dozen lieutenants, hundreds of maintainers, thousands of 
developers" setup, where each level owes the previous one a response and thus 
patches don't get dropped on the floor), which means development can scale to 
larger numbers of developers allowing larger project sizes.

Alas, if two trees don't start from a common base and then take care to 
maintain the correspondence, it fades with time.  Thus a three year fork ain't 
something porting your svn to git is going to fix.  It helps prevent you from 
getting _into_ the unmergeable-mess situation; it doesn't clean up the mess 
once you're there...

How that interacts with bugfix-only releases, and time based releases, and 
"feature freeze" vs "merge window" approaches, and the whole "oral tradition" 
aspect of development communities (you have to scale your development 
community with your code size, which is why big corporate releases of existing 
code bases tend to land with a "splat"...) is a whole topic in itself.

Darn it, I need to write documentation again.  Yet another todo item.  DO NOT 
HAVE TIME RIGHT NOW...

Anyway, yay having a bugfix release, but that doesn't stop -dev from becoming 
irrelevant.  It just means people do new development against your -stable 
series and then try to get patches merged into the one they _use_.  Bugfix 
release series that _aren't_ extremely short-lived and ruthless about _not_ 
accepting new features wind up turning into forks.

The reason the Linux kernel changed its development model in 2.6 was because 
of the failed IDE rewrite in 2.5, which had to be reverted:

  http://lwn.net/Articles/7789/

Basically, for about a _year_ the IDE subsystem was completely unstable in 
2.5, so nobody who had IDE drives (about 90% of the developer base) could run 
those kernels, so everybody who had to do any _other_ development did it 
against 2.4.  Remember Dave Jones' full time job porting 2.4 code to 2.5?  Tip 
of the iceberg, most of the patches were never even merged into 2.4.  There 
was an epidemic of out-of-tree code.  Here's a list of a few I collected 
during the cleanup phase: http://lwn.net/Articles/14188/

Thus 2.5 development stalled horribly, and a huge amount of unmerged code 
built up as unmerged out-of-tree patches against _2.4_ (not 2.5), and Alan Cox 
merged most of it in his -ac tree which became the de-facto standard tree (to 
the point Red Hat based their kernels on the -ac tree, not Linus's), and Linus 
got overwhelmed and instituted the Lieutenant system without _telling_ 
anybody, and by the time I brought it to a head with the Patch Penguin 
Proposal Alan Cox realized he was on the verge of rendering Linus 
"irrelevant", and he didn't want to do that (considering Linus the better 
_architect_, patch flow and integration issues and the IDE cul-de-sac aside), 
Alan ended the -ac tree and took a year off to get an MBA to force everybody to 
go back to Linus and fix the problems with the development series.

So once they switched to distributed source control, cleaned up the mess, and 
got 2.6 out, Linus Torvalds basically went "never again" and switched the 
kernel development model to what we have today.

So if you think uClibc's got development problems, the Linux kernel 
development process has nearly melted down on something like three separate 
occasions.  (The previous one was circa the 2.0 release when Linus almost had 
a nervous breakdown and Maddog forced him to delegate to maintainers in the 
first place; before my time really.)  They keep needing to _invent_ new ways to 
make open source development scale, and then everybody else copies them 
because they've gotten darn good at it over the years....

What they learned from 2.5 is that most people don't test stuff they can't use, 
and nobody's going to sit on a patch for a year while you get your act 
together.  They're going to develop against what they're _using_, and 
generally that's the stable tree.  So if your "development branch" diverges 
too far from the stable tree, most developers won't bother to port it to 
something they can't use that's just going to change out from under them and 
break it again anyway.  Instead they'll maintain it against the stable branch 
(which doesn't change so much), and even push to get it merged into -stable 
and ignore your development branch entirely...

Until your -dev branch has a feature freeze in preparation for becoming the 
next -stable branch.  As soon as you announce a feature freeze, everybody will 
rush to forward-port their patches and you'll get a flood of new feature 
submissions right before deadline, meaning attempts to stabilize the tree wind 
up _destabilizing_ it.  Notice the above link to the "crunch time" series was 
in October 2002, but 2.6.0 didn't come out for over a year after that.  At the 
time I referred to this as the freeze/thaw/slush/slurpee cycle...

  http://lkml.indiana.edu/hypermail/linux/kernel/0207.0/0391.html

And of course nobody's going to accept "you missed the deadline" if it's going 
to be a year or more until the next release, now that they've got the patch 
ported to -dev they don't want to maintain it out of tree through a whole 
'nother development cycle and forward port it again.  No, they want it to go 
in during the -stable series, rendering the idea of "stable" kind of silly....

Time based releases avoid going there.  It means "it's ok to miss this 
release, there'll be another one in 3 months".  It's not the end of the world 
to wait _that_ long.  It also means that the port from stable to -dev is never 
_too_ painful.  (Just accept the fact most new features are developed against 
-stable, not against -dev.  This will never change, and we must cope.  You 
can't do new development against an unstable base, your own new code is buggy 
enough, adding other people's random breakage and daily regressions to the mix 
is just unworkable unless you're doing no new development and _just_ fixing 
bugs...)

I'll stop now.  I could probably write a book on this crud...

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds

