[RFC] Reorganizing the download directory.

Mon Feb 22 14:14:07 UTC 2010

On Monday 22 February 2010 04:56:48 Harald Becker wrote:
> Hi Rob!
>
> Let me throw in just an idea to the discussion about changing the
> download directory:

Ok.

>  I agree with you, that for newbies the directory shouldn't be that
> overwhelmed. So I would prefer to break things but only once (as far as
> possible). If I got it right, the biggest problems are to break the
> download scripts of users.

I dunno about "biggest".

Things like zlib didn't keep old versions up at all until I poked 'em about it 
(because they wanted to encourage people to upgrade to current).

Projects like qemu don't keep old releases around either 
(http://download.savannah.gnu.org/releases/qemu/ only seems to have back to 
0.10.0, which is fairly recent).

If you've got git you can grab old versions out of source control.  (Assuming 
they're tagged properly, which is not always the case.)

In the case of busybox, we've got a lot of people _using_ old versions, and 
historically we've always kept them available.  But the directory is getting 
cluttered and moving the older ones somewhere out of the way seems to make 
sense.  (In fact I did this back in 2006 when it was less cluttered than it is 
now, hence the "legacy" directory, which is IBM's way of saying obsolete.)

> So why don't you go one step further
> in those script downloads?
>
> What about creating an index or link file which contains a two column
> list. The first column just gives the filename to download and the
> second column the complete URL to the file.

Because that would be a unique piece of busybox weirdness that no other 
project does, and thus one more thing for newbies to learn?  Because we'd be 
inventing our own standard when RSS feeds exist?  Because it would be one more 
piece of infrastructure to maintain and remember to update whenever there was 
a release?  Because it's unnecessary complexity and a solution in search of a 
problem?

> This way you can even add
> synonyms like busybox-current to point to the current release and other
> similar usages.

You mean like the symlink we've already got for snapshots?
  http://busybox.net/downloads/busybox-snapshot.tar.bz2

The thing about actual releases is it's nice to know when they happen, and 
it's nice to know what version you're using.  Automatically moving from one 
version to another can have side effects; it's really best that a human look at 
the transition.

The kernel's infrastructure for this predates RSS feeds: they have a finger 
server (finger @kernel.org) and they have 
http://www.kernel.org/pub/linux/kernel/v2.6/LATEST-IS-2.6.32.8 which is of no 
use to anyone that I can see. :)  But most people use the main kernel.org page 
to check when releases happen, since it lists 'em.

> And additionally the link provided may even point to a
> different location/server, that is anywhere in the net (just the
> possibility, no need to overwhelm this).

I plead the 5th.

> The index file then rests in the download directory as a single file, at a
> constant location, with a constant name. So just anybody has to change
> things once to work as follows:

RSS feeds are cool:

  http://www.kernelpodcast.org/feed/

They can already point to different servers:

  http://podcast.msnbc.com/audio/podcast/MSNBC-MADDOW-NETCAST-M4V.xml

Some source control systems already produce them automatically:

  http://impactlinux.com/hg/firmware/rss-tags

Apparently, gitweb is one of them (on line 6000+ of a Perl script.  
*shudder*), but I have no idea how to make the busybox setup do that.  (I 
voted for mercurial.  The git ui is insane, and implemented in Perl.  But I 
repeat myself...)
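
For what it's worth, a script that wants to know about releases can read such a
feed in a few lines.  What follows is only a rough sketch in Python, assuming a
plain RSS 2.0 feed with one <item> per tagged release.  The sample feed, its
example.org URLs, and the list_releases name are all made up for illustration;
they don't describe any feed busybox actually publishes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed: the titles and links are invented for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>example tags</title>
    <item>
      <title>1.1.0</title>
      <link>http://example.org/downloads/example-1.1.0.tar.bz2</link>
    </item>
    <item>
      <title>1.0.0</title>
      <link>http://example.org/downloads/example-1.0.0.tar.bz2</link>
    </item>
  </channel>
</rss>
"""


def list_releases(feed_xml):
    """Return (title, link) pairs for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)  # root is the <rss> element
    releases = []
    for item in root.findall("./channel/item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        releases.append((title, link))
    return releases


if __name__ == "__main__":
    # Print the releases in feed order so a human can review them.
    for title, link in list_releases(SAMPLE_FEED):
        print(title, link)
```

Note that this only lists what the feed says; picking a version and switching
to it is still left to a human.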

> What do you think about this?

I admire your enthusiasm, but I'm interested in simplifying things.  Inventing 
a new data file format standard and infrastructure to maintain it does not 
strike me as a simplification.

And adding _any_ new infrastructure won't change existing download scripts 
that have a hardwired url.  (The question is how much we care.  The answer may 
be "not all that much", but it's Denys' call.)

> Harald

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds