[Toybox] patch: add built-in versions of sha-2 family hash functions

Rob Landley rob at landley.net
Tue Jun 15 13:18:08 UTC 2021

On 6/15/21 2:23 AM, Denys Vlasenko wrote:
> On Tue, Jun 15, 2021 at 2:30 AM enh <enh at google.com> wrote:
>> i haven't, no. i don't know anyone who works on coreutils,
> They probably have a mailing list.

And paperwork you have to physically sign and mail in to assign copyrights to
them before they'll ever take a patch from you.

I've previously expressed a reluctance to get the FSF on me, and a general
refusal to author GPLv3 code in a hobbyist context (not just "unpaid" but
"without a Fortune 500 legal department standing between me and copyleft
trolls". Yes there are copyleft trolls now: ignoring the FSF's MEPIS suit and
what the SFLC did with busybox, here's a guy who made $4 million so far from
creative commons https://twitter.com/cstross/status/1404406401645088768 .)

License hygiene is why I haven't looked at busybox code in a while, and when it
does come up I look at 1.2.2 first because I don't want to be accused of copying
GPL code into a 0BSD licensed project. (We can probably all assume I've already
seen everything in the tarballs I personally prepared and released.)

The pending follow-up question to my previous email beyond "do you want this
feature" was "are you willing to add it yourself, or do you want me to
refamiliarize myself with busybox cut.c and whatever libbb sharp edges I hit
enough to knock up a patch". I felt the odds of you implementing it yourself
were higher if I DIDN'T offer a patch in the email, but if the busybox project
only wants the feature if I provide a patch, yes I can provide that patch.

Coreutils wants me to read GPLv3 code, write GPLv3 code, and assign copyrights
to them. Coreutils also has a history of replying "patches welcome" to any
suggested new feature. If I approach them and get told "we will not do this
until you author a patch and assign its copyright to us on paper with this
form", does that mean that if the feature does become popular elsewhere and is
suggested to them, they'll say "no, THIS guy is obligated to do it" as their
excuse NOT to?

(You may remember I had a dim view of the FSF _before_ the Second Coming of RMS.
Now that they've warmly embraced his post-#metoo self back into their ranks, I
don't think I'm cynical ENOUGH about them.)

>> and assumed it would be easier if both busybox
>> and toybox already supported the same syntax.
> Yeah...
> What I worry about is gratuitous divergences we create.

Hence us asking busybox if they wanted the new options, yes.

> E.g. _every_ clone_ of netcat seems to "extend"
> and "improve" on the original nc version 1.10

Because "hobbit" never maintained the March 1996 abandonware release, which thus
never adapted to anything new. Everybody starts with bugfixes and works their
way to new features.

Me, when I first heard about netcat ("Is there a tool to cat a file through the
network?" "Yeah, netcat.") I couldn't find a copy, so I wrote my own from
scratch back in 2001: http://dvpn.sourceforge.net/old/netcat.c (only 5 years
after Hobbit's version).

Hobbit is still around by the way, http://techno-fandom.org/~hobbit/ has SF
convention reports from this year, and the same email address as the 1996 README.

> in non-compatible ways. With the natural result of people
> not really using nc in their scripts when they need them
> to work _the same_ on any Linux distro.

Do you have an example _other_ than netcat? Because I thought you personally
were the user motivating that? You said:


> nc_bloaty.c was added because I need nc which is compatible with
> original nc-1.10... our nc was lacking a few things nc-1.10 had
> (don't remember off-hand which things).
> Every single fork of nc (there are at least four)
> grew tons of incompatible extensions, which is a PITA for users.
> I do not want to follow them...

So rather than add missing features (or fix incompatibilities) in the
implementation busybox already had, you added a second implementation (alongside
the other one, without removing it), with no additions to the test suite or
other documentation of the use cases motivating its addition, reasons you had
already forgotten by the time I noticed and asked ~3 years later. And in this
case you're NOT tracking gnu netcat (I assume that's one of the forks you
complained about), you're pointing at the 1996 version as immutable. (I'm
guessing https://sourceforge.net/p/nc110/git/ci/master/tree/netcat.c is another
one of the forks you disliked? I don't know if you count the one I wrote for
busybox or openbsd's as fresh implementations rather than forks...)

Since then you've added features (git a16c8ef21253) but trying to figure out
where they came from in your "upstream" debian-traditional...
http://deb.debian.org/debian/pool/main/n/netcat/netcat_1.10-46.dsc points to
http://www.stearns.org/nc/ as the homepage which was last updated in 2003, so
that's not it. Grabbing the debian source tarball and extracting the tar.xz:

$ ls patches
545579-send-crlf.patch         no-sleep-punt.patch  sh-c.patch
655881-Makefile-LDFLAGS.patch  no-static.patch      single-verbose.patch
655881-netcat.c.format.patch   posix-setjmp.patch   so-keepalive.patch
arm-timer.patch                proxy-doc.patch      tos.patch
dash-port.patch                quit.patch           udp-broadcast.patch
glibc-resolv-h.patch           read-overflow.patch  unstripped.patch
help-exit-failure.patch        rservice-buf.patch   use-getservbyport.patch
inet-aton.patch                select-nfds.patch
nodup-stderr.patch             series

That seems as big a fork as any of the others?

After a certain point, a standard is a thing you help _establish_.

> This is excusable for a 20 year old CS student who
> did not yet step on landmines of that sort.
> We should know better.

The model for my busybox development was always substitute in the next busybox
command into the Linux From Scratch build $PATH and see what breaks (carefully
comparing the console output, build artifacts, and resulting binary behavior
against what the old version had done).

You get a real world test case that already exists out in the wild, you change
the tool's behavior to satisfy the test case, rinse repeat.

That's... not what happened with netcat.

> Compatibility is important.

As far as I know that was our entire motivation for wondering if somebody else
also wanted to do the thing we've been doing since 2017, yes.

I don't pay attention to GPLv3 code, and am assuming coreutils will all be
merged into systemd soon enough anyway, but when I taught busybox mount to
autodetect "mount file.img dir" many moons ago (without having to say -o loop) I
didn't inform the GNU guys about it, yet they picked it up a couple years later.
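
(For anyone who hasn't looked at how that kind of autodetection works: it
boils down to a stat() on the source argument. The following is a minimal
sketch, not the actual busybox or toybox code. The helper name needs_loop()
is made up for illustration, and a real mount also has to find a free loop
device and associate the file with it.)

```c
#include <sys/stat.h>

/* Hypothetical helper (not from busybox or toybox): decide whether
 * "mount SOURCE DIR" should go through a loop device even though the
 * user did not say -o loop. Returns 1 when SOURCE is a regular file
 * (presumably a filesystem image), 0 for anything else: block devices,
 * directories, or paths that can't be stat()ed. */
int needs_loop(const char *source)
{
  struct stat st;

  /* If stat fails, don't guess: let the real mount attempt report it. */
  if (stat(source, &st)) return 0;

  return S_ISREG(st.st_mode) ? 1 : 0;
}
```
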

> Insufficient compat was holding back Unix-like OSes
> for decades.

It was a little more complicated than that.

Unix predated the 1983 Apple vs Franklin decision extending copyright to cover
binaries, after which AT&T allowed itself to be broken up to escape the 1956
Sherman antitrust consent decree and thus commercialize technologies it owned
like Unix.


So after the January 1973 ACM article publicized Unix
(https://dl.acm.org/doi/10.1145/800009.808045) it had a full decade of de-facto
open source status, including Ken Thompson's students from his 1975 sabbatical
from Bell Labs to teach two semesters of OS design at berkeley (which gave rise
to BSD). In 1979 Bill Joy participated in the 10 year anniversary contract where
DARPA was replacing the original honeywell internet IMP systems with Vax
machines, and sniped the network stack away from Bolt Beraneck and Newman by
doing a much more efficient implementation on a lark:


And that spread BSD unix throughout the world: over the next couple years they
replaced all the interface message processors with BSD routers, and when they
switched to IPv4 in 1983:


Everything was running BSD unix. But after the Franklin decision and the AT&T
breakup, AT&T clamped down and suddenly started seriously commercializing their
Unix IP, and THAT is what fragmented the market.

AT&T went around to each unix vendor and "convinced" them (with many lawyers) to
replace the BSD base in their systems with System V: sunos->solaris and aos->aix
and so on, "Under the Radar" by Robert Young and Wendy Rohm covered that in some
detail. THAT is where the fragmentation came from, and the closed proprietary
drive to differentiate.

At about the same time the unix vendors also picked the wrong side of the
risc-vs-x86 war of the 1980s:


So by the time the Jolitzes did the 386 BSD port circa 1990 (delayed by that
whole BSDi lawsuit), BSD was already the unix underdog:


The Unix variants that DID target PC hardware were also on the wrong side of
Microsoft's "CPU tax" on commodity x86 hardware, and got hit with the same
monopoly leverage issues that took down OS/2 and BeOS:


The CPU tax stuff is what the 1995 antitrust trial under Judge Sporkin was about
(separate from the 1998 antitrust trial that started over the browser stuff
under Judge Jackson).

Microsoft was successful against other hardware targets because the hardware
fought all their battles for them: the x86 price/performance ratio outcompeted
sparc and alpha and hp-ux and so on (largely because chip price is a function of
unit volume: huge up-front costs amortized over a large production run, so he
who has the highest volume wins), and they locked their software to that
hardware with predatory contracts on the manufacturers.

(Sigh, there's so much good material on this I haven't looked at in ages. Peter
Salus's "A Quarter Century of UNIX", and there's a history of the internet called
"Where Wizards Stay Up Late". I used to do writeups on this stuff a lot longer
and more detailed than this, ala https://landley.livejournal.com/14310.html )

Busy with other things these days...


P.S. Posix was hamstrung by vendors trying to game the FIPS 151-2 federal
procurement standard, and STILL hasn't recovered:
