[Buildroot] [RFC PATCH v1 1/6] package/go: implement go modules integration

Christian Stewart christian at paral.in
Sat Apr 6 03:13:47 UTC 2019


Hi Arnout,

Arnout Vandecappelle <arnout at mind.be> writes:
>> It's not really a good idea to use the auto conversion at runtime since
>> the result is not as deterministic as having a predetermined go.mod,
>> and the conversion requires network lookups.
>
>  AFAIU, packages which use vendor.conf or gopkg.toml will have their vendor
> trees bundled, no? Otherwise other users wouldn't have a reliable way of
> downloading the package anyway, right?

No. Not necessarily. Especially in the case of Gopkg.toml.

However, in any case where it is bundled, we can just use it - and this
is the current behavior of this PoC series.

>> Pretend the other formats don't exist. They are there for legacy only anyway.
>
> Okay. Anyway, the go.mod will be autogenerated since it will be
> missing, and supposedly the vendor tree is there.

No, the tool will not know what the path to the root module is.

>> I suppose you're saying you want to add more download statements in the
>> package and then download all of the dependencies into the vendor/ tree
>> using the Buildroot mechanism.
>
>  Not really download statements; _EXTRA_DOWNLOADS, and a post-extract hook to
> extract them. In terms of complexity, it's pretty similar to carrying the go.mod
> in Buildroot. But of course, this way we would not be able to use an existing
> upstream go.mod.

It's not similar at all to the complexity of carrying go.mod in
Buildroot. The Go tool manages an immensely complex process of resolving
the import paths to download URLs, fetching the sparse commits from the
source repositories, hashing the code for consistency, and linking it
all together so that the imports resolve correctly.

This is particularly complex with aliased imports in go.mod, which
Kubernetes uses for example.
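
As a rough illustration of what I mean by an aliased import: in go.mod it
is written as a "replace" directive. The sketch below writes a small
go.mod into a scratch directory. The module paths under example.com and
the versions are made-up placeholders, not taken from Kubernetes or from
any package in the series.

```shell
# Write a hypothetical go.mod that aliases one import path to a fork.
# All module paths and versions below are placeholders for illustration.
workdir=$(mktemp -d)
cat > "$workdir/go.mod" <<'EOF'
module example.com/myorg/myapp

require example.com/upstream/lib v1.2.3

// Resolve every import of example.com/upstream/lib from the fork instead.
replace example.com/upstream/lib => example.com/myorg/lib-fork v1.2.4
EOF

# Print the directive the Go tool consults when resolving that import path.
grep '^replace' "$workdir/go.mod"
```

The source files keep importing the original path; only go.mod records
where the code really comes from.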

Carrying the go.mod and go.sum allows the Go tool to manage all of this
for us while still keeping the code source hashes in a file similar to
my-package.hash.

>  Why is it not viable?

The maintenance process for carrying go.mod and go.sum is as follows:

 1. Clone the project and check out the correct revision in GOPATH.
 2. Run "go mod init"
 3. Copy go.mod and go.sum to Buildroot.

That's it.
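
A minimal sketch of those three steps as a shell helper, under some
assumptions: the function and variable names are hypothetical, DRY_RUN is
a convenience I made up to print the commands instead of running them,
and the extra "go mod tidy" call is my assumption for getting go.sum
written, since "go mod init" by itself only produces go.mod.

```shell
# Run a command, or just print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Same as run, but inside the given directory.
run_in() {
    dir=$1; shift
    if [ -n "${DRY_RUN:-}" ]; then
        echo "(cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

# refresh_gomod <repo-url> <revision> <checkout-dir> <buildroot-pkg-dir>
refresh_gomod() {
    repo_url=$1; revision=$2; checkout_dir=$3; br_pkg_dir=$4

    # 1. Clone the project and check out the revision (ideally under GOPATH,
    #    so the module path can be inferred from the directory).
    run git clone "$repo_url" "$checkout_dir" || return 1
    run git -C "$checkout_dir" checkout "$revision" || return 1

    # 2. Generate go.mod; legacy dependency files are converted if present.
    run_in "$checkout_dir" env GO111MODULE=on go mod init || return 1
    # Assumption: resolve dependencies so that go.sum is written as well.
    run_in "$checkout_dir" env GO111MODULE=on go mod tidy || return 1

    # 3. Copy the generated files next to the Buildroot package.
    run cp "$checkout_dir/go.mod" "$checkout_dir/go.sum" "$br_pkg_dir/"
}
```

With DRY_RUN=1 set, calling refresh_gomod only lists the git, go and cp
commands it would run, which is a cheap way to review them first.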

The maintenance process for what you're describing is, I'd imagine:

 1. Clone the project and check out the correct revision.
 2. Read through the dependencies and determine the versions of all.
 3. Write the download paths for the code into Buildroot BY HAND.
 4. Download all of the sources and hash them BY HAND.

Nobody is going to do this.

Perhaps the misunderstanding here is in how the go.mod and go.sum files
are managed. It's almost completely automated by the Go tool. We could
even write a command in Buildroot that generates them for us. If you're
imagining that I wrote all of these for docker-cli by hand, no wonder
you're alarmed at the prospect of including them in the Buildroot tree.

>  Because we have the option of either using the existing mechanisms or
> extending the infra to cover go modules. To evaluate whether the extra infra is
> useful, it is interesting to look at what would be needed without it.

That's valid, but I believe it's an intractable approach, due to the
immense development overhead as described above.

>> GOPROXY=direct go mod vendor, or you might try executing the build with
>> the series applied, compiling any Go package, and you'll see the
>> download happen during that step.
>
>  I'll admit that I hadn't done that yet. So I applied only the first patch and
> tried to build docker-cli. It doesn't do any downloading of dependencies because
> the vendor/ tree is already there (the patch has an explicit condition for that).
>
>  BTW, the build step then does automatic conversion of vendor.conf, but the
> build fails with:
>
> can't load package: package cmd/docker: cannot find package "." in:
>         /home/arnout/src/buildroot/output/host/lib/go/src/cmd/docker
>
>  I'm probably doing something wrong :-)

Docker CLI requires a patch because they use some non-standard logic for
wiring together some adjusted vendored dependencies.

>  I just noticed something missing: you'll need to add host-go to the
> DOWNLOAD_DEPENDENCIES of go packages if you want to use the go tool in the
> download step.

Noted, thanks.

>  Agreed.
>
>  I'm not communicating my message very well apparently. I am not saying that a
> Buildroot-supplied go.mod is not useful. I'm only saying that it is not
> absolutely needed, and that the current infra probably already covers a lot if
> not all of the use cases we have in reality. While explaining why I think that,
> I apparently didn't make it clear that I'm only trying to compare the different
> options we have, and weigh that against the priorities of the project.

It is necessary in some cases when the package is using some "hack"
scripts in their tree to wire together dependencies at build time in a
non-standard way. The go.mod format allows us to codify these
adjustments in a way that the Go tool understands natively.

See docker-cli. I included these patches not to alarm you guys with
superfluous hash files, but to show examples of where these adjustments
allow us to fix quirks and simplify package build processes as well as
improve build consistency and reproducibility.

>>>   b. Upstream has no go.mod, but has vendor.conf or .toml -> nothing needs to be
>>> done, the Go tool handles these as well.
>
>  I think my analysis was wrong. It should have been:
>
> a. Upstream has a go.mod.
>  -> Dependencies have to be downloaded. Handled by the current patch, with the 3
> limitations mentioned.

... unless they include vendor/ in the tree with go.mod, which sometimes
happens, and is indeed already handled by the PoC patch series (which
should be able to handle ANY condition as written with the quirks I
listed before).

> b. Upstream has a bundled vendor/ tree (as part of the tarball or as git
> submodules).
>  -> go.mod has to be created with just the module name. Handled by the current
> patch, no limitations.

Yes.

> c. Upstream doesn't have a bundled vendor/ tree, only vendor.conf or gopkg.toml.
>  -> not reproducible because the vendor.conf and gopkg.toml don't encode the
> version to download. (Note: I looked at the vendor.conf of docker-cli and it
> does have a version, so I'm not sure if this is true).
>
>  For the c case, I indeed see no other way than a Buildroot-provided go.mod that
> I can think of.
>
> Could you maybe explain why it is not deterministic?

They encode versions, but unfortunately they don't encode the transitive
dependency versions in all cases.

>  The network calls are the same as in case a I think, no?

No, go.mod and go.sum encode all the dependency versions and also store
hashes of the code, so it's guaranteed to produce an identical result.

> Note that I don't think we need go.sum. We do with the current patch,
> but if the go download becomes a download method, we end up with a
> tarball that contains all dependencies, and the buildroot hashing is
> done on that tarball. So checksumming is done (though it's a little
> late, so finding out what went wrong in case of hash mismatch might be
> tricky).

Maybe so, but the Go tool would be quite happy to do the summing for us.

Please note that this file is NOT hand written.

> That was my point: all current packages have a vendor tree, so it never gets
> called.

Because it is not currently possible to add a package without a vendor tree.

The fact that all packages work one way today doesn't factor into the
optimal solution, at least in my mind, aside from backwards
compatibility, which we do have here.

Note Docker required specific touchups with the GOPATH approach before.

The reason why the build failed when you tried it with just the first
patch is that the docker-cli package previously overrode the
WORKSPACE variables to set the Go package path. The Go package path is
automatically inferred to be "github.com/docker/cli" instead of the
correct "github.com/docker/docker" with just the first patch and not the
docker-cli touchup patch.

>> [GOPROXY=off] is strictly needed. The Go tool will analyze the code
>> and determine the import paths of all the code files, and then
>> generate / update go.mod. You might want to run the patch series and
>> observe the behavior yourself.
>
> The behaviour I observe (with only patch 1 and removing GOPROXY=off)
> is that there is a single-line go.mod (generated by Buildroot) and no
> go.sum, and no downloads are done. Which makes sense, because the
> vendor tree is already there.

Yes, this is the correct behavior when vendor/ exists, and there is no
go.mod and go.sum in the Buildroot tree to override it. If there is a
go.mod and go.sum in the Buildroot tree, the PoC patch deletes vendor/
and replaces it via "go mod vendor."
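
To make that precedence concrete, here is a sketch of the decision only.
It is not the code from the patch; the function name and the three
result strings are hypothetical, and the final fallback branch is my
assumption about what is left when neither condition holds.

```shell
# vendor_action <extracted-source-dir> <buildroot-package-dir>
# Prints which of three (hypothetically named) actions applies.
vendor_action() {
    src_dir=$1
    pkg_dir=$2
    if [ -f "$pkg_dir/go.mod" ] && [ -f "$pkg_dir/go.sum" ]; then
        # Buildroot-provided files take precedence: vendor/ is deleted and
        # rebuilt from them with "go mod vendor".
        echo "regenerate-vendor"
    elif [ -d "$src_dir/vendor" ]; then
        # A bundled vendor/ tree is used as-is, so nothing is downloaded.
        echo "use-bundled-vendor"
    else
        # Assumption: with neither present, dependencies must be fetched.
        echo "download-dependencies"
    fi
}
```

For a package such as docker-cli with a bundled vendor/ and nothing in the
Buildroot package directory, this prints "use-bundled-vendor", which
matches the behavior you observed.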

>  Note that I'm not saying that GOPROXY should not be set to off. What
> I'm saying is that that is pretty much independent of the rest of the
> go module support.

If you just want Go module support without go.mod, go.sum, or any
network fetching, then you have to change the following two environment
variables:

 - GO111MODULE: off -> on
 - GOPROXY: [none] (defaults to "direct") -> "off"

When we turn GO111MODULE to on, we imply GOPROXY=direct. This will cause
network calls during the "build" call.
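
Spelled out as a shell environment, the offline setup would look roughly
like this. It is only a sketch; the commented-out build line is an
illustration and not what the infra actually runs.

```shell
# Enable module mode, and forbid the Go tool from fetching anything.
export GO111MODULE=on
export GOPROXY=off

# Illustration only: with a vendor/ tree present, -mod=vendor tells the Go
# tool to resolve imports from vendor/ instead of the module cache.
# go build -mod=vendor ./...
```

With GOPROXY=off, a missing dependency makes the build fail instead of
silently going out to the network.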

>> Maybe, but you're making a lot of absolute statements here without
>> actually using the Go modules workflow yourself to know why those
>> statements are or aren't true.
>
>  I don't intend to make absolute statements, but I can hardly write AFAIU on
> every line :-)

I've provided a lot of examples of where overriding go.mod is useful:

 - When we add a download provider, we need an opportunity to adjust
   go.mod before the download step, which is far before the PATCH step.
 - Older Buildroot versions will use older releases of packages.
   Developers do not typically update the vendor/ tree of old releases.
 - Some packages use weird monkey-patching of GOPATH in their build
   scripts, which can be replicated with a single go.mod file instead.

I feel like it's not that big of a deal to copy go.mod and go.sum from
the package directory in Buildroot if they exist, and then agree to not
commit these files to the mainline Buildroot tree. The "copy them in"
step would be used by external packages, then. Even then, it's not that
big of a deal: I'm happy to remove the "copy these files in" line, so
this feels a bit like a garden shed :)

Best,
Christian


