Avoid hardcoding toolchains in build tools / compilers #185742

Open
uri-canva opened this issue Aug 8, 2022 · 15 comments

@uri-canva
Contributor

Users of nixpkgs can use packages in nixpkgs in a couple of different ways:

  1. within a derivation set, like nixpkgs itself, as an input of another derivation.
  2. in a nix environment, like a nix shell or NixOS system, but outside of a derivation set.
  3. in a non-nix environment, like a host OS using nixpkgs as a package manager.

For some programs, supporting all 3 is simple enough: all the inputs to the program are passed in at runtime via command line flags, configuration and stdin, so they can operate in exactly the same way.

For build tools and compilers however, the amount and complexity of implicit inputs read from the environment / built in configuration is such that they cannot run in the same way, and different versions are created (for example clang and clang-unwrapped, bazel_* with and without enableNixHacks). This doesn't always happen because build tools and compilers in nixpkgs are mostly used within a derivation set (use case 1), so sometimes they're built in ways that can only support that use case, or other use cases are less exercised and break more easily.

As a consequence of that, some build tools and compilers hardcode some of their inputs on the assumption that they're going to be used within the package set that they were built in. For example bazel hardcodes the python interpreter, shell and more, cmake hardcodes the libc it builds with.

I think this is something we should avoid: not only does it make supporting use cases 2 and 3 much harder, but it also means that if you want to change those inputs in use case 1, you have to rebuild the build tool / compiler. That can take a very long time, since they're usually very big programs with complex builds that often involve bootstrapping.

@uri-canva
Contributor Author

cc @stephank for your work in #181431.

@uri-canva
Contributor Author

@NixOS/bazel I plan on making changes to bazel based on this, to reduce the amount of hardcoded inputs that get passed into toolchains. In particular for bazel, hardcoding inputs makes it really really hard to ensure all the toolchains are configured in a way that bazel is fully aware of the inputs, which leads to a lot of pain when trying to set up remote execution.

@uri-canva
Contributor Author

ccing cmake maintainers: @ttuegel @LnL7 @AndersonTorres

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/avoid-hardcoding-toolchains-in-build-tools-compilers/20849/1

@stephank
Contributor

stephank commented Aug 9, 2022

You mention in #181431 (comment) you'd like a detection mechanism not specific to Nix, but I'm not sure what that would look like. For example, in the cmake case, upstream hardcodes FHS, so anything we do is going to be Nix-specific, I think. Do you maybe have an example of something that does things 'right' in this case? (Or do we just need upstream support?)

@uri-canva
Contributor Author

Oh yeah, sorry, I'm not familiar with cmake, but yes, in some cases you'd need upstream support. For example, upstream bazel hardcodes paths to the default macOS toolchain, so I sent a patch to look for it in PATH instead: bazelbuild/bazel#16010. If upstream supports the common CC, LD etc. environment variables, then making it not specific to Nix might mean looking for NIX_CC first and then falling back to CC, for example.
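
To make the fallback order concrete, here is a minimal Python sketch of one way a tool could resolve its C compiler. It is only an illustration, not what bazel or cmake actually do. The function name `resolve_cc` is hypothetical, and the sketch assumes that NIX_CC points at a wrapper directory containing `bin/cc`, while CC holds a compiler name or path as usual.

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_cc(env: Mapping[str, str]) -> Optional[str]:
    """Pick a C compiler: Nix-specific hint first, then CC, then PATH."""
    # Assumption: NIX_CC names a wrapper directory that contains bin/cc.
    nix_cc = env.get("NIX_CC")
    if nix_cc:
        return os.path.join(nix_cc, "bin", "cc")
    # Generic convention that works outside of Nix as well.
    cc = env.get("CC")
    if cc:
        return cc
    # Last resort: whatever `cc` is found on the given PATH (None if absent).
    return shutil.which("cc", path=env.get("PATH", ""))
```

Calling `resolve_cc(os.environ)` at startup, instead of baking a store path in at build time, would let the same binary work inside and outside a derivation.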

@AndersonTorres
Member

What about talking about it with some of those upstream developers? It is not uncommon at all: Meson developers and even some NetBSD developers are aware of the Nix ecosystem.

@stephank
Contributor

Finding non-Nix-specific ways for all cases is going to be difficult. We can try to set a standard in some cases maybe, but that may also make it less convincing for upstream to follow.

For example, PATH is limited in use, and in the case of #181431, what we're trying to locate is libc (not really in PATH). I'm also worried that the granularity with which we split up toolchains is inherently Nix-specific. LLVM is split up in a dozen derivations or so, all installed in different locations.

(Still think this is a good idea to pursue, though)

@uri-canva
Contributor Author

Yeah, there might be a limit to how far we can pursue this path. For libc and toolchains that share one frontend, like gcc and clang, if the compiler is configured correctly, with the right builtins, -print-file-name / -print-prog-name should work to an extent.
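
As a rough sketch of what asking the compiler could look like, the Python helper below runs the compiler driver with `-print-file-name=<name>` or `-print-prog-name=<name>` and returns the reported path. The helper name `ask_compiler` and its `run` parameter (there only so the call can be stubbed out) are hypothetical. It also assumes the driver echoes the bare name back when it cannot resolve it, which is how GCC behaves.

```python
import subprocess
from typing import Callable, Optional


def ask_compiler(
    cc: str,
    flag: str,
    name: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Optional[str]:
    """Ask the compiler driver where it would find a file or program."""
    result = run([cc, f"{flag}={name}"], capture_output=True, text=True, check=True)
    path = result.stdout.strip()
    # Assumption: an unresolved name is printed back unchanged, so treat
    # that (or empty output) as "not found".
    if not path or path == name:
        return None
    return path
```

For example, `ask_compiler("cc", "-print-file-name", "libc.so.6")` would locate libc through the compiler, and `ask_compiler("cc", "-print-prog-name", "ld")` would locate the linker it intends to use, without the build tool hardcoding either.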

@AndersonTorres yes, that is a good idea, I will make a post for Bazel as I'm most familiar with that.

@eli-schwartz

Hardcoding a libc is, I guess, part of cmake's "find libraries or headers" methods? That sounds like a very bad thing to hardcode at build time, since it would preclude correctly using something like musl-gcc, and makes little sense for cross-compiling... Meson looks this up from the compiler, yeah.

@uri-canva
Contributor Author

Posted this issue for upstream bazel: bazelbuild/bazel#16009.

@layus
Member

layus commented Sep 2, 2022

I think this is an extremely valuable goal to pursue. I am currently cross-compiling packages for embedded platforms that do not use nix at all. The hassle resides in 1. coaxing the upstream binary toolchain into a valid stdenv, and 2. fixing all the nixpkgs packages that do not compile correctly with --rootfs.
Coaxing the stdenv means splitting the toolchain into a libc, a cc and the bintools that work together.

Do I understand correctly that what you want is to introduce the flexibility to specify these tools dynamically at runtime? Any idea how to implement this in practice so that we can still specify the tools inside nix when it makes sense?

@uri-canva
Contributor Author

Yes, that's what I have in mind. When using those binaries within nix we could either:

  1. Wrap the tools with wrappers that provide the appropriate default configuration, either through environment variables or command line flags. For example, the wrapper could set CC etc. if it isn't set.
  2. Pass the appropriate default configuration via the functions you'd normally use in nixpkgs. For example bazel would not be very usable from within nixpkgs, but buildBazelPackage would take care of calling bazel in the right way.
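
A minimal Python sketch of option 1, assuming a wrapper that only fills in variables the caller has not already set. The names `apply_defaults` and `exec_wrapped` and the placeholder default paths are hypothetical. Real nixpkgs wrappers are generated shell scripts, so this only illustrates the "default, but overridable" behaviour.

```python
import os
from typing import Dict, List, Mapping


def apply_defaults(env: Mapping[str, str], defaults: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of env where unset variables take their default value."""
    merged = dict(env)
    for key, value in defaults.items():
        # Anything the caller already set (even to an empty string) wins.
        merged.setdefault(key, value)
    return merged


def exec_wrapped(argv: List[str], defaults: Mapping[str, str]) -> None:
    """Replace the current process with the wrapped tool, using the merged env."""
    os.execvpe(argv[0], argv, apply_defaults(os.environ, defaults))


# Hypothetical defaults that a wrapper built inside nixpkgs might carry.
EXAMPLE_DEFAULTS = {"CC": "/path/to/default/cc", "LD": "/path/to/default/ld"}
```

With this shape, `exec_wrapped(["bazel", "build", "//..."], EXAMPLE_DEFAULTS)` would use the packaged toolchain by default, while a user who exports CC beforehand keeps their own choice and nothing needs rebuilding.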

@uri-canva
Contributor Author

This is also a good approach to take for some of the more complex build tools, but not sure if it's generally applicable: https://github.com/tweag/jupyterWith.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/my-issues-when-pushing-nixos-to-companies/28629/10
