Agentbeat packaging failures: aarch64-linux-gnu/bin/ld.gold: internal error in maybe_apply_stub, at ../../gold/aarch64.cc:5407 #41270
Comments
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
Quoting @rdner on Slack with more context on how to reproduce this:
|
@cmacknz as this is only happening when we are bumping the GCP SDK to a newer version, should we ask observability to investigate? |
It happened with both the AWS SDK bump and the GCP SDK bump, but I'm not sure we can conclude that it has something to do with cloud SDKs. Both of those PRs have a large set of dependencies and it's more likely there is some conflict in an indirect dependency (possibly different each time) triggering a bug in the linker. |
This is very likely a bug in the gcc cross-compiler or gold. Checking the versions:
$ ld.gold --version
GNU gold (GNU Binutils for Debian 2.28) 1.14
Copyright (C) 2017 Free Software Foundation, Inc.
$ aarch64-linux-gnu-gcc --version
aarch64-linux-gnu-gcc (Debian 6.3.0-18) 6.3.0 20170516
Copyright (C) 2016 Free Software Foundation, Inc.
$ gcc --version
gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
Copyright (C) 2016 Free Software Foundation, Inc.
As you can see, the toolchain is gcc 6, from 2017. The current stable GCC release is version 14. We should probably focus on getting these up to date; that is likely to improve compatibility in general. |
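A check like the one above could be scripted so that an outdated linker is flagged before packaging starts. The following is only a rough sketch: the function names are made up for illustration, and the minimum version passed in is an arbitrary placeholder, not a release known to contain a fix for the `maybe_apply_stub` error.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and not part of Beats or golang-crossbuild.

# Extract the Binutils version from the first line of `ld.gold --version`, e.g.
# "GNU gold (GNU Binutils for Debian 2.28) 1.14" -> "2.28".
binutils_version() {
    printf '%s\n' "$1" |
        sed -n 's/^GNU gold (GNU Binutils[^)]* \([0-9][0-9.]*\)) .*/\1/p'
}

# Succeed if dotted version $1 is strictly older than $2.
# Only the major and minor components are compared.
version_older_than() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        if (x[1] + 0 != y[1] + 0) exit !((x[1] + 0) < (y[1] + 0))
        exit !((x[2] + 0) < (y[2] + 0))
    }'
}

# Print a warning if the banner in $1 reports Binutils older than $2.
# The threshold is chosen by the caller; this sketch does not suggest one.
warn_if_old_linker() {
    found=$(binutils_version "$1")
    if [ -n "$found" ] && version_older_than "$found" "$2"; then
        printf 'warning: Binutils %s is older than %s\n' "$found" "$2"
    fi
}

# Example use inside the build image (not executed here):
#   warn_if_old_linker "$(ld.gold --version | head -n 1)" "<minimum version>"
```

With the banner quoted above, `binutils_version` yields `2.28`, so any threshold above that would trigger the warning.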
Agree with @mauri870's comment; bumping the linker and gcc versions is definitely a good idea.
I spent some time debugging this when I saw it in the other Go 1.23.2 upgrade PR. Initially, I couldn't reproduce it, and everything worked fine even when I tried different Go versions. I was using the command [...]
The reproducer shared here does replicate the issue on my setup as well: #41270 (comment)
I compared the Docker commands: the one invoked internally with this command: ( [...]
After experimenting for a while, I noticed that the [...] When [...]
So, here's another reproducer:
This fails too, but I believe the culprit here is Line 70 in 7be47da
[ Update: I checked exactly which flag triggers the issue, and it is -N. With only -l (disabling inlining) the build works, but disabling optimizations with -N breaks it. ]
It seems there's an issue with the linker when compiler optimizations are disabled. Could someone also see if I think this also explains why the CI passes when packaging agentbeat because |
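For anyone who wants to repeat the flag bisection described above, here is a small sketch that prints one `go build` invocation per `-gcflags` variant. `-N` (disable optimizations) and `-l` (disable inlining) are the standard Go compiler flags; everything else is an assumption: the function names and the `PKG` variable are hypothetical, and the cross-compilation environment and exact build command from the reproducer are not shown in this thread, so they are not included.

```shell
#!/bin/sh
# Sketch only: prints commands instead of running them, so each variant can be
# tried by hand inside the crossbuild container.

# Map a variant name to the -gcflags value it stands for.
gcflags_for() {
    case "$1" in
        none)      printf '%s\n' '' ;;          # default: optimizations and inlining on
        no-inline) printf '%s\n' 'all=-l' ;;    # reported in this thread to work
        no-opt)    printf '%s\n' 'all=-N' ;;    # reported in this thread to fail
        both)      printf '%s\n' 'all=-N -l' ;; # both debugging flags together
        *)         return 1 ;;
    esac
}

# Print one build command per variant. PKG is a placeholder for the package
# being built and defaults to the current directory.
print_bisect_commands() {
    for variant in none no-inline no-opt both; do
        printf 'go build -gcflags "%s" %s   # %s\n' \
            "$(gcflags_for "$variant")" "${PKG:-.}" "$variant"
    done
}
```

Calling `print_bisect_commands` lists the four commands. If the observation above holds, only the variants that include `-N` should hit the linker error.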
For snapshot builds I think we have DEV=true on, but not for staging. beats/.buildkite/packaging.pipeline.yml Lines 82 to 91 in 3492089
I need to dig further to see where the snapshot DRA artifacts actually get used. If these end up in the official snapshot images, we need to turn these off; otherwise the way they are built doesn't match what we eventually release. Upgrading the cross toolchain is a good idea. I recall we were limited by the version of glibc we needed to build the 7.17 branch on all supported platforms, but there have been a few revisions to the support matrix since then. |
@cmacknz is there any case where we need |
It enables using a debugger. I would think locally rebuilding with DEV=true would be acceptable, since this capability is not in the release binaries anyway. |
Packaging with `DEV=true` adds additional Go flags that sometimes lead to linker failures using the old versions of `ld.gold`. See elastic#41270
According to our support matrix https://www.elastic.co/support/matrix, we still support Debian 10 (released on 2019-07-06, EOL 2022-09-10) in the latest version of Beats (8.15.x). This is the main reason why we're still crossbuilding using this image docker.elastic.co/beats-dev/golang-crossbuild:1.22.8-darwin-arm64-debian10. AFAIK, there is a strict dependency on a certain glibc version when we build the Beats binaries and this is why we have to use Debian 10 for building them. For more details, please read this thread #34921 (comment) @shmsr thank you very much for your investigation, I opened a PR to remove the This should prevent such failures in the future. |
We're dropping support for Debian 10, so no need to crossbuild using the outdated image anymore. The old linker in Debian 10 caused a packaging issue with some Go dependency updates #41270 So, this update should also help with that. This also updates the statically linked glibc from 2.28 to 2.31.
We're dropping support for Debian 10, so no need to crossbuild using the outdated image anymore. The old linker in Debian 10 caused a packaging issue with some Go dependency updates #41270 So, this update should also help with that. This also updates the statically linked glibc from 2.28 to 2.31. (cherry picked from commit 4140d15) Co-authored-by: Denis <denis.rechkunov@elastic.co>
* Fix like elastic/beats#41270 (#5868)
(cherry picked from commit c9cd580)
# Conflicts:
# .buildkite/scripts/steps/integration-package.sh
# .buildkite/scripts/steps/k8s-extended-tests.sh
* Update integration-package.sh
* Update k8s-extended-tests.sh
---------
Co-authored-by: Lee E Hinman <57081003+leehinman@users.noreply.github.com>
Co-authored-by: Pierre HILBERT <pierre.hilbert@elastic.co>
I managed to reproduce this issue today on an ARM64 MacBook:
The same error happens with Go 1.23.0 and Go 1.22.10. I'm using |
The following error has been observed inconsistently in Beats packaging after dependency updates of the AWS and GCP SDKs, respectively.
This error was resolved by reverting the following two unrelated PRs:
The go.mod changes do not overlap for those two PRs. Some non-obvious problem in the dependency graphs is causing this.
I suspect this is likely related to a change in https://github.com/elastic/golang-crossbuild that only reproduces under specific but infrequent conditions. Possibly it is a bug in aarch64-linux-gnu-gcc and updating the version included in the crossbuild image would resolve it.