Workaround 3.3.0 crash on aarch64 #439

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

osyoyu
Contributor

@osyoyu osyoyu commented Feb 16, 2024

Ruby 3.3.0 has a bug which crashes many practical programs on linux-aarch64 (especially on Linux VMs on macOS, e.g. Docker Desktop).
https://bugs.ruby-lang.org/issues/20085

This bug is fixed upstream and the fix is planned to be backported, but no date has been given yet for the 3.3.1 release.
ruby/ruby#9371

This patch works around the bug by passing ASFLAGS to ./configure, as described in https://bugs.ruby-lang.org/issues/20085#note-5 .

@osyoyu
Contributor Author

osyoyu commented Feb 16, 2024

# unpatched version
% docker run --rm ruby:3.3-bullseye ruby -e 'Fiber.new{}.resume'
Unable to find image 'ruby:3.3-bullseye' locally
(snip)
Digest: sha256:89a45a72834c54c87ceabf445e96d95af54299b74fe89e4fb943a1bcefddf195
Status: Downloaded newer image for ruby:3.3-bullseye
-e:1: [BUG] Segmentation fault at 0x000affff8d6f3b90
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [aarch64-linux]

-- Control frame information -----------------------------------------------
c:0003 p:---- s:0010 e:000009 CFUNC  :resume
c:0002 p:0007 s:0006 E:000e70 EVAL   -e:1 [FINISH]
c:0001 p:0000 s:0003 E:001be0 DUMMY  [FINISH]

-- Ruby level backtrace information ----------------------------------------
-e:1:in `<main>'
-e:1:in `resume'

-- Threading information ---------------------------------------------------
Total ractor count: 1
Ruby thread count for this ractor: 1

-- Machine register context ------------------------------------------------
  x0: 0x0000aaaaf6063a70  x1: 0x0000aaaaf62fe740  x2: 0x0000ffffd2176e30
  x3: 0x0000ffff7342ef60  x4: 0x0000ffff7342f018  x5: 0x0000ffff7344f000
  x6: 0x0000000000081000  x7: 0x00000000000a1000 x18: 0x0000000000000000
 x19: 0x0000000000000000 x20: 0x0000000000000000 x21: 0x0000000000000000
 x22: 0x0000000000000000 x23: 0x0000000000000000 x24: 0x0000000000000000
 x25: 0x0000000000000000 x26: 0x0000000000000000 x27: 0x0000000000000000
 x28: 0x0000000000000000 x29: 0x0000000000000000  sp: 0x0000ffff7342f000
 fau: 0x000affff8d6f3b90

-- C level backtrace information -------------------------------------------

# patched version
% docker run --rm -it 3f932a4a5a75465375ff790a11f8c4a95804bce779bb27edfe6e6 ruby -e 'Fiber.new{}.resume'

%
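
The unpatched/patched comparison above can also be scripted. What follows is a minimal sketch, assuming a hypothetical helper named `runs_cleanly?` (it is not part of Ruby or of the image). It runs a snippet in a child interpreter and reports whether that child exited with status 0, so a crash in the child does not take down the calling script.

```ruby
require "rbconfig"

# Hypothetical helper: runs `script` in a separate Ruby process and returns
# true only when that process exits with status 0. A child that crashes
# (for example with the "[BUG] Segmentation fault" shown above) yields false.
def runs_cleanly?(script, ruby: RbConfig.ruby)
  system(ruby, "-e", script, out: File::NULL, err: File::NULL) == true
end

# The reproducer used in this thread.
FIBER_REPRODUCER = "Fiber.new{}.resume"

puts "fiber reproducer ok: #{runs_cleanly?(FIBER_REPRODUCER)}"
```

Running the reproducer in a child process is a design choice of this sketch: the calling script keeps running and can report the result either way.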

@LaurentGoderre
Member

Can the check also limit it to the arm64 architecture, to be safer?

@tianon
Member

tianon commented Feb 16, 2024

Doing that correctly is probably going to require refactoring our "distro arches" data into a function -- I'm happy to make that change. 😅 🙇

ruby/ruby#9385 is the official (merged) backport of this, which makes me feel OK about applying it. It might even actually be worth applying the patch directly instead, which would remove any concerns about applying this only to arm64. 👀

Ruby 3.3.0 has a bug which crashes many programs on arm64.
https://bugs.ruby-lang.org/issues/20085

This bug is fixed in upstream (ruby/ruby#9371)
and is planned to be backported, but no date is given yet for the 3.3.1
release.

This patch works around this bug by applying the upstream fix/backport in
ruby/ruby#9385 .

Co-authored-by: Tianon Gravi <admwiggin@gmail.com>
oakbow added a commit to icare-jp-oss/ruby that referenced this pull request Feb 19, 2024
Member

@tianon tianon left a comment

Thank you! ❤️

@tianon tianon merged commit a27888b into docker-library:master Feb 20, 2024
25 checks passed
docker-library-bot added a commit to docker-library-bot/official-images that referenced this pull request Feb 20, 2024
Changes:

- docker-library/ruby@a27888b: Merge pull request docker-library/ruby#439 from osyoyu/fix-crash-3.3.0
- docker-library/ruby@cfdac1e: Workaround 3.3.0 crash on arm64
@osyoyu
Contributor Author

osyoyu commented Feb 22, 2024

@tianon @LaurentGoderre Has this been released? I thought merging docker-library/official-images#16285 would cut a release, but the crash still reproduces on ruby:3.3 and the flag isn't showing up in RbConfig::CONFIG. Did I miss something?

% docker run --rm ruby:3.3-bookworm ruby -e 'pp RbConfig::CONFIG["ASFLAGS"]'
"" # expected to be "-mbranch-protection=pac-ret"

(not in a hurry, but just wanted to confirm)
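
The same check can be expressed as a small predicate. This is only a sketch: `possibly_affected?` is a hypothetical name, and treating "Ruby 3.3.0 on aarch64-linux without a `-mbranch-protection` entry in ASFLAGS" as a sign of an affected build is an assumption drawn from this thread, not an official diagnostic. Running the `Fiber.new{}.resume` reproducer remains the more direct test.

```ruby
require "rbconfig"

# Hypothetical helper (assumption based on this thread): returns true when the
# build is Ruby 3.3.0 on aarch64-linux and its recorded ASFLAGS do not mention
# -mbranch-protection.
def possibly_affected?(config = RbConfig::CONFIG, version = RUBY_VERSION)
  arm64_linux = config["arch"].to_s.start_with?("aarch64-linux")
  flagged = config["ASFLAGS"].to_s.include?("-mbranch-protection")
  version == "3.3.0" && arm64_linux && !flagged
end

puts "possibly affected: #{possibly_affected?}"
```

Passing the configuration and version as arguments keeps the predicate easy to try against values copied from another image.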

@yosifkit
Member

They did get rebuilt, but it seems that (with the configure flags we currently set) the ruby patch only applies the flag in alpine-based images.

$ docker run -it --rm --platform=linux/arm64 ruby:3.3-alpine ruby -e 'pp RbConfig::CONFIG["ASFLAGS"]'
Unable to find image 'ruby:3.3-alpine' locally
3.3-alpine: Pulling from library/ruby
bca4290a9639: Pull complete
6485573fb761: Pull complete
c1a88e913070: Pull complete
06b396efb7e4: Pull complete
648c738e4580: Pull complete
Digest: sha256:6181164fb38d9992517514317e3df6420f0cec3a401616ec65479eaab62fd31d
Status: Downloaded newer image
"-mbranch-protection=pac-ret"

$ docker run -it --rm --platform=linux/arm64 ruby:3.3 ruby -e 'pp RbConfig::CONFIG["ASFLAGS"]'
Unable to find image 'ruby:3.3' locally
3.3: Pulling from library/ruby
c2964e85ea54: Pull complete
d3436c315a5d: Pull complete
603ae72c83b1: Pull complete
bcabfc6c415b: Pull complete
97783dc270ae: Pull complete
03845a4dfdd9: Pull complete
158fc18a11a4: Pull complete
Digest: sha256:94cb8c8b8e09dad143148ce698828fa904793fde26b5c7f60b3ae17cecf7c1ad
Status: Downloaded newer image for ruby:3.3
""

@tianon
Member

tianon commented Feb 23, 2024

Maybe something about the upstream patch is incomplete?

@mintuhouse

Patch seems to have been correctly applied in bullseye too (only missing on bookworm)

docker run --rm ruby:3.3-bullseye ruby -e 'pp RbConfig::CONFIG["ASFLAGS"]'
"-mbranch-protection=pac-ret"

@tianon
Member

tianon commented Feb 26, 2024

We definitely apply the patch in the bookworm images:

# workaround crash on arm64
# https://bugs.ruby-lang.org/issues/20085
# https://github.com/ruby/ruby/pull/9385 <- https://github.com/ruby/ruby/pull/9371
wget -O 'arm64-fix.patch' 'https://github.com/ruby/ruby/commit/7f97e3540ce448b501bcbee15afac5f94bb22dd9.patch?full_index=1'; \
echo '86bc65415fd62cb2272a4df249f39fb79db15617ad05c540e05a22f02eae73b3 *arm64-fix.patch' | sha256sum --check --strict; \
patch -p1 -i arm64-fix.patch; \
rm arm64-fix.patch; \

If we click through the various layers of https://explore.ggcr.dev/?image=ruby:3.3-bullseye down into the attestation for arm64:

We can also see it in the "image config" / history:

Hence my wondering if the patch itself was incomplete and doesn't fix this correctly in all cases where it should.
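
For readers unfamiliar with the `sha256sum --check --strict` step in the Dockerfile fragment above, the comparison it performs can be sketched in Ruby. `sha256_matches?` is a hypothetical helper name, and the sketch covers only the digest comparison, not the download or the `patch` invocation.

```ruby
require "digest/sha2"
require "tempfile"

# Hypothetical helper: true when the file at `path` has the expected SHA-256
# digest (given as a hex string), which is the comparison sha256sum --check does.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end

# Self-contained demonstration using an empty temporary file, whose SHA-256
# digest is a well-known constant.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

Tempfile.create("empty") do |file|
  puts "empty file matches: #{sha256_matches?(file.path, EMPTY_SHA256)}"
end
```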

@KevinCarterDev

Patch seems to have been correctly applied in bullseye too (only missing on bookworm)

docker run --rm ruby:3.3-bullseye ruby -e 'pp RbConfig::CONFIG["ASFLAGS"]'
"-mbranch-protection=pac-ret"

It's also missing on the slim images:

% docker run --rm ruby:3.3-slim-bullseye ruby -e 'pp RbConfig::CONFIG["ASFLAGS"]'
""

% docker run --rm ruby:3.3-slim-bookworm ruby -e 'pp RbConfig::CONFIG["ASFLAGS"]'
""

@tianon
Member

tianon commented Feb 28, 2024

To clarify, the patch is not missing. The patch is definitely 100% downloaded and applied (and slim being affected in the same way is also entirely by design -- they're intentionally built in as close to exactly the same way as we can manage, enforced by using the same underlying Dockerfile template). The patch does not seem complete, however, such that the bug is still present even with the patch applied.
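
The per-tag checks reported in this thread can be gathered into one script. This is a rough sketch under a few assumptions: Docker is installed and can run linux/arm64 images, the helper names are hypothetical, and the `--run` switch is a convention of this sketch only. Without it, the script just prints the commands it would run.

```ruby
require "open3"

# The Ruby one-liner used throughout this thread.
ASFLAGS_SNIPPET = 'pp RbConfig::CONFIG["ASFLAGS"]'

# Tags mentioned by participants in this thread.
TAGS = %w[
  ruby:3.3 ruby:3.3-alpine ruby:3.3-bullseye ruby:3.3-bookworm
  ruby:3.3-slim-bullseye ruby:3.3-slim-bookworm
].freeze

# Hypothetical helper: builds the docker argv for one tag.
def asflags_check_command(tag, platform: "linux/arm64")
  ["docker", "run", "--rm", "--platform=#{platform}", tag, "ruby", "-e", ASFLAGS_SNIPPET]
end

# Hypothetical helper: runs the check and returns the printed value,
# or nil when the docker command fails.
def asflags_for(tag)
  out, status = Open3.capture2(*asflags_check_command(tag))
  status.success? ? out.strip : nil
end

TAGS.each do |tag|
  if ARGV.include?("--run")
    puts "#{tag}: #{asflags_for(tag).inspect}"
  else
    puts asflags_check_command(tag).join(" ")
  end
end
```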

@hachi8833
Contributor

I still encounter the same issue (ruby/ruby#9371) with ruby:3.3.0-slim-bookworm, so I had to switch the image to ruby:3.3.0-bullseye to avoid the SEGV.

@tianon
Member

tianon commented Feb 29, 2024

Reported upstream with more detail in https://bugs.ruby-lang.org/issues/20085#note-29

@navels

navels commented Feb 29, 2024

Reported upstream with more detail in https://bugs.ruby-lang.org/issues/20085#note-29

Good luck getting a reply!

Well that was quick lol

@tianon
Member

tianon commented Feb 29, 2024

Can someone who is able to reproduce please help test #440?

docker build --pull 'https://github.com/docker-library/ruby.git#refs/pull/440/merge:3.3/bookworm'

@navels

navels commented Feb 29, 2024

Success!

> ruby -e "Fiber.new{}.resume"
> 

@osyoyu
Contributor Author

osyoyu commented Mar 9, 2024

Sorry for the late reply. Everything works perfectly now! Thank you very much.

martin-g pushed a commit to martin-g/docker-official-images that referenced this pull request Apr 3, 2024
Changes:

- docker-library/ruby@a27888b: Merge pull request docker-library/ruby#439 from osyoyu/fix-crash-3.3.0
- docker-library/ruby@cfdac1e: Workaround 3.3.0 crash on arm64