SIGBUS errors on Mac ARM with 17.0.9+9 #938

Closed
alexet opened this issue Nov 2, 2023 · 8 comments
Labels
bug Something isn't working stale Waiting on OP

Comments

@alexet

alexet commented Nov 2, 2023

Please provide a brief summary of the bug

Since updating to 17.0.9 we have seen multiple SIGBUS crashes on macOS, though not reproducibly. We have not seen this with 17.0.8.

Please provide steps to reproduce where possible

I haven't managed to reproduce this reliably, nor with a reduced amount of code.

The code in question is basically:

ByteBuffer buf = ByteBuffer.allocateDirect(bufferSize);
IntBuffer ibuf = buf.asIntBuffer();
int[] intTuple = new int[3];

// Some code to fill the buffer from a file

// This is the line in the stack trace. It is executed in a loop.
ibuf.get(intTuple);

However it seems that it isn't quite enough to run that in a loop.
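
For reference, a self-contained version of that loop might look like the sketch below. The class name, method name, and sizes are hypothetical, and the buffer is filled with known values rather than from a file. As noted above, a loop like this on its own has not been shown to trigger the crash; it only illustrates the access pattern.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

public class DirectBufferLoop {
    // Hypothetical tuple width, matching the int[3] in the snippet above.
    static final int TUPLE_SIZE = 3;

    /**
     * Fills a direct buffer with the ints 0..n-1, reads them back in
     * tuples of TUPLE_SIZE, and returns the sum of all values read.
     */
    static long readAllTuples(int tupleCount) {
        ByteBuffer buf = ByteBuffer.allocateDirect(tupleCount * TUPLE_SIZE * Integer.BYTES);
        IntBuffer ibuf = buf.asIntBuffer();

        // Stand-in for "fill the buffer from a file" (assumption):
        // absolute puts leave the buffer position at 0.
        for (int i = 0; i < ibuf.capacity(); i++) {
            ibuf.put(i, i);
        }

        int[] intTuple = new int[TUPLE_SIZE];
        long sum = 0;
        while (ibuf.remaining() >= TUPLE_SIZE) {
            ibuf.get(intTuple); // bulk relative get, the call seen in the stack trace
            for (int v : intTuple) {
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long total = 0;
        // Repeat enough times for the JIT to compile the loop.
        for (int round = 0; round < 10_000; round++) {
            total += readAllTuples(1024);
        }
        System.out.println(total);
    }
}
```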

Expected Results

Not a jvm crash.

Actual Results

Log highlights

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0x0000000107968c20, pid=85389, tid=38675
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.9+9 (17.0.9+9) (build 17.0.9+9)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.9+9 (17.0.9+9, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64)
# Problematic frame:
# V  [libjvm.dylib+0x968c20]  MarkActivationClosure::do_code_blob(CodeBlob*)+0x3c
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#
---------------  T H R E A D  ---------------

Current thread (0x0000000129874a00):  JavaThread "pool-1-thread-1" [_thread_in_vm, id=38675, stack(0x0000000289270000,0x0000000289473000)]

Stack: [0x0000000289270000,0x0000000289473000],  sp=0x00000002894715e0,  free space=2053k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x968c20]  MarkActivationClosure::do_code_blob(CodeBlob*)+0x3c
V  [libjvm.dylib+0x9a83d8]  JavaThread::nmethods_do(CodeBlobClosure*)+0xa8
V  [libjvm.dylib+0x41bd98]  HandshakeState::process_by_self(bool)+0x340
V  [libjvm.dylib+0x85fccc]  SafepointMechanism::process(JavaThread*, bool)+0x5c
V  [libjvm.dylib+0x9dce4c]  Unsafe_CopySwapMemory0(JNIEnv_*, _jobject*, _jobject*, long, _jobject*, long, long, long)+0xe8
J 3481  jdk.internal.misc.Unsafe.copySwapMemory0(Ljava/lang/Object;JLjava/lang/Object;JJJ)V java.base@17.0.9 (0 bytes) @ 0x00000001114e0f54 [0x00000001114e0ec0+0x0000000000000094]
V  [libjvm.dylib+0xc6f428]  TemplateInterpreter::_active_table+0x0

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 3481  jdk.internal.misc.Unsafe.copySwapMemory0(Ljava/lang/Object;JLjava/lang/Object;JJJ)V java.base@17.0.9 (0 bytes) @ 0x00000001114e0f54 [0x00000001114e0ec0+0x0000000000000094]
J 3572 c2 com.semmle.inmemory.relations.IteratorLoader.tryReadTuple()V (28 bytes) @ 0x00000001114f667c [0x00000001114f6540+0x000000000000013c]
J 3566 c1 com.semmle.inmemory.relations.IteratorLoader.next()[J (27 bytes) @ 0x0000000109f9860c [0x0000000109f983c0+0x000000000000024c]

Full log
hs_err_pid85389.log

What Java Version are you using?

openjdk version "17.0.9" 2023-10-17

What is your operating system and platform?

OS: macOS 11.7

Architecture: aarch64

How did you install Java?

Tarball from Adoptium, then:

jlink \
  --output "jdk-codeql" \
  --module-path jmods/ \
  --compress=2 \
  --strip-debug \
  --add-modules java.management,java.instrument,java.naming,jdk.management.agent,jdk.zipfs,jdk.compiler,java.xml,java.sql,java.rmi,java.desktop,jdk.unsupported,jdk.crypto.ec

Did it work before?

Yes, it works with 17.0.8.1.

Did you test with the latest update version?

We haven't seen it with Java 21, but we haven't done enough testing with Java 21 to say that it isn't an issue there.

Did you test with other Java versions?

No response

Relevant log output

Full log attached above.

@alexet alexet added the bug Something isn't working label Nov 2, 2023
@karianna
Contributor

karianna commented Nov 2, 2023

@alexet It looks like your code is using JNI or a library that does some sort of off-heap storage/processing. Is that correct?

@alexet
Author

alexet commented Nov 7, 2023

We use NIO to fill the byte buffer, but we use no JNI ourselves. We don't strictly need allocateDirect in this instance, but it can be faster with NIO.

@karianna
Contributor

Ah OK, I think I see. This is being caused by Semmle, a code scanning agent that was acquired by GitHub. I think the recommendation here is to get off the unsupported original Semmle technology and move to GitHub Advanced Security.

Recommend you contact GitHub / Semmle team (if they still exist).

@alexet
Author

alexet commented Nov 14, 2023

We are that team. The agent is not involved at all in this case and is not loaded.

This is a test where we are running plain Java code. Unfortunately, because we have not been able to reduce it, it is too much internal code to post here.

I can confirm that this issue seems to be fixed in JDK 21 (but using JDK 21 is not really possible for a while due to other issues).

The issue only seems to be present on some sets of machines. The set that succeeds runs macOS 13 and the set that fails runs macOS 11, but there are other differences between those CI runners that make it harder to pin down the cause.

@karianna
Contributor

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 3481  jdk.internal.misc.Unsafe.copySwapMemory0(Ljava/lang/Object;JLjava/lang/Object;JJJ)V java.base@17.0.9 (0 bytes) @ 0x00000001114e0f54 [0x00000001114e0ec0+0x0000000000000094]
J 3572 c2 com.semmle.inmemory.relations.IteratorLoader.tryReadTuple()V (28 bytes) @ 0x00000001114f667c [0x00000001114f6540+0x000000000000013c]

This is the section where you're calling out to the Unsafe library (and, as such, manipulating memory outside of the JVM's control). Perhaps it's not interacting well with older macOS versions. Can that be refactored somehow?

@alexet
Author

alexet commented Nov 14, 2023

The call stack is missing multiple inlined entries, which can be seen in the annotated dump of the JIT code. The call instruction is annotated with several inlined frames:

             ;*invokevirtual copySwapMemory0 {reexecute=0 rethrow=0 return_oop=0}
                      ; - jdk.internal.misc.Unsafe::copySwapMemory@33
                      ; - jdk.internal.misc.ScopedMemoryAccess::copySwapMemoryInternal@34
                      ; - jdk.internal.misc.ScopedMemoryAccess::copySwapMemory@14
                      ; - java.nio.IntBuffer::getArray@72
                      ; - java.nio.IntBuffer::get@39
                      ; - java.nio.IntBuffer::get@5
                      ; - com.semmle.inmemory.relations.AbstractLoader::read@79 (line 44)
                      ; - com.semmle.inmemory.relations.IteratorLoader::tryReadTuple@6 (line 61)

So the Unsafe call is an internal implementation detail of the JDK, and we can't easily avoid it.
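
One plausible reading of why the inlined frames end in copySwapMemory rather than a plain copy: a ByteBuffer is big-endian by default, an IntBuffer view inherits that order, and aarch64 is little-endian, so a bulk get into an int[] has to byte-swap. The sketch below only checks those byte orders; the class and method names are hypothetical. It does not establish that switching the buffer to native order avoids the crash, and doing so would also change how the file contents are interpreted.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

public class ByteOrderCheck {
    /** True if an IntBuffer view of a fresh direct buffer is big-endian. */
    static boolean defaultViewIsBigEndian() {
        IntBuffer view = ByteBuffer.allocateDirect(16).asIntBuffer();
        return view.order() == ByteOrder.BIG_ENDIAN;
    }

    /** True if an IntBuffer view of the given buffer differs from the platform byte order. */
    static boolean viewNeedsSwap(ByteBuffer buf) {
        return buf.asIntBuffer().order() != ByteOrder.nativeOrder();
    }

    /** Same check for a buffer explicitly switched to native order before the view is created. */
    static boolean nativeOrderedViewNeedsSwap() {
        ByteBuffer buf = ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());
        return viewNeedsSwap(buf);
    }

    public static void main(String[] args) {
        System.out.println("native order: " + ByteOrder.nativeOrder());
        System.out.println("default view is big-endian: " + defaultViewIsBigEndian());
        System.out.println("default view needs swap: " + viewNeedsSwap(ByteBuffer.allocateDirect(16)));
        System.out.println("native-ordered view needs swap: " + nativeOrderedViewNeedsSwap());
    }
}
```

Note that the view captures the byte order at the time asIntBuffer() is called, so order(...) has to be set on the ByteBuffer before creating the view.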

@karianna
Contributor

Ah OK thanks. I think the challenge here is going to be creating a reproducer. Is there a simple Java program you can write that invokes com.semmle.inmemory.relations.IteratorLoader::tryReadTuple@6 (line 61)? Is it a library we can test against from Maven central or somewhere?


We are marking this issue as stale because it has not been updated for a while. This is just a way to keep the support issues queue manageable.
It will be closed soon unless the stale label is removed by a committer, or a new comment is made.
