-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8340401: DcmdMBeanPermissionsTest.java and SystemDumpMapTest.java fail with assert(_stack_base != nullptr) failed: Sanity check #21091
Conversation
👋 Welcome back stooke! A progress list of the required criteria for merging this PR into |
@stooke This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 69 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@tstuefe, @kevinjwalls, @dholmes-ora) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
Would be good to go ahead with this. |
Nothing interesting happened. Meaning, no crashes! |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is good enough. The setting of stack size can race the setting of stack_base - you could observe stack size set, but stack base still unset.
What I would do instead is:
give us a new function in Thread that, by contract, does not assert but returns NULL (similar to e.g. Thread::curren_or_null()
):
address stack_base_or_null() const { return _stack_base; }
use this in vma_touches_thread_stack
to skip NULL base.
Alternatively, in record_stack_base_and_size
, we need a store store barrier between the two writes, and store stack size first.
I think we have understood why this is happening on Windows. See David's comment in the JBS issue. |
I guess this could be a real issue on Windows-aarch64. The storestore barrier would work for me. But we probably need an audit of every JavaThread iterator use in case the same problem can occur elsewhere. |
Mailing list message from Thomas Stüfe on hotspot-runtime-dev: On Fri, Sep 20, 2024 at 12:44?PM David Holmes <dholmes at openjdk.org> wrote:
I don't think this is limited to JavaThread Eg in G1: - parent thread calls WorkerThreads::set_active_workers Admittedly, this is improbable. But why risk it? We don't lose anything by ------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this update in a local tier1 linux and windows, and in the CI tier3 and tier5 windows where I'm seeing all passes, lots of ZGC usage (which remains a more likely way to trigger this, but not essential).
Simon if this attracts another reviewer, I would be happy to see this go ahead and calm down the CI noise (every tier5 run turns up a few failures).
Like the other test change, if we want to meditate further on it and follow up with a better solution that's great.
Have you tested with Windows arm, too?
Can't we just problem-list it for now? I would like that better than adding a solution that we already know won't work for all cases (which would make e.g. hunting the bug tail when backporting more confusing). |
No. I get the theory, but has anybody seen an issue there? |
Yes, Simon is working on a solution. This may just be a case of too-many-reviewer-opinions-paralyses. Can you problemlist the test or should we do it? |
Would be great if you could, thanks. |
okay: #21109 |
@tstuefe whether other kinds of thread iterators have potential issues with the state of the threads they may observe is not relevant to the question of whether the proposed check goes in the iteration loop or the HANDLE_THREAD macro - we don't process any GC threads with HANDLE_THREAD. The JavaThread iterator can return a thread that has not yet set its stack so the code using the iterator needs to deal with that if it might be an issue. |
One solution may be to stop creating the threads in the suspended state and use the same protocol for initialization that is used for non-Windows threads. At least that way we get consistent behaviour. I wouldn't expect Simon to undertake that task though. |
We iterate GC threads ( jdk/src/hotspot/share/nmt/memMapPrinter.cpp Line 226 in 9b54225
|
@tstuefe I think this is a thread lifecycle management issue and so anyone using the JavaThread iterator or GCworkersDo needs to be aware of the possibility of a thread having a null stackbase. Having the check as close as possible to the iteration loop (whichever kind) makes this more visible, rather than hiding it inside another function. I would rather |
Okay, that makes sense considering that people tend to copy paste solutions. I leave this up to Simon, then (also the question whether he wants to tackle rewriting Windows thread startup in a manner that we prevent this situation altogether). After all, the test now is problemlisted, so the urgency is gone. |
As I've heard nothing further here, I've filed JDK-8342482 and will work on it. |
JDK-8342482 didn't work out. |
Oh you are right, I forgot that. No, I think stack_base_or_null is fine in that case. |
This was closed in error. |
@stooke Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information. |
@tstuefe I have implemented Thread::stack_base_or_null(), and I expect the Gitlab gates to pass. On my VM, I have removed the tests from ProblemList files, and they pass.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good.
@stooke if there is any test in the ProblemList file that lists 8340401, then the PL needs to be updated. Thanks |
@dholmes-ora , done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updates
/integrate |
/sponsor |
Going to push as commit b0c40aa.
Your commit was automatically rebased without conflicts. |
@dholmes-ora @stooke Pushed as commit b0c40aa. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
SystemDumpMapTest still fails after this - see JDK-8346717 |
@dholmes-ora my apologies - this failure (which is timing related) did not sho up in my testing. I will be opening PR #22870 as soon as it passes the GH gates. |
This PR addresses JDK-8340401 by adding a test for a thread with no stack currently assigned.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21091/head:pull/21091
$ git checkout pull/21091
Update a local copy of the PR:
$ git checkout pull/21091
$ git pull https://git.openjdk.org/jdk.git pull/21091/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 21091
View PR using the GUI difftool:
$ git pr show -t 21091
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21091.diff
Using Webrev
Link to Webrev Comment