Memory leak, C2 compiler #1190
Comments
@shalseth What is your app doing roughly? Is it a pure Java app or does it have Scala, other JVM languages or any native components? |
@karianna, thanks for your reply. No other native components, just pure Java. It's a large web application with many functions, running in Tomcat. In this particular case, however, it's purely MySQL database transactions over JDBC: fetching data in a ResultSet and serializing it to JSON using Gson. What strikes me as odd is that we don't see anything like this on Temurin 21.0.1 / 21.0.2 after countless hours in production across several instances, yet we can reproduce it within a few minutes on 21.0.3-21.0.5. Currently only by randomly clicking around in our application, but perhaps we can isolate it further at some point. I did manage to get some more logs today when reproducing the issue, using the following flags: -Xlog:jit+compilation=debug and -XX:+PrintCompilation / -XX:+PrintCompilation2. I found a compilation task that takes 9995 ms and eats memory at the time I observe memory spikes, seen from outside the docker container. In this case it consumed about +4GB of RSS memory, with Xmx=1G.
hotspot_pid1.log (-XX:+LogCompilation)
See attached debug log from the compiler thread as well. |
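For reference, the diagnostic flags mentioned in the comment above can be combined on one command line. A minimal sketch, with the heap size taken from the report and everything else standard HotSpot options (`-XX:+LogCompilation` is a diagnostic flag, so it needs `-XX:+UnlockDiagnosticVMOptions`):

```shell
# Sketch: HotSpot flags used above to capture C2 compilation activity.
# -XX:+LogCompilation writes hotspot_pid<PID>.log in the working directory.
JIT_DEBUG_OPTS="-Xmx1g \
  -Xlog:jit+compilation=debug \
  -XX:+UnlockDiagnosticVMOptions \
  -XX:+LogCompilation \
  -XX:+PrintCompilation"
echo "$JIT_DEBUG_OPTS"
```

In the official Tomcat images these would typically be passed via CATALINA_OPTS or JAVA_OPTS.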
Thanks for the extra details - reported upstream at https://bugs.openjdk.org/browse/JDK-8343322 |
@karianna, following up with additional information based on the upstream questions. I don't have an account at bugs.openjdk.org, so I would appreciate it if you could forward this information. Question: Answer: I've built a version of OpenJDK up until the commit before the one above: this version does not display any of the mentioned symptoms. I will gladly assist with reproducing the issue and trying out any patches / changes. Builds / commits: Question: Answer: Logs: Log output from a test run of a version of OpenJDK 21.0.3+9 that we compiled today.
|
Thanks a lot for narrowing it down @shalseth. I'll put that information into JDK-8343322. |
Is this still reproducible with JDK-8340824? @shalseth maybe you can take the patch and apply it to see if it helps: |
@TobiHartmann, JDK-8340824 looks really promising so far! I tried applying the patch to jdk-21.0.3-ga, which is quite old compared to the new patch, but it applied cleanly, and I see an immediate improvement. I then did the same test with the patch on jdk-21.0.6+1 (commit: 7dc0f7a64224d37f639ab8e8da2c1aa3295cc92e). In both cases I notice a slightly larger memory footprint with the patch applied, compared to 21.0.2, but it's negligible, to the point where I wouldn't consider it a problem, or even have known about it in the first place. With the patch applied, the behaviour is much more in line with 21.0.1/21.0.2, and it saves around 5-6GB in peak memory consumption. During my test it now goes from 500MB -> 2000MB -> 500MB over ~10 seconds, which is completely fine. Note that this is an early report, so I will do some more testing, perhaps even with a production workload. |
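The patch-and-build procedure described above can be sketched roughly as follows. The steps are wrapped in a function so they are shown without triggering a multi-hour build, and the patch file name (`8340824.patch`) is a placeholder; exporting the actual change from JDK-8340824 is left to the reader.

```shell
# Rough sketch of building a patched jdk21u from source.
# Assumptions: a boot JDK and the usual OpenJDK build toolchain are
# installed, and the JDK-8340824 change has been saved as ../8340824.patch.
build_patched_jdk() {
  git clone https://github.com/openjdk/jdk21u.git || return 1
  cd jdk21u || return 1
  git checkout jdk-21.0.3-ga        # or jdk-21.0.6+1 for the second test
  git apply ../8340824.patch        # dry-run first with: git apply --check
  bash configure                    # see doc/building.md in the repo
  make images                       # result lands under build/*/images/jdk
}
type build_patched_jdk >/dev/null
```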
That's great news, thanks for verifying @shalseth. I'll keep JDK-8343322 open for now until final feedback from you.
That's up to the OpenJDK maintainers to decide; I'm representing the Oracle JDK here, and we have already backported the fix to Oracle JDK 21.0.7. |
@shalseth I'll propose it for openjdk 21 |
Thanks, Roland! |
Been running jdk-21.0.6+1 (commit: 7dc0f7a64224d37f639ab8e8da2c1aa3295cc92e) with the patch from JDK-8340824 in production on a small subset of containers since yesterday. We've had no issues so far. The original issue would present itself within max 1-2 hours, so I'm closing the issue. I'm planning to stick with Temurin 21.0.2 in production for now, and upgrade when a Temurin 21.0.X build based on OpenJDK with the patch applied becomes available. Will do the same testing then. Thanks a lot for your help and assistance :) |
Great, thanks again! |
Please provide a brief summary of the bug
As of Temurin 21.0.3 we are experiencing excessive memory usage (and oom-killer kills) in our Java Tomcat containers.
This is off-heap memory usage, and we can easily trigger 10GB of memory usage (real memory usage, memstat:total_rss) with Xmx=1G. From our current testing, the memory usage will go as high as whatever we configure as the docker container memory limit.
From a total_rss usage of 1GB, the memory usage climbs to the 10GB limit within ~10 seconds.
Application memory leaks are not a factor; we have checked the heap with jmap just to make sure.
We have no issues on Temurin 21.0.1 and 21.0.2, only starting with 21.0.3. The same issue is also present on the latest release, 21.0.5.
We have isolated the memory usage to the C2 compiler thread, based on oom-killer syslog output and on Native Memory Tracking.
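The Native Memory Tracking workflow mentioned above can be sketched as follows; `jcmd` and its `VM.native_memory` command are standard JDK tooling, while the PID and wait interval are placeholders.

```shell
# Sketch: attribute off-heap growth to a JVM subsystem with Native Memory
# Tracking. Requires the JVM to be started with
# -XX:NativeMemoryTracking=summary (summary mode is cheap enough to keep
# on in production).
nmt_diff() {
  pid="$1"                      # JVM PID, typically 1 inside a container
  jcmd "$pid" VM.native_memory baseline
  sleep "${2:-60}"              # wait while RSS climbs
  jcmd "$pid" VM.native_memory summary.diff
  # a C2 leak shows up under the Compiler / Arena Chunk categories
}
type nmt_diff >/dev/null
```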
On a lightly loaded production node we experience this after 1-2 hours, so we can quite easily reproduce the issue.
We are using Tomcat images from https://hub.docker.com/_/tomcat.
Ideally we should have some code or easy way to reproduce this outside of our application, but our application is quite large, and we don't know exactly what triggers the issue.
I realize it will be hard to track down based on this bug report, but I will post it anyway, in case others experience similar problems and/or there is additional log or debug output we could provide.
Did you test with the latest update version?
Please provide steps to reproduce where possible
Expected Results
Memory usage similar to Temurin 21.0.1 - 21.0.2
Actual Results
Consumes all available memory up to docker container limit.
What Java Version are you using?
openjdk 21.0.5 2024-10-15 LTS
OpenJDK Runtime Environment Temurin-21.0.5+11 (build 21.0.5+11-LTS)
OpenJDK 64-Bit Server VM Temurin-21.0.5+11 (build 21.0.5+11-LTS, mixed mode, sharing)
What is your operating system and platform?
Ubuntu 20.04 docker host
Container image based on Ubuntu 22.04. Dockerhub: tomcat:9.0.90-jdk21-temurin-jammy
How did you install Java?
Docker image from https://hub.docker.com/_/tomcat, which uses eclipse-temurin:21-jdk-jammy
Did it work before?
Did you test with other Java versions?
Relevant log output
Manual logging total_rss per second (from cgroup v1 memstat)
Test container Xmx=1G, container limit 2.5GB, oom-killer disabled.
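A minimal sketch of the logging loop that produced the output below, assuming the cgroup v1 path used in the report (under cgroup v2 the file and field names differ, e.g. `memory.stat` has `anon` instead of `total_rss`):

```shell
# Convert the total_rss field (bytes) of a cgroup v1 memory.stat file to MB.
rss_mb() {
  awk '/^total_rss /{printf "%.2f", $2 / 1024 / 1024}' "$1"
}

# Log once per second (Ctrl-C to stop).
log_rss() {
  while sleep 1; do
    echo "$(date): Total RSS: $(rss_mb /sys/fs/cgroup/memory/memory.stat) MB"
  done
}
```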
Tue Oct 29 14:00:00 CET 2024: Total RSS: 933.01 MB
Tue Oct 29 14:00:01 CET 2024: Total RSS: 1071.18 MB
------ Problem starts here ------
Tue Oct 29 14:00:02 CET 2024: Total RSS: 1457.25 MB
Tue Oct 29 14:00:03 CET 2024: Total RSS: 1862.92 MB
Tue Oct 29 14:00:04 CET 2024: Total RSS: 2231.92 MB
Tue Oct 29 14:00:05 CET 2024: Total RSS: 2451.58 MB
------ Reached container memory limit 2.5GB ------
The JVM freezes, runs out of memory, and cannot create new threads:
java.io.IOException: Cannot allocate memory
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
Tue Oct 29 14:00:06 CET 2024: Total RSS: 2451.58 MB
Tue Oct 29 14:00:08 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:09 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:10 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:11 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:12 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:13 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:14 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:15 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:16 CET 2024: Total RSS: 2456.09 MB
Tue Oct 29 14:00:17 CET 2024: Total RSS: 2456.09 MB