nodetool is broken in e2e tests #1342

Closed
olim7t opened this issue Jun 4, 2024 · 3 comments · Fixed by #1343
Labels
bug, done, testing

Comments

@olim7t
Contributor

olim7t commented Jun 4, 2024

A few e2e tests have started failing in CI, seemingly since 87bed5f. The logs show this error from nodetool:

failed to execute nodetool status on cluster1-real-dc2-default-sts-0: nodetool:
Failed to connect to '127.0.0.1:7199' - URISyntaxException:
'Malformed IPv6 address at index 7: rmi://[127.0.0.1]:7199'. (exit status 1)

This is caused by CASSANDRA-17581, which is related to changes in the way the JDK parses RMI URLs.
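
To illustrate why the URL in the error is rejected: in URL syntax, square brackets around a host are reserved for IPv6 literals, so a strict parser refuses `[127.0.0.1]`. The Python sketch below only mimics that rule using the standard `ipaddress` module. It is not the JDK's parser, and `bracketed_host_error` is a hypothetical helper name.

```python
import ipaddress


def bracketed_host_error(url):
    """Return an error string if the URL has a bracketed host that is not IPv6.

    Hypothetical helper that only mimics the strict rule; not the JDK's code.
    """
    start = url.find("[")
    end = url.find("]", start + 1)
    if start == -1 or end == -1:
        return None  # no bracketed host, nothing to validate
    host = url[start + 1:end]
    try:
        ipaddress.IPv6Address(host)
    except ipaddress.AddressValueError:
        # Report the offset of the first character inside the brackets.
        return f"Malformed IPv6 address at index {start + 1}: {url}"
    return None


print(bracketed_host_error("rmi://[127.0.0.1]:7199"))  # Malformed IPv6 address at index 7: ...
print(bracketed_host_error("rmi://[::1]:7199"))        # None
print(bracketed_host_error("rmi://127.0.0.1:7199"))    # None
```

The index 7 in the message is the first character after `rmi://[`, which matches the position reported in the nodetool error.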

It's not immediately clear what caused this 2-year-old bug to suddenly resurface, but a pragmatic fix is to upgrade the Cassandra version in our tests. We currently use 4.0.1 for the 4.0.x tests, the bug was fixed in 4.0.4, and the latest is 4.0.13.

olim7t added the bug and testing labels Jun 4, 2024
olim7t added a commit to olim7t/k8ssandra-operator that referenced this issue Jun 4, 2024
@emerkle826
Contributor

This was fixed in the Cassandra images a while ago:
k8ssandra/management-api-for-apache-cassandra@a7e16a3
Not sure why this would pop back up.

@emerkle826
Contributor

After discussing this with @olim7t, this might be an issue with the UBI-based images, as those are built from scratch on new Management API releases. The Ubuntu-based images pull from the official DockerHub images, which don't usually change once published. It's possible that the base UBI image we build from is getting an affected JDK, and that the workaround for nodetool is not being applied because it checks the Cassandra version, not the JDK version.
If this is the case, I'll open a ticket in Management API to get it fixed correctly there.

@emerkle826
Contributor

For some background, the issue is documented well (with the links from the discussion) here:
https://issues.apache.org/jira/browse/CASSANDRA-17581
Of the versions this project is interested in, the JDK bug only affected Cassandra 3.11.12 and 4.0.3.

There was a workaround added for both Ubuntu-based images and UBI-based images in this commit:
k8ssandra/management-api-for-apache-cassandra@a7e16a3, so it shouldn't have popped back up. I've tested a few of the Management API images locally and the results seem... unpredictable.

I was able to reproduce the bug on versions 4.0.3, 4.0.4 and 4.0.5, with Management API version 0.1.79. That didn't make any sense, because the bug was fixed in version 4.0.4 and that fix would apply to both the Ubuntu images and the UBI images. So I tried to reproduce it again and capture the details (Cassandra version, Management API version, JDK version), but I wasn't able to reproduce it in any of them. I triple-checked the images to make sure I wasn't accidentally using the wrong one.

All of the images (Cassandra 4.0.3 through 4.0.5, with Management API versions v0.1.76-v0.1.79) have the same version of the JDK:

openjdk 11.0.23 2024-04-16 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.23.0.9-2) (build 11.0.23+9-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.23.0.9-2) (build 11.0.23+9-LTS, mixed mode, sharing)

So I don't think it's the JDK.
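
For anyone repeating this comparison across images, the JDK version can be pulled out of the `java -version` banner programmatically. A minimal Python sketch, assuming the banner's first line has the form shown above; `parse_jdk_version` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
import re


def parse_jdk_version(banner):
    """Extract (major, minor, patch) from the first line of a `java -version` banner."""
    first_line = banner.strip().splitlines()[0]
    # Assumes a dotted three-part version such as 11.0.23 on the first line.
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        raise ValueError(f"no version found in: {first_line!r}")
    return tuple(int(part) for part in match.groups())


banner = """openjdk 11.0.23 2024-04-16 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.23.0.9-2) (build 11.0.23+9-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.23.0.9-2) (build 11.0.23+9-LTS, mixed mode, sharing)"""

print(parse_jdk_version(banner))  # (11, 0, 23)
```

Note that `java -version` writes its banner to stderr, so capturing it from a container usually needs the stream redirected.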

I also don't think it's Cassandra as the code has been the same since it was fixed 2 years ago.

I know the PR to fix this is updating the image versions. Hopefully that works, but I can't figure out why it's necessary....

adejanovski added the review label Jun 5, 2024
adejanovski added a commit that referenced this issue Jun 6, 2024
* Upgrade C* 4.0 in e2e tests to 4.0.13 (fixes #1342)

* Fix operator upgrade test

* Always allow cass-operator autoupdates

---------

Co-authored-by: Alexander Dejanovski <alex.dejanovski@datastax.com>
adejanovski added the done label and removed the review label Jun 6, 2024