Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConcurrentModificationException when late meter filters are added #5490

Merged
merged 2 commits into from
Sep 30, 2024

Conversation

mabartos
Copy link
Contributor

@mabartos mabartos commented Sep 24, 2024

@lenin-jaganathan You've already mentioned that it should be synchronized, but no action were taken (778f400#r1555315425)

Is it ok to lock it with the meterMapLock, or rather use a concurrent collection?

@pivotal-cla
Copy link

@mabartos Please sign the Contributor License Agreement!

Click here to manually synchronize the status of this Pull Request.

See the FAQ for frequently asked questions.

@pivotal-cla
Copy link

@mabartos Thank you for signing the Contributor License Agreement!

@lenin-jaganathan
Copy link
Contributor

I wouldn't recommend a Collections.synchronizedSet() if that is what you mean by "concurrent collection". We use contains on every interaction with the builder which could cause slowness.

We can use meterMapLock (but it was not originally meant for this map) but we should guard all the writes atleast. I also see that this collection is declared as not "Thread safe".

@mabartos
Copy link
Contributor Author

@lenin-jaganathan Thanks for the comment.

I wouldn't recommend a Collections.synchronizedSet() if that is what you mean by "concurrent collection". We use contains on every interaction with the builder which could cause slowness.

Ok.

We can use meterMapLock (but it was not originally meant for this map) but we should guard all the writes atleast. I also see that this collection is declared as not "Thread safe".

All the writes/removes should be guarded now.

@lenin-jaganathan Let me know, if there's a better approach. But resolving this bug is crucial for us as it would need to be propagated to Quarkus as well, so it'd be great to prioritize this issue.

@mabartos
Copy link
Contributor Author

The CI build-jdk11 failed due to a test failure related to some timing issue after providing synchronized() to unmarkStaleId() method. Not sure if it is related to this PR, or just the test is unstable.

@shakuzen
Copy link
Member

I re-ran the jdk11 job and it passed. I'm not sure if the failures were related to the change or not.
Note I'm holding off reviewing the specific changes in this PR and running benchmarks until I can get a concurrency test that reproduces the failure this is fixing. We want to make sure we've really fixed it and that it stays fixed through future changes. I'm out of time to continue working on that today but I'll get back to it tomorrow.

@mabartos
Copy link
Contributor Author

@shakuzen Got it, thanks!

@shakuzen
Copy link
Member

I was finally able to figure out a test that reproduces the CME. It wasn't obvious to me at first, but the CME is being thrown because preFilterIdToMeterMap is being modified (a new entry added) while it is being iterated over. The iteration is implicit in the foreach call in addAll that's being passed the keySet of preFilterIdToMeterMap. With a test that's hopefully reproducing the same issue as the one reported, we can work on the best solution for it. Unfortunately I'm a bit under the weather today, but if I'm feeling well enough tomorrow I'll continue with this work then.

Code:

@JCStressTest
@Outcome(id = "true", expect = Expect.ACCEPTABLE, desc = "No exception")
@Outcome(expect = Expect.FORBIDDEN, desc = "ConcurrentModificationException thrown")
@State
public static class ConfigureLateMeterFilterWithNewMeterRegister {
    MeterRegistry registry = new SimpleMeterRegistry();

    public ConfigureLateMeterFilterWithNewMeterRegister() {
        // need the registry to not be empty
        registry.counter("c1");
    }

    @Actor
    public void actor1() {
        // creates new meter to add to preMap, updating the modCount
        registry.counter("c2");
    }

    @Actor
    public void actor2(Z_Result r) {
        try {
            // adds all the preMap keys to the staleIds, iterating over preMap keys
            registry.config().commonTags("common2", "tag2");
            r.r1 = true;
        } catch (ConcurrentModificationException e) {
            r.r1 = false;
        }
    }
}

Results:

  Results across all configurations:

  RESULT     SAMPLES     FREQ      EXPECT  DESCRIPTION
   false      57,671    0.19%   Forbidden  ConcurrentModificationException thrown
    true  30,863,033   99.81%  Acceptable  No exception

@mabartos
Copy link
Contributor Author

@shakuzen After applying these changes, the test results are as follows:

[OK] io.micrometer.concurrencytests.MeterRegistryConcurrencyTest.ConfigureLateMeterFilterWithNewMeterRegister

  Results across all configurations:

  RESULT    SAMPLES     FREQ      EXPECT  DESCRIPTION
    true  8,239,104  100.00%  Acceptable  No exception

Let me know if I can help somehow.

Copy link
Member

@shakuzen shakuzen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment about one of the changes. I have some polish to make for the concurrency test, and we will want to rebase on the 1.13.x branch so we can release the fix in the next maintenance release, but I can take care of both of these things before merging.

@mabartos
Copy link
Contributor Author

mabartos commented Sep 30, 2024

@shakuzen I updated the code and squashed commits.

Created backport for 1.13.x branch: #5496

Not sure how you handle backports, so let me know if I should manage it differently.

We need to have the new patch 1.13.x version ASAP to include it in Quarkus and other project.

Thanks for your understanding.

PS: It seems CI is broken - not related to this PR.

@brunobat
Copy link

We need to have the new patch 1.13.x version ASAP to include it in Quarkus and other project.

Not sure they can do that...
If there are only fixes in the next patch release we can roll forward and backport to the Quarkus LTS branch.

Fixes micrometer-metrics#5489

Co-authored-by: Tommy Ludwig <8924140+shakuzen@users.noreply.github.com>
Signed-off-by: Martin Bartoš <mabartos@redhat.com>
@shakuzen shakuzen changed the base branch from main to 1.13.x September 30, 2024 09:16
@shakuzen
Copy link
Member

Not sure how you handle backports, so let me know if I should manage it differently.

No worries. We merge to the oldest applicable maintenance branch and then merge forward from there. I'll take care of it from here.

@mabartos
Copy link
Contributor Author

No worries. We merge to the oldest applicable maintenance branch and then merge forward from there. I'll take care of it from here.

@shakuzen Ok, thank you!

@shakuzen shakuzen merged commit c61b1c3 into micrometer-metrics:1.13.x Sep 30, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ConcurrentModificationException when late meter filters are added
5 participants