Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

suggested tweaks to observability setup #475

Merged
merged 4 commits into from
Jan 29, 2024

Conversation

davidangb
Copy link
Collaborator

Note this is a PR to #461, not to main

See my comments inline for explanation.

ObservationRegistry testObservationRegistry() {
return TestObservationRegistry.create();
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaces the default ObservationRegistry bean with a TestObservationRegistry inside this test, instead of keeping the default ObservationRegistry bean and adding an additional (non-bean) TestObservationRegistry.

@@ -64,16 +78,14 @@ void beforeAll() {
.thenThrow(new RuntimeException("test failed via jobDao.fail()"));
when(jobDao.fail(any(), anyString()))
.thenThrow(new RuntimeException("test failed via jobDao.fail()"));
// set up metrics registries
testObservationRegistry = TestObservationRegistry.create();
Metrics.globalRegistry.add(meterRegistry);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this setup is not necessary; Spring handles it for us once we make TestObservationRegistry a Spring-managed bean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad that creating a TestObservationRegistry bean does get Metrics to work (it didn't in my implementation, hence why I had to create a test and a regular observation registry)

meterRegistry.clear();
Metrics.globalRegistry.clear();
((TestObservationRegistry) observationRegistry).clear();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously, this cleared everything after all tests had run. There's no point to that, as Spring will destroy the context after the whole test finishes anyway.

Instead, I am now clearing everything before each test; this prevents metrics recorded in one test bleeding into the metrics being read by another test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the beforeeach method on thursday but i think I was missing the TestObservationRegistry bean as the missing piece, thank you

// since the job we just ran is complete, we expect it to show zero active tasks
// and zero duration.
assertEquals(0, longTaskTimer.activeTasks());
assertEquals(0, longTaskTimer.duration(TimeUnit.SECONDS));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously, this test asserted that longTaskTimer.duration was nonzero. The fact it was nonzero seems to be an artifact of recording the same observation twice to the same meterRegistry. Since this is the "*active" metric, it should only be recording currently-in-flight observations, and we should expect it to be zero. Once I cleared up the double-recording, it is now zero.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why we had a count of 2 after executing a job once. Spring already connects the ObservationRegistry to the MeterRegistry. But here, we were connecting it a second time … and thus, we were recording each observation twice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dang, introducing this config was my suggestion...I didn't realize Spring was already providing this exact thing out of the box.

I think this perhaps remains a valuable reference snippet for how we might want to customize the observation config when we have a reason to do so, so I wanted to ask to check my own comprehension:

If we just leave the Config class in place and only delete line 23 (.observationHandler(...)), would it have the same/desired effect as dropping the whole file? I imagine wanting to re-introduce this class at some point for the purpose of overriding config, but don't want to cause a regression.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tricky!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jladieu my $.02 would be to look back at this PR, or the PR that introduced this code, if we need a reference in the future instead of keeping it in the runtime codebase but doing nothing. That said, yeah, I think just removing the .observationHandler(...) line would have the same effect.

Copy link
Contributor

@jladieu jladieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice sleuthing!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dang, introducing this config was my suggestion...I didn't realize Spring was already providing this exact thing out of the box.

I think this perhaps remains a valuable reference snippet for how we might want to customize the observation config when we have a reason to do so, so I wanted to ask to check my own comprehension:

If we just leave the Config class in place and only delete line 23 (.observationHandler(...)), would it have the same/desired effect as dropping the whole file? I imagine wanting to re-introduce this class at some point for the purpose of overriding config, but don't want to cause a regression.


// override Spring's standard ObservationRegistry with a TestObservationRegistry
@TestConfiguration
static class TestObservationConfig {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you consider making this config universal for all tests? If that's possible, it might be nice to provide an accessor method for the global TestObservationRegistry that tests can use instead of having to @Autowire ObservationRegistry and typecast anywhere that's needed. The typecasting feels clumsy and it'd be nice if it wasn't necessary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately making this config universal for all tests isn't super easy. I've moved the static inner class to a reusable standalone class … but it still requires any test that needs it to include it via @Import and to perform the appropriate casting.

fwiw, I kinda like the inclusion of

    // this test requires a TestObservationRegistry
    TestObservationRegistry testObservationRegistry =
        assertInstanceOf(TestObservationRegistry.class, observationRegistry);

in the unit test, as it ensures the unit test is set up correctly.

Copy link
Contributor

@yuliadub yuliadub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for digging into this David - and the easy explanation of what you found through all the comments.

Copy link

sonarcloud bot commented Jan 26, 2024

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@davidangb
Copy link
Collaborator Author

@ashanhol feel free to merge this at will; I didn't want to merge directly into your branch myself.

return observationRegistry ->
observationRegistry
.observationConfig()
.observationHandler(new DefaultMeterObservationHandler(meterRegistry));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm a little confused. Without explicitly defining the DefaultMeterObservationHandler in the normal ObservationConfig, i noticed the timers were coming out 0 in my manual testing pulling the prometheus dump, even though theoretically a default one is supposed to be created. Was that not a problem for you?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the timers were coming out 0

which timer, specifically? My understanding is that the wds.job.execute timer tracks historical data and persists timing info after the job is done. But, wds.job.execute.active - note the "active" suffix - is only supposed to track jobs that are currently in flight. So, the wds.job.execute.active timer should be zero unless you catch it at exactly the right time, when a job has started but not yet completed.

When I ran manually via bootRun, I saw the expected results in the prometheus scrape API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah shoot it was active seconds so yeah I went out of my way to give wrong results. I had a really hard time finding definitions for all the metrics (timers, max, etc), how did you find clarification?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed the definition when reading the javadoc/comments for LongTaskTimer. I was clicking around in IntelliJ but you can see the comments here, too: https://www.javadoc.io/doc/io.micrometer/micrometer-core/latest/io/micrometer/core/instrument/LongTaskTimer.html

A long task timer is used to track the total duration of all in-flight long-running tasks and the number of such tasks.

"in-flight" was the key description.

@ashanhol ashanhol merged commit 5066e04 into adinas/AsyncObservability Jan 29, 2024
8 of 9 checks passed
@ashanhol ashanhol deleted the da_observabilityTweaks branch January 29, 2024 20:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants