-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[jmx-metrics] Collect in callback #949
Conversation
jmx-metrics/src/main/groovy/io/opentelemetry/contrib/jmxmetrics/GroovyMetricEnvironment.java
Outdated
Show resolved
Hide resolved
// Look at the first mbean collected to evaluate if the attributes requested are | ||
// composite data types or simple. Composite types require different parsing to | ||
// end up with multiple recorders in the same callback. | ||
def bean = mbeans.first() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure it's guaranteed the first will be reflective of the rest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not guaranteed 100%, but it would break metric conventions if some mbeans in a group were composite data and some were not (IE you would have jvm.memory.pool
metrics & the jvm.memory.pool.(committed|used|etc)
metrics if that instrument contained some composite & some simple.
We could add some complexity in order to split mbeans by type, or maybe we just log an error and bail if not all of the mbeans in an instrument match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it's better for the aggregation behavior of the current update implementation to be preserved where it's splitting by per-mbean value.
def value = (CompositeData) mbeans.first().getProperty(attribute)
compositeAttributes.add(new Tuple2<String, Set<String>>(attribute, value.getCompositeType().keySet()))
Here we get the value from the first mbean and then limit the CompositeType* and CompositeData key set for all other mbeans. Given how relaxed the mbean query can be, and how general CompositeData values can be I don't think this is a valid restriction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. As the batch callback needs to register all of its expected instruments, this might be a little complex, will take a pass at it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is improved, let me know!
jmx-metrics/src/main/groovy/io/opentelemetry/contrib/jmxmetrics/GroovyMetricEnvironment.java
Outdated
Show resolved
Hide resolved
jmx-metrics/src/main/groovy/io/opentelemetry/contrib/jmxmetrics/GroovyMetricEnvironment.java
Outdated
Show resolved
Hide resolved
…s/GroovyMetricEnvironment.java Co-authored-by: Ryan Fitzpatrick <10867373+rmfitzpatrick@users.noreply.github.com>
Consumer<ObservableDoubleMeasurement> existing = existingUpdater.get(); | ||
existing.accept(doubleResult); | ||
}; | ||
private <T extends ObservableMeasurement> T registerCallback( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This helped a lot with readability.
…ry-java-contrib into collect-in-callback
Hey @dehaansa, I tested this change under the same conditions that was causing the rampant heap usage and looks like this fully resolved the issue. |
} catch (AttributeNotFoundException ignored) { | ||
logger.fine("Attribute ${attribute} not found on any of the collected mbeans") | ||
} catch (InvalidAttributeValueException ignored) { | ||
logger.info("Attribute ${attribute} was not consistently CompositeData for " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this may warrant a ~breaking change changelog entry or readme update in the chance that users run into it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a note to the readme, from what I can see it looks like changelog entries are added outside the scope of the PR introducing the change. If I'm wrong though, I'm happy to add whatever's required for a changelog entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥇 thanks!
Co-authored-by: jason plumb <75337021+breedx-splk@users.noreply.github.com>
@trask I see this was tagged for release, anything left that it needs to be merged? |
Description:
Update the jmx metrics gatherer to collect the actual metric values being reported in the callback.
This was initially started to resolve an issue where once a target went down, the same metric values continued to report. The existing code registers its async callback(s) (as many as have been successful collections, so a potential memory leak) to report the value last collected when the groovy script(s) are executed, so when the SDK triggers its
collectAll()
method, those callbacks are executed and the last collected value is reported. Now, on execution of groovy scripts callbacks are registered with the MBeans that currently match the filter in the script, but the actual metric collection happens inside of the callback that is executed when the SDK goes to export.Existing Issue(s): #910
May be relevant to several other issues including memory leaks recently reported & issues with duplicate data errors reporting.
Testing: Local testing against JVM targets. Trying to wrap my head around adding unit tests that would cover this scenario, but as long as the integration tests work that should cover the behavior.
Have validated locally that the following situations work as expected
Documentation:
Should not be a user-facing change that would require documentation. The target of this refactor was to work with the existing JMX scripts without any issues.
Outstanding items: