Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record number of skipped executions during aggregation #486

Merged
merged 11 commits into from
Mar 8, 2024

Conversation

kathy-t
Copy link
Contributor

@kathy-t kathy-t commented Feb 26, 2024

Description
Corresponding PRs:

This PR modifies the metrics aggregator so that it aggregates workflow executions even when there are invalid executions (it previously skipped aggregating the metric entirely) and records the number of executions that were skipped. It validates executions using the validation constraints defined in the webservice metrics objects.

I did some minor refactoring that involved changing the interfaces to abstract classes to remove unnecessary task methods from the Validation Status aggregator which don't have the concept of task metrics.

Review Instructions
See dockstore/dockstore-ui2#1938

Issue
SEAB-6046

Security
If there are any concerns that require extra attention from the security team, highlight them here.

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install in the project that you have modified (until https://ucsc-cgl.atlassian.net/browse/SEAB-5300 adds multi-module support properly)
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • If you are changing dependencies, check with dependabot to ensure you are not introducing new high/critical vulnerabilities
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.

@kathy-t kathy-t self-assigned this Feb 26, 2024
Copy link

codecov bot commented Feb 26, 2024

Codecov Report

Attention: Patch coverage is 87.85047% with 13 lines in your changes are missing coverage. Please review.

Project coverage is 56.02%. Comparing base (64ddd19) to head (6696cc6).

Files Patch % Lines
...store/metricsaggregator/helper/CostAggregator.java 72.72% 1 Missing and 2 partials ⚠️
...csaggregator/helper/ExecutionStatusAggregator.java 78.57% 1 Missing and 2 partials ⚠️
...saggregator/helper/ValidationStatusAggregator.java 62.50% 2 Missing and 1 partial ⚠️
...ore/metricsaggregator/helper/MemoryAggregator.java 77.77% 1 Missing and 1 partial ⚠️
...kstore/metricsaggregator/helper/CpuAggregator.java 87.50% 1 Missing ⚠️
...ricsaggregator/helper/ExecutionTimeAggregator.java 90.00% 1 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             develop     #486      +/-   ##
=============================================
- Coverage      56.41%   56.02%   -0.40%     
- Complexity       295      297       +2     
=============================================
  Files             32       33       +1     
  Lines           1964     2001      +37     
  Branches         166      163       -3     
=============================================
+ Hits            1108     1121      +13     
- Misses           779      796      +17     
- Partials          77       84       +7     
Flag Coverage Δ
metricsaggregator 48.22% <87.85%> (-0.25%) ⬇️
toolbackup 30.73% <67.28%> (+0.03%) ⬆️
tooltester 22.93% <67.28%> (+0.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

return executions.stream()
.map(this::getMetricFromExecution)

public final Optional<M> getAggregatedMetricFromExecutions(List<T> executions) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method that calculates the aggregated metric plus sets the number of executions skipped.

* @param aggregatedMetrics
* @return
*/
public final Optional<M> getAggregatedMetricsFromAggregatedMetrics(List<M> aggregatedMetrics) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method that calculates the aggregated metric plus sets the number of executions skipped.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's calculating a single aggregated metric, should it be named getAggregatedMetricFromAggregatedMetrics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an instance where it calculates multiple aggregated metrics into a single aggregated metric:

// Calculate metrics across all platforms by aggregating the aggregated metrics from each platform
try {
getAggregatedMetrics(allMetrics).ifPresent(metrics -> {

* @param metrics
* @return
*/
M getMetricFromMetrics(Metrics metrics);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method was moved into ExecutionAggregator

* @param metricsList
* @return
*/
default Optional<M> getAggregatedMetricFromMetricsList(List<Metrics> metricsList) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method was moved to ExecutionAggregator

* @return
*/
List<T> getExecutionsFromExecutionRequestBody(ExecutionsRequestBody executionsRequestBody);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was moved into ExecutionAggregator

* @param allSubmissions
* @return
*/
public Optional<M> getAggregatedMetricFromAllSubmissions(ExecutionsRequestBody allSubmissions) {
Copy link
Contributor Author

@kathy-t kathy-t Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is like the getAggregatedMetricFromAllSubmissions default method that was in the original ExecutionsRequestBodyAggregator interface except it completely removes the invokation of the task methods since tasks are not applicable to validation executions

pom.xml Outdated
<changelist>.0-SNAPSHOT</changelist>

<github.url>scm:git:git@github.com:dockstore/dockstore-support.git</github.url>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<dockstore-core.version>1.15.0-rc.2</dockstore-core.version>
<dockstore-core.version>1.16.0-SNAPSHOT</dockstore-core.version>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update this to a tag once the webservice PR is merged

@kathy-t kathy-t marked this pull request as ready for review February 29, 2024 14:54
*/
public boolean isValid(T execution) {
Set<ConstraintViolation<T>> violations = validator.validate(execution);
return violations.stream().noneMatch(violation -> violation.getPropertyPath().toString().startsWith(getPropertyPathToValidate()));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the string-based comparison vulnerable to problems like getPropertyPath().toString() being foobar and getPropertyPathToValidate() being foo? Violation has a getNode method containing the name, I think, might be safer use that. In practice, probably doesn't matter tho.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's possible, but I intentionally used the startsWith comparison because for a Cost value violation, the property path is cost.value, not just cost. I could define the CostAggregator's getPropertyPathToValidate() as cost.value, but we might allow users to provide their own currency in the future so I figured it would be best to be generic and use startsWith

* @param aggregatedMetrics
* @return
*/
public final Optional<M> getAggregatedMetricsFromAggregatedMetrics(List<M> aggregatedMetrics) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's calculating a single aggregated metric, should it be named getAggregatedMetricFromAggregatedMetrics?

Copy link
Member

@denis-yuen denis-yuen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential follow-up ticket, but before that (and the aggregator getting much more sophisticated, we should do some AWS Athena prototyping)

@@ -297,6 +301,7 @@
<usedDependency>ch.qos.logback:logback-classic</usedDependency>
<usedDependency>ch.qos.logback:logback-core</usedDependency>
<usedDependency>com.google.guava:guava</usedDependency>
<usedDependency>jakarta.validation:jakarta.validation-api</usedDependency>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little weird since you have

import jakarta.validation.ConstraintViolation;
import jakarta.validation.Validation;
import jakarta.validation.Validator;

pretty straightforward in the code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't know why the build doesn't recognize that the dependency is being used 🤔

@@ -70,4 +71,10 @@ void testGetAggregatedMetricFromWorkflowExecutions() {
assertEquals(4.0, cpuMetric.get().getAverage());
assertEquals(3, cpuMetric.get().getNumberOfDataPointsForAverage());
}

private RunExecution createRunExecution(Integer cpuRequirements) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kinda liked the old way, could use a proper builder object design pattern though.
Could be followup ticket.

costMetric = executionStatusMetric.get().getCount().get(SUCCESSFUL.name()).getCost();
assertNotNull(costMetric);
assertEquals(2.0, costMetric.getMinimum());
assertEquals(6.0, costMetric.getMaximum()); // The max is the cost of the two tasks summed together
assertEquals(4.0, costMetric.getAverage());
assertEquals(2, costMetric.getNumberOfDataPointsForAverage()); // Two data points: 1 from the workflow execution and 1 for the list of tasks
}

private RunExecution createRunExecutionWithExecutionTime(ExecutionStatus executionStatus, String executionTime) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, I think this would be cleaner with a builder

@kathy-t
Copy link
Contributor Author

kathy-t commented Mar 7, 2024

(build will fail until I update the webservice version to a tag)

@kathy-t kathy-t requested a review from svonworl March 7, 2024 16:41
Copy link

sonarcloud bot commented Mar 7, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarCloud

@kathy-t kathy-t merged commit c751c2e into develop Mar 8, 2024
9 of 10 checks passed
@kathy-t kathy-t deleted the feature/seab-6046/aggregate-even-if-invalid branch March 8, 2024 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants