
Aggregate task metrics #476

Merged: 12 commits into develop on Nov 21, 2023
Conversation

@kathy-t (Contributor) commented Nov 10, 2023

Description
Corresponding PR: dockstore/dockstore#5738

This PR modifies the metrics aggregator so that it aggregates task execution metrics into a workflow-level execution metric, which can then be aggregated with the other user-provided workflow execution metrics and aggregated metrics.

I did some refactoring of AggregationHelper because it was getting hard to keep track of all the aggregation methods for each metric type, and there was a lot of repetitive logic. I introduced an Aggregator interface that defines the methods common to aggregating each metric type, and each metric type now has an aggregator class that implements this interface. A lot of existing code from AggregationHelper was moved into these classes, so my suggestion is to focus on the task-related changes; I'll try to highlight them in the files.
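As a rough illustration of the refactor described above, the Aggregator interface might look something like this minimal sketch. The names, method signature, and types here are assumptions for illustration only, not the actual Dockstore classes:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the Aggregator refactor; names and signatures are
// illustrative assumptions, not the real Dockstore API.
public class AggregatorSketch {

    // Common operations for aggregating one metric type.
    interface Aggregator<T> {
        // Collapse the tasks of a single workflow run into one workflow-level value.
        Optional<T> getWorkflowValueFromTaskValues(List<T> taskValues);
    }

    // CPU: the workflow-level requirement is the highest requirement among tasks.
    static class CpuAggregator implements Aggregator<Integer> {
        @Override
        public Optional<Integer> getWorkflowValueFromTaskValues(List<Integer> taskCpus) {
            return taskCpus.stream().max(Integer::compareTo);
        }
    }

    public static void main(String[] args) {
        System.out.println(new CpuAggregator()
                .getWorkflowValueFromTaskValues(List.of(2, 8, 4))); // Optional[8]
    }
}
```

In the PR itself, each metric type (status, time, CPU, memory, cost) gets its own aggregator class, so the per-metric logic that used to live in one large helper is split along the interface.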

How a list of task executions is aggregated into a workflow-level execution for each metric type:

  • ExecutionStatus:
    • If all task executions are successful, then the workflow execution is successful
    • If there are failed task executions, which may be either FAILED_RUNTIME_INVALID or FAILED_SEMANTIC_INVALID, the workflow execution status is the most frequent failed status
  • ExecutionTime:
    • This one is trickier and is a best guess because we can't accurately calculate the duration of a workflow execution from the durations of its task executions. For example, if each task took 1 minute to run, the tasks may have executed sequentially or concurrently, and we currently have no way of distinguishing the two from the durations provided in executionTime.
    • This PR calculates a best guess duration using the earliest and latest dateExecuted from the list of tasks.
    • Ideally, we would record startTime and endTime so that we can calculate this accurately, but I think it's best to leave that for a follow-up ticket because it may affect existing metrics data that we have.
  • CpuRequirements:
    • The workflow-level CPU requirement is the highest CPU requirement from the list of tasks
  • MemoryRequirements:
    • The workflow-level memory requirement is the highest memory requirement from the list of tasks
  • Cost:
    • The workflow-level cost is the sum of costs from each task
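The rules above could be sketched roughly as follows. The Task record and helper names are hypothetical, purely for illustration (ExecutionTime is omitted here since it is derived from dates rather than task values):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the per-metric aggregation rules listed above.
// Names are illustrative; they do not match the real Dockstore classes.
public class TaskRollup {

    enum Status { SUCCESSFUL, FAILED_RUNTIME_INVALID, FAILED_SEMANTIC_INVALID }

    record Task(Status status, int cpu, double memoryGb, double cost) { }

    // ExecutionStatus: successful only if every task succeeded;
    // otherwise the most frequent failed status wins.
    static Status workflowStatus(List<Task> tasks) {
        Map<Status, Long> failures = tasks.stream()
                .map(Task::status)
                .filter(s -> s != Status.SUCCESSFUL)
                .collect(Collectors.groupingBy(s -> s, Collectors.counting()));
        if (failures.isEmpty()) {
            return Status.SUCCESSFUL;
        }
        return failures.entrySet().stream()
                .max(Comparator.comparingLong(Map.Entry::getValue))
                .get().getKey();
    }

    // CPU and memory: highest requirement across tasks.
    static int workflowCpu(List<Task> tasks) {
        return tasks.stream().mapToInt(Task::cpu).max().orElse(0);
    }

    static double workflowMemory(List<Task> tasks) {
        return tasks.stream().mapToDouble(Task::memoryGb).max().orElse(0);
    }

    // Cost: sum of all task costs.
    static double workflowCost(List<Task> tasks) {
        return tasks.stream().mapToDouble(Task::cost).sum();
    }

    public static void main(String[] args) {
        List<Task> tasks = List.of(
                new Task(Status.SUCCESSFUL, 2, 4.0, 0.10),
                new Task(Status.FAILED_RUNTIME_INVALID, 8, 16.0, 0.25),
                new Task(Status.FAILED_RUNTIME_INVALID, 4, 8.0, 0.05));
        System.out.println(workflowStatus(tasks)); // FAILED_RUNTIME_INVALID
        System.out.println(workflowCpu(tasks));    // 8
        System.out.println(workflowCost(tasks));   // total cost, roughly 0.40
    }
}
```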

Misc.

  • There are failing toolbackup tests, which I think is because I updated the dockstore version. Does anyone have ideas about how to resolve them?

Review Instructions
Submit task metrics in QA then run the metrics-aggregator. Confirm that the metrics show up in qa.dockstore.org for your workflow.

Issue
SEAB-5944

Security
If there are any concerns that require extra attention from the security team, highlight them here.

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install in the project that you have modified (until https://ucsc-cgl.atlassian.net/browse/SEAB-5300 adds multi-module support properly)
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • If you are changing dependencies, check with dependabot to ensure you are not introducing new high/critical vulnerabilities
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.

}

@Override
public Optional<RunExecution> getWorkflowExecutionFromTaskExecutions(TaskExecutions taskExecutionsForOneWorkflowRun) {
Contributor Author:

This is the new function that aggregates tasks into a workflow execution for the cost metric

}

@Override
public Optional<RunExecution> getWorkflowExecutionFromTaskExecutions(TaskExecutions taskExecutionsForOneWorkflowRun) {
Contributor Author:

This is the new function that aggregates tasks into a workflow execution for the CPU metric

}

@Override
public Optional<RunExecution> getWorkflowExecutionFromTaskExecutions(TaskExecutions taskExecutionsForOneWorkflowRun) {
Contributor Author:

This is the new function that aggregates tasks into a workflow execution for the execution status metric

}

@Override
public Optional<RunExecution> getWorkflowExecutionFromTaskExecutions(TaskExecutions taskExecutionsForOneWorkflowRun) {
Contributor Author:

This is the new function that aggregates tasks into a workflow execution for the execution time metric

}

@Override
public Optional<RunExecution> getWorkflowExecutionFromTaskExecutions(TaskExecutions taskExecutionsForOneWorkflowRun) {
Contributor Author:

This is the new function that aggregates tasks into a workflow execution for the memory metric

Comment on lines 69 to 79
final List<RunExecution> workflowExecutions = new ArrayList<>(allSubmissions.getRunExecutions());

// If task executions are present, calculate the workflow RunExecution containing the overall workflow-level execution time for each list of tasks
if (!allSubmissions.getTaskExecutions().isEmpty()) {
    final List<RunExecution> calculatedWorkflowExecutionsFromTasks = allSubmissions.getTaskExecutions().stream()
            .map(taskExecutions -> getWorkflowExecutionFromTaskExecutions(taskExecutions))
            .filter(Optional::isPresent)
            .map(Optional::get)
            .toList();
    workflowExecutions.addAll(calculatedWorkflowExecutionsFromTasks);
}
Contributor Author:

This is the change in the main aggregation function that aggregates task executions into workflow executions. The rest of the code in this function is existing logic that was moved over from each individual getAggregated<Metric> function in AggregationHelper

pom.xml (Outdated)
@@ -38,7 +38,7 @@

     <github.url>scm:git:git@github.com:dockstore/dockstore-support.git</github.url>
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-    <dockstore-core.version>1.15.0-alpha.5</dockstore-core.version>
+    <dockstore-core.version>1.15.0-SNAPSHOT</dockstore-core.version>
Contributor Author:

I will update this to a tag when dockstore/dockstore#5738 is merged

codecov bot commented Nov 10, 2023

Codecov Report

Attention: 29 lines in your changes are missing coverage. Please review.

Comparison is base (ff68caf) 42.42% compared to head (1bd65af) 52.79%.

Files Patch % Lines
...store/metricsaggregator/helper/CostAggregator.java 82.22% 2 Missing and 6 partials ⚠️
...ricsaggregator/helper/ExecutionTimeAggregator.java 86.53% 1 Missing and 6 partials ⚠️
...csaggregator/helper/ExecutionStatusAggregator.java 81.25% 2 Missing and 4 partials ⚠️
...kstore/metricsaggregator/helper/CpuAggregator.java 87.87% 1 Missing and 3 partials ⚠️
...ore/metricsaggregator/helper/MemoryAggregator.java 87.09% 1 Missing and 3 partials ⚠️
Additional details and impacted files
@@              Coverage Diff               @@
##             develop     #476       +/-   ##
==============================================
+ Coverage      42.42%   52.79%   +10.36%     
- Complexity       171      248       +77     
==============================================
  Files             24       30        +6     
  Lines           1591     1665       +74     
  Branches         131      141       +10     
==============================================
+ Hits             675      879      +204     
+ Misses           878      721      -157     
- Partials          38       65       +27     
Flag Coverage Δ
metricsaggregator 43.42% <87.16%> (+0.99%) ⬆️
toolbackup 42.76% <83.62%> (+10.64%) ⬆️
tooltester 33.39% <83.62%> (+1.27%) ⬆️

Flags with carried forward coverage won't be shown.


@kathy-t kathy-t marked this pull request as ready for review November 10, 2023 21:34
@coverbeck (Contributor) left a comment:

There are failed tool-backup tests which I think is because I updated the dockstore version. Anyone have any ideas about how to resolve it?

Not off-hand, I'm not familiar with the code. I'd be OK with disabling it for now, with a followup ticket. Or Denis should be back soon and should know more about it.

.max(Date::compareTo);

if (earliestTaskExecutionDate.isPresent() && latestTaskExecutionDate.isPresent()) {
long durationInMs = latestTaskExecutionDate.get().getTime() - earliestTaskExecutionDate.get().getTime();


If we're trying to calculate entry-level "wall clock" execution time, assuming the task executionTime is also wall clock time, and the execution dates are defined as the start time of task execution, you could compute a lastTaskEndedDate by incorporating the task's executionTime (if it exists), and from that, a more accurate duration as:

 long durationInMs = latestTaskEndedDate.get().getTime() - earliestTaskExecutionDate.get().getTime();
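A minimal sketch of this first suggestion, assuming each task's execution date is its start time and executionTime is a wall-clock duration. All names here are hypothetical, not the actual Dockstore code:

```java
import java.time.Duration;
import java.util.Date;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the reviewer's suggestion: treat each task's
// execution date as its start time, add its executionTime to get an end
// time, and use latest-end minus earliest-start as the wall-clock duration.
public class WallClockEstimate {

    record Task(Date started, Duration executionTime) { }

    static Optional<Duration> estimate(List<Task> tasks) {
        Optional<Date> earliestStart = tasks.stream().map(Task::started).min(Date::compareTo);
        Optional<Date> latestEnd = tasks.stream()
                .map(t -> new Date(t.started().getTime() + t.executionTime().toMillis()))
                .max(Date::compareTo);
        if (earliestStart.isPresent() && latestEnd.isPresent()) {
            return Optional.of(Duration.ofMillis(latestEnd.get().getTime() - earliestStart.get().getTime()));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Two overlapping tasks: one starts at t=0 and runs 5 min, another
        // starts at t=2 min and runs 10 min, so the run spans 12 min overall.
        List<Task> tasks = List.of(
                new Task(new Date(0), Duration.ofMinutes(5)),
                new Task(new Date(120_000), Duration.ofMinutes(10)));
        System.out.println(estimate(tasks).get().toMinutes()); // 12
    }
}
```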


Another idea is to compute "wall clock" duration estimates via several different methods: maximum of all execution times, duration as difference between min/max execution dates, etc. If we know that each method produces a lower bound on the duration, take the max at the end.
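A sketch of this second idea, assuming each estimator is known to under-estimate the true wall-clock duration, so the largest estimate is the tightest lower bound (names hypothetical):

```java
import java.time.Duration;
import java.util.List;

// Hypothetical sketch: each estimation method produces a lower bound on the
// true wall-clock duration, so taking the max yields the tightest bound.
public class DurationBounds {

    static Duration bestLowerBound(List<Duration> lowerBoundEstimates) {
        return lowerBoundEstimates.stream()
                .max(Duration::compareTo)
                .orElse(Duration.ZERO);
    }

    public static void main(String[] args) {
        // e.g. longest single task time = 10 min; spread of start dates = 7 min
        Duration best = bestLowerBound(List.of(Duration.ofMinutes(10), Duration.ofMinutes(7)));
        System.out.println(best.toMinutes()); // 10
    }
}
```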

Contributor Author:

I implemented your first suggestion by adding the executionTime of the last task to the calculated wall-clock execution time

.minimum(newStatistic.getMinimum())
.maximum(newStatistic.getMaximum())
.average(newStatistic.getAverage())
.numberOfDataPointsForAverage(newStatistic.getNumberOfDataPoints()));
@svonworl commented Nov 14, 2023:

This code repeats a lot, possibly we could create a function that does the same thing in the base Metric, so we could shorten to something like:

return Optional.of(new MemoryMetric().set(newStatistic))?

Contributor Author:

Attempted to create a helper function in Statistics that would do this, but it requires that the StatisticMetric subclasses (CpuMetric, MemoryMetric, ExecutionTimeMetric, CostMetric) define their inheritance structure in the openapi.yaml. I took a crack at this and it resulted in some weird side effects. It felt a little risky pursuing this any further since we're so close to the release of 1.15 and I didn't want to accidentally break anything.

Perhaps it can be an enhancement for 1.16

Contributor Author:

These were the changes that were needed to get the toolbackup tests to pass, FYI @denis-yuen

@denis-yuen (Member) commented:

SonarCloud Quality Gate failed.

Think these are trivial but worth it if going in for something else

List<Cost> taskCosts = taskExecutions.stream()
.map(RunExecution::getCost)
.toList();
boolean containsMalformedCurrencies = taskCosts.stream().anyMatch(cost -> !isValidCurrencyCode(cost.getCurrency()));
Member:

Not an issue yet since we don't have examples of cost data yet. But it could be a follow-up issue: if I understand properly, the total cost is skipped entirely when any currencies are malformed. Would it be an improvement to instead provide a total of the properly formed currencies, with an asterisk or warning?

Member:

On the other hand, it might be ok if this is just ignoring one workflow run with malformed currencies and not other runs. (which I think is the case)

So maybe, just confirm my understanding.

Contributor Author:

Would it be an improvement to instead provide a total of the properly formed currencies, with an asterisk or warning?

Hmm, in this function, I'm not sure if it would be worth it since these tasks make up one workflow execution.

On the other hand, it might be ok if this is just ignoring one workflow run with malformed currencies and not other runs. (which I think is the case)

In this function, which aggregates the task execution metrics belonging to a single workflow execution into one workflow-level execution metric, your understanding is correct: it will just ignore the one set of task executions belonging to one workflow run.

However, in the function below where we get the aggregated metric from workflow executions, we don't aggregate the metric if one of the executions is malformed. This was an existing pattern prior to this PR and also applies to the execution time metric. It sounds like we should instead ignore the one malformed workflow run and aggregate the others?
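The alternative being discussed (summing only the well-formed costs and flagging that some were skipped) might be sketched like this. All names are hypothetical, and java.util.Currency stands in for whatever currency validation Dockstore actually uses:

```java
import java.util.Currency;
import java.util.List;

// Hypothetical sketch of the suggested alternative: sum only the costs
// whose currency codes are valid and flag that some were skipped, instead
// of refusing to aggregate the whole set when any currency is malformed.
public class PartialCostSum {

    record Cost(double value, String currency) { }

    record Result(double total, boolean someSkipped) { }

    static boolean isValidCurrencyCode(String code) {
        try {
            Currency.getInstance(code); // throws on unknown ISO 4217 codes
            return true;
        } catch (IllegalArgumentException | NullPointerException e) {
            return false;
        }
    }

    static Result sumValidCosts(List<Cost> costs) {
        double total = costs.stream()
                .filter(c -> isValidCurrencyCode(c.currency()))
                .mapToDouble(Cost::value)
                .sum();
        boolean someSkipped = costs.stream().anyMatch(c -> !isValidCurrencyCode(c.currency()));
        return new Result(total, someSkipped);
    }

    public static void main(String[] args) {
        Result r = sumValidCosts(List.of(
                new Cost(1.50, "USD"), new Cost(0.75, "USD"), new Cost(2.00, "NOPE")));
        System.out.println(r.total() + " (skipped some: " + r.someSkipped() + ")");
    }
}
```

The someSkipped flag is where the "asterisk or warning" from the discussion would hang off.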


if (earliestTaskExecutionDate.isPresent() && latestTaskExecutionDate.isPresent() && latestTaskExecuted.isPresent()) {
    // Execution dates are the start dates; calculate a rough duration from the execution dates of the earliest and latest tasks
    long durationInMs = latestTaskExecutionDate.get().getTime() - earliestTaskExecutionDate.get().getTime();
    Duration duration = Duration.of(durationInMs, ChronoUnit.MILLIS);
Member:

FWIW, I can't imagine people would care much about anything more granular than minutes. Assuming the most granular billing is per minute.

That said, this probably gets converted for display anyway

@@ -228,6 +229,11 @@
<groupId>org.glassfish.hk2</groupId>
<artifactId>hk2-api</artifactId>
</dependency>
<dependency>
Member:

Think this is one of the artifacts that is removed in Java 11
https://stackoverflow.com/questions/52502189/java-11-package-javax-xml-bind-does-not-exist

I think a follow-up ticket should update this to jakarta but probably not a high priority.


sonarcloud bot commented Nov 21, 2023

SonarCloud Quality Gate failed.

0 Bugs (rated A)
0 Vulnerabilities (rated A)
0 Security Hotspots (rated A)
0 Code Smells (rated A)

0.0% Coverage
0.0% Duplication


@kathy-t kathy-t merged commit f23a0a5 into develop Nov 21, 2023
10 of 11 checks passed
@kathy-t kathy-t deleted the feature/seab-5944/aggregate-task-metrics branch November 21, 2023 15:51