integrate netobserv to reliability tests #798

memodi · 2024-10-29T21:19:35Z

NETOBSERV-1874 integrate netobserv to reliability tests
dependent on openshift-qe/ocp-qe-perfscale-ci#659

Additionally, it adds some improvements and fixes some bugs:

add trap/cleanup code.
convert the dittybopper code into a function and fix a bug in the check for the performance-dashboard directory.
import dashboards.

openshift-ci · 2024-10-29T21:20:06Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: memodi
Once this PR has been reviewed and has the lgtm label, please assign mffiedler for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

memodi · 2024-10-29T21:22:53Z

/cc @qiliRedHat

reliability-v2/config/reliability-netobserv.yaml

reliability-v2/tasks/Tasks.py

Co-authored-by: Qiujie Li <81630527+qiliRedHat@users.noreply.github.com>

qiliRedHat · 2024-10-31T10:19:34Z

I can add lgtm after openshift-qe/ocp-qe-perfscale-ci#659 is merged and the integration test is done.

memodi · 2024-10-31T18:41:17Z

reliability-v2/config/reliability-netobserv.yaml

@qiliRedHat do you want this to be a separate config? I was thinking of putting these tasks under the standard reliability tasks so that these checks always run.
Of course, that also means changing things so that the netobserv operator is installed by default for all reliability runs and can optionally be excluded; the current implementation in start.sh is the inverse. wdyt? This was the initial idea we had discussed, so that netobserv could piggyback off your standard reliability runs.

If netobserv is installed by default, we'd need to bump the standard instance types or otherwise have a configuration that accommodates Loki's resource requirements. A couple of questions here:

Do you typically have infra nodes set up for reliability runs? If so, what are the instance types of the infra nodes? If you can bring up that environment, I can check whether Loki fits on the infra nodes.

Would you be willing to bump the reliability run instance types to m5.2xlarge or m5.4xlarge?

Thanks!

qiliRedHat · 2024-11-01

@memodi The reliability test has many test profiles. Considering the cost, I don't want to have noo on all profiles by default. Usually there are no infra nodes in the reliability test, but reliability has an option to configure them with -i in start.sh; the default size should be the same as the workers, and the size can be configured.
If we test it once (7 days) per release, I think we can do m5.2xlarge on one of the profiles.

memodi · 2024-11-08T20:33:19Z

/unhold

qiliRedHat · 2024-11-11T08:38:19Z

reliability-v2/tasks/Tasks.py

+            rc_return = 1
+        elif rc == 1 and result == "":
+            self.logger.info(f"Flowcollector is Ready.")
+            rc_return = 0


@memodi You can add an 'else' to cover the rest of the cases: rc == 1 and result != "", or rc != 1. Also log the result there so you can see what you actually get back.
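
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the code in Tasks.py: the function name `classify_check`, the `what` label, the logger name, and the assumption that the elided first branch is `rc == 0` are all hypothetical. Only the `rc == 1 and result == ""` branch and the `rc_return` values come from the quoted hunk.

```python
import logging

# Hypothetical logger name; Tasks.py uses self.logger instead.
logger = logging.getLogger("reliability-sketch")


def classify_check(rc: int, result: str, what: str = "Flowcollector") -> int:
    """Map the (rc, result) pair of a shell check to 0 (healthy) or 1 (failed)."""
    if rc == 0:
        # Assumption: the branch elided from the hunk treats rc == 0 as
        # "the check found something that is not ready".
        logger.error(f"{what} is not Ready: {result}")
        rc_return = 1
    elif rc == 1 and result == "":
        # Branch taken from the quoted hunk: nothing matched, so all is well.
        logger.info(f"{what} is Ready.")
        rc_return = 0
    else:
        # Catch-all requested in review: rc == 1 with output, or rc != 1.
        # Log both values so an unexpected outcome can be diagnosed.
        logger.error(f"Unexpected result for {what}: rc={rc}, result={result!r}")
        rc_return = 1
    return rc_return
```

With the catch-all in place, `rc_return` is always assigned, and an unexpected exit code is reported as a failure instead of being silently ignored.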

qiliRedHat · 2024-11-11T08:38:36Z

reliability-v2/tasks/Tasks.py

+                rc_return = 1
+            elif rc == 1 and result == "":
+                self.logger.info(f"Pods in ns ${ns} are healthy.")
+                rc_return = 0


Same comment as above.
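
One more detail in the hunk above may be worth a look: the log line writes `${ns}` inside a Python f-string. In an f-string only the braces mark a placeholder, so the `$` is printed literally. A small sketch of the difference, where the namespace value `netobserv` is only an illustrative assumption:

```python
ns = "netobserv"  # illustrative namespace value, not taken from the PR

# Shell-style interpolation inside an f-string keeps the dollar sign.
shell_style = f"Pods in ns ${ns} are healthy."

# Plain f-string placeholder, which is probably what was intended.
python_style = f"Pods in ns {ns} are healthy."

print(shell_style)
print(python_style)
```

Whether the `$` is intentional is for the author to confirm; the sketch only shows how Python formats each variant.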

integrate netobserv to reliability tests

370e012

openshift-ci bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Oct 29, 2024

openshift-ci bot requested review from mehabhalodiya and mffiedler October 29, 2024 21:20

revert formatting

d9929c3

openshift-ci bot requested a review from qiliRedHat October 29, 2024 21:22

qiliRedHat reviewed Oct 30, 2024

reliability-v2/config/reliability-netobserv.yaml (outdated, resolved)

reliability-v2/tasks/Tasks.py (outdated, resolved)

reliability-v2/tasks/Tasks.py (outdated, resolved)

memodi and others added 4 commits October 30, 2024 11:25

Apply suggestions from code review

60ce171

Co-authored-by: Qiujie Li <81630527+qiliRedHat@users.noreply.github.com>

review comments

38b34a1

add clean up

e297dec

re-org

12ddc74

memodi commented Oct 31, 2024

memodi added 3 commits November 4, 2024 20:40

dashboard

44442e2

add trap to start.sh and update to reliability-netobserv.sh config

967ee96

remove debug repo

b27f657

openshift-ci bot removed the do-not-merge/hold label Nov 8, 2024

qiliRedHat reviewed Nov 11, 2024
