Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI][WIP] Scrape metrics at the end of e2e test #6330

Open
wants to merge 47 commits into
base: main
Choose a base branch
from

Conversation

chahatsagarmain
Copy link
Contributor

@chahatsagarmain chahatsagarmain commented Dec 9, 2024

Which problem is this PR solving?

Description of the changes

  • scrape script and usage in cit workflow
  • diff calculating script
  • cache save and restore from main workflow runs
  • A sample diff (txt)
--- 
+++ 
@@ -303,2 +303,2 @@
-rpc_server_requests_per_rpc{le="+Inf",rpc_grpc_status_code="0",rpc_method="Export",rpc_service="opentelemetry.proto.collector.trace.v1.TestService",rpc_system="grpc",service_name="jaeger",service_version=""}
-rpc_server_requests_per_rpc{le="0",rpc_grpc_status_code="0",rpc_method="Export",rpc_service="opentelemetry.proto.collector.trace.v1.TraceService",rpc_system="grpc",service_name="jaeger",service_version=""}
+rpc_server_requests_per_rpc{le="+Inf",rpc_grpc_status_code="0",rpc_method="Export",rpc_service="opentelemetry.proto.collector.trace.v1.TraceService",rpc_system="grpc",service_name="jaeger",service_version=""}
+rpc_server_requests_per_rpc{le="0",rpc_grpc_status_code="1",rpc_method="Export",rpc_service="opentelemetry.proto.collector.trace.v1.TraceService",rpc_system="grpc",service_name="jaeger",service_version=""}

@@ -321 +321 @@
-rpc_server_response_size{le="+Inf",rpc_method="Export",rpc_service="opentelemetry.proto.collector.trace.v1.TraceService",rpc_system="grpc",service_name="jaeger",service_version=""}
+rpc_server_response_size{le="+Inf",rpc_service="opentelemetry.proto.collector.trace.v1.TraceService",rpc_system="grpc",service_name="jaeger",service_version="",test_change="Export"}

@@ -338 +338 @@
-rpc_server_response_size{rpc_method="Export",rpc_service="opentelemetry.proto.collector.trace.v1.TraceService",rpc_system="grpc",service_name="jaeger",service_version=""}
+rpc_server_response_size{rpc_method="Export",rpc_service="opentelemetry.proto.collector.trace.v1.TraceService",rpc_system="grpc",service_name="test-jaeger",service_version=""}

How was this change tested?

Checklist

Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
@chahatsagarmain chahatsagarmain requested a review from a team as a code owner December 9, 2024 15:46
Copy link

codecov bot commented Dec 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.29%. Comparing base (980dc31) to head (41693c5).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6330      +/-   ##
==========================================
+ Coverage   96.27%   96.29%   +0.01%     
==========================================
  Files         372      372              
  Lines       21283    21278       -5     
==========================================
- Hits        20491    20489       -2     
+ Misses        605      603       -2     
+ Partials      187      186       -1     
Flag Coverage Δ
badger_v1 10.68% <ø> (+<0.01%) ⬆️
badger_v2 2.78% <ø> (+<0.01%) ⬆️
cassandra-4.x-v1-manual 16.58% <ø> (+<0.01%) ⬆️
cassandra-4.x-v2-auto 2.71% <ø> (+<0.01%) ⬆️
cassandra-4.x-v2-manual 2.71% <ø> (+<0.01%) ⬆️
cassandra-5.x-v1-manual 16.58% <ø> (+<0.01%) ⬆️
cassandra-5.x-v2-auto 2.71% <ø> (+<0.01%) ⬆️
cassandra-5.x-v2-manual 2.71% <ø> (+<0.01%) ⬆️
elasticsearch-6.x-v1 20.25% <ø> (+0.01%) ⬆️
elasticsearch-7.x-v1 20.33% <ø> (+0.01%) ⬆️
elasticsearch-8.x-v1 20.48% <ø> (+0.01%) ⬆️
elasticsearch-8.x-v2 2.77% <ø> (+<0.01%) ⬆️
grpc_v1 12.20% <ø> (-0.12%) ⬇️
grpc_v2 9.03% <ø> (-0.07%) ⬇️
kafka-3.x-v1 10.36% <ø> (+<0.01%) ⬆️
kafka-3.x-v2 2.78% <ø> (+<0.01%) ⬆️
memory_v2 2.77% <ø> (-0.01%) ⬇️
opensearch-1.x-v1 20.37% <ø> (+0.01%) ⬆️
opensearch-2.x-v1 20.37% <ø> (+<0.01%) ⬆️
opensearch-2.x-v2 2.77% <ø> (-0.01%) ⬇️
tailsampling-processor 0.51% <ø> (+<0.01%) ⬆️
unittests 95.16% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
@chahatsagarmain chahatsagarmain marked this pull request as draft December 9, 2024 16:25
@chahatsagarmain chahatsagarmain changed the title [CI] Scrape metrics at the end of e2e test [CI][WIP] Scrape metrics at the end of e2e test Dec 9, 2024
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
@chahatsagarmain chahatsagarmain marked this pull request as ready for review December 14, 2024 00:00
cmd/jaeger/internal/integration/e2e_integration.go Outdated Show resolved Hide resolved
.github/workflows/ci-e2e-all.yml Outdated Show resolved Hide resolved
scripts/es-integration-test.sh Outdated Show resolved Hide resolved
.github/workflows/ci-e2e-cassandra.yml Outdated Show resolved Hide resolved
.github/workflows/ci-e2e-cassandra.yml Show resolved Hide resolved
.github/workflows/ci-e2e-cassandra.yml Outdated Show resolved Hide resolved
.github/workflows/ci-e2e-all.yml Show resolved Hide resolved
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
.gitignore Outdated Show resolved Hide resolved
chahatsagarmain and others added 4 commits December 29, 2024 01:48
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahat sagar <109112505+chahatsagarmain@users.noreply.github.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
if: ${{ (github.ref_name == 'main') && (steps.check-cache.outputs.cache-hit != 'true') }}
shell: bash
run: |
mv ./.metrics/${{ inputs.snapshot }} ./.metrics/cached_${{ inputs.snapshot}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this?

with:
path: ./.metrics/cached_${{ inputs.snapshot}}
key: ${{ env.v2_tag }}_${{ inputs.artifact_key }}
lookup-only: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why lookup-only? Is the step going to hard fail otherwise?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the purpose of this step. We already have download-release-snapshot step below.

if: ${{ (github.ref_name == 'main') && (steps.check-cache.outputs.cache-hit != 'true') }}
uses: actions/cache/save@1bd1e32a3bdc45362d1e726936510720a7c30a57
with:
path: ./.metrics/cached_${{ inputs.snapshot}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to refer to file name with cached_ prefix? The name is meaningless after upload, only artifact.key matters.

Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
- name: Cache scraped metrics for tagged release for longer retention
id: tagged-metrics
if: ${{ (github.ref_name == 'main') && (steps.check-cache.outputs.cache-hit != 'true') }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if: ${{ (github.ref_name == 'main') && (steps.check-cache.outputs.cache-hit != 'true') }}
if: ${{ (github.ref_name == 'main') }}

Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
chahatsagarmain and others added 3 commits January 10, 2025 00:59
Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
… into e2e-metrics

Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>
@chahatsagarmain
Copy link
Contributor Author

chahatsagarmain commented Jan 11, 2025

@yurishkuro Looks like github actions cache does not allow overwriting cache with the same key
-> We can now only allow to upload cache once , at the first worflow after release
-> Or add an extra step to delete that specific cache and re upload

Signed-off-by: chahatsagarmain <chahatsagar2003@gmail.com>

- name: Calculate diff between the snapshots
id: compare-snapshots
if: steps.download-release-snapshot.outputs.cache-hit == 'true'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this runs on main there wouldn't be a file named ./.metrics/${{ inputs.snapshot }} because it would've been moved two steps above

@yurishkuro
Copy link
Member

Looks like github actions cache does not allow overwriting cache with the same key
(1) We can now only allow to upload cache once , at the first worflow after release
(2) Or add an extra step to delete that specific cache and re upload

Option (1) is not good because we'd be always comparing to a month-old snapshot, so if several PRs since then make expected changes to the metrics then the diffs will be cumulative, which is not helpful.

Option (2) seems reasonable if we can do it. Isn't there some override flag in the cache/upload action?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog:ci Change related to continuous integration / testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants