Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/prometheusremotewriteexporter] Retry on 5xx status codes using cenkalti/backoff client #23842

Merged
merged 26 commits into from
Aug 29, 2023

Conversation

vasireddy99
Copy link
Contributor

@vasireddy99 vasireddy99 commented Jun 29, 2023

Description: Fix the Retry on 5xx status code using cenkalti/backoff package and added unit test.

Thanks to @rapphil for proposing the solution.

Link to tracking Issue: #20304

Testing: Added unit tests for retryOn5xx and noRetryOn4xx

Prometheus compliance test results -

--- PASS: TestRemoteWrite (20.10s)
    --- PASS: TestRemoteWrite/otelcollector (0.01s)
        --- PASS: TestRemoteWrite/otelcollector/NameLabel (10.05s)
        --- PASS: TestRemoteWrite/otelcollector/Invalid (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/RepeatedLabels (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/Retries400 (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/SortedLabels (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/Up (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/JobLabel (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/HonorLabels (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/EmptyLabels (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/Counter (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/InstanceLabel (10.06s)
        --- PASS: TestRemoteWrite/otelcollector/Ordering (17.10s)
        --- PASS: TestRemoteWrite/otelcollector/Histogram (10.02s)
        --- PASS: TestRemoteWrite/otelcollector/Gauge (10.02s)
        --- PASS: TestRemoteWrite/otelcollector/Headers (10.02s)
        --- PASS: TestRemoteWrite/otelcollector/Retries500 (10.03s)
        --- PASS: TestRemoteWrite/otelcollector/Staleness (10.03s)
        --- PASS: TestRemoteWrite/otelcollector/Summary (10.03s)
        --- PASS: TestRemoteWrite/otelcollector/Timestamp (10.03s)
PASS
ok      github.com/prometheus/compliance/remote_write   20.368s

@vasireddy99 vasireddy99 changed the title Prw5xx [exporter/prometheusremotewrite] Retry on 5xx status codes using go-retryablehttp client Jul 3, 2023
@vasireddy99 vasireddy99 changed the title [exporter/prometheusremotewrite] Retry on 5xx status codes using go-retryablehttp client [exporter/prometheusremotewriteexporter] Retry on 5xx status codes using go-retryablehttp client Jul 3, 2023
@vasireddy99 vasireddy99 force-pushed the prw5xx branch 7 times, most recently from 1cc170d to a03bc9f Compare July 5, 2023 17:35
@Aneurysm9 Aneurysm9 added the ready to merge Code review completed; ready to merge by maintainers label Aug 15, 2023
@dmitryax
Copy link
Member

Do you want to remove exporterhelper.WithRetry from the factory, given that it's handled separately now?

@vasireddy99
Copy link
Contributor Author

vasireddy99 commented Aug 21, 2023

Do you want to remove exporterhelper.WithRetry from the factory, given that it's handled separately now?

I would leave the setting for now and can take the follow up on this. As I don't want to disable retries in this PR.

@rapphil
Copy link
Contributor

rapphil commented Aug 21, 2023

@vasireddy99 I think what @dmitryax suggested is reasonable and valid to include in the scope of this PR: we never return a retryable error and this component cannot handle retries in the helper level, therefore we should remove this capability of this component here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/prometheusremotewriteexporter/factory.go#L60

@vasireddy99
Copy link
Contributor Author

@vasireddy99 I think what @dmitryax suggested is reasonable and valid to include in the scope of this PR: we never return a retryable error and this component cannot handle retries in the helper level, therefore we should remove this capability of this component here: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/prometheusremotewriteexporter/factory.go#L60

Makes sense, I agree with that to, but wanted to discuss and also double nthe usage. Will commit the changes

@vasireddy99
Copy link
Contributor Author

Commited the changes.
This Failure is not related to the PR- quick note.

@dmitryax
Copy link
Member

@vasireddy99, doesn't seem like anything new was pushed, just merged main

@vasireddy99
Copy link
Contributor Author

@vasireddy99, doesn't seem like anything new was pushed, just merged main

Thanks @dmitryax, Please review

@vasireddy99
Copy link
Contributor Author

This PR #24729 shall also be review/merged after this PR is merged. Just adding the note.

@dmitryax dmitryax closed this Aug 29, 2023
@dmitryax dmitryax reopened this Aug 29, 2023
@codecov
Copy link

codecov bot commented Aug 29, 2023

Codecov Report

Patch coverage is 87.75% of modified lines.

Files Changed Coverage
exporter/prometheusremotewriteexporter/factory.go ø
exporter/prometheusremotewriteexporter/exporter.go 87.75%

📢 Thoughts on this report? Let us know!.

@dmitryax dmitryax merged commit 8715a5c into open-telemetry:main Aug 29, 2023
165 of 182 checks passed
@github-actions github-actions bot added this to the next release milestone Aug 29, 2023
@vasireddy99 vasireddy99 deleted the prw5xx branch August 29, 2023 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exporter/prometheusremotewrite ready to merge Code review completed; ready to merge by maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants