NSCA host puts service into UNKNOWN state, but client seems operational #38

Open
1 task
phijor opened this issue Jul 23, 2021 · 0 comments
Labels
bug Something isn't working

Comments

Collaborator

phijor commented Jul 23, 2021

We encountered an error where the NSCA host displayed a service in state UNKNOWN because it had not received check results for a long time. Nonetheless, the metricq-sink-nsca client seemed to run without any issues.
After restarting the client, the problem vanished and the service state recovered.

This might have been caused by one of the following:

  • a bug in metricq-sink-nsca causes it to continue consuming metric data without sending any new reports.
  • (unlikely, judging by the NSCA host logs) metricq-sink-nsca was fully functional and successfully sent check results, but the NSCA host dropped them along the way.

In the latter case, we should debug the problem by logging whenever a report is sent successfully. I released version 1.8.1, which adds more log messages when reports are sent.
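For illustration, logging every send attempt could look roughly like the sketch below. This is not the code from 1.8.1: `ReportSender`, the logger name, and the signature of the `send` callable are all hypothetical and only stand in for whatever actually submits the check result to the NSCA host.

```python
import logging
import time
from typing import Callable, Optional

# Hypothetical logger name, not the one used by metricq-sink-nsca.
logger = logging.getLogger("nsca_report_sketch")


class ReportSender:
    """Wrap a send callable and log the outcome of every attempt (hypothetical helper)."""

    def __init__(self, send: Callable[[str, str, int, str], None]) -> None:
        # `send` is assumed to raise an exception when submission fails.
        self._send = send
        self.sent_count = 0
        self.last_sent_at: Optional[float] = None

    def send_report(self, host: str, service: str, state: int, message: str) -> bool:
        try:
            self._send(host, service, state, message)
        except Exception:
            # Log the traceback so failed sends are visible in the client log.
            logger.exception("Failed to send report for %s/%s", host, service)
            return False
        self.sent_count += 1
        self.last_sent_at = time.monotonic()
        logger.info(
            "Sent report #%d for %s/%s (state=%d)",
            self.sent_count, host, service, state,
        )
        return True
```

With timestamps in the client log, a gap between "Sent report" lines can then be compared against the NSCA host logs to see on which side the reports went missing.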

TODO:

  • actually figure out why the client sporadically stops sending reports
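One possible way to catch this condition when it happens again is a small watchdog that notices when metric data is still being consumed but no report has been sent for a while. The following is only a sketch under assumptions: `SendWatchdog`, its method names, and the `max_silence` threshold are hypothetical and not part of metricq-sink-nsca.

```python
import time
from typing import Callable, Optional


class SendWatchdog:
    """Detect 'still consuming, but not sending' (hypothetical helper)."""

    def __init__(
        self,
        max_silence: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_silence = max_silence  # seconds without a report before alarming
        self._clock = clock              # injectable for testing
        self._last_consumed: Optional[float] = None
        self._last_sent = clock()        # treat startup as the last send

    def on_data_consumed(self) -> None:
        self._last_consumed = self._clock()

    def on_report_sent(self) -> None:
        self._last_sent = self._clock()

    def is_stalled(self) -> bool:
        now = self._clock()
        # Data arrived recently, so the consumer side is alive.
        consuming = (
            self._last_consumed is not None
            and now - self._last_consumed <= self._max_silence
        )
        # No report has left the client for longer than allowed.
        silent = now - self._last_sent > self._max_silence
        return consuming and silent
```

Calling `is_stalled()` periodically and logging a warning (or exiting so that a supervisor restarts the client) would at least record when the stall begins, which should narrow down the cause.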
phijor added the bug label Jul 23, 2021
phijor added a commit that referenced this issue Jul 23, 2021
This should make debugging issues like #38 a bit easier.