Metrics for DNS probe failed? #739
Comments
You can determine this by probe_icmp_duration_seconds not having a resolve time. More generally, if DNS resolution fails, then it is correct that the whole ICMP probe fails, as a DNS outage is a serious problem and should generate alerts. It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.
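As a rough illustration of the check described above, here is a minimal Python sketch that classifies one failed ICMP probe as a DNS failure or an unreachable host. It assumes that "not having a resolve time" shows up as a missing or zero value for the resolve phase of probe_icmp_duration_seconds; that reading, the function name, and the flat dictionary layout are all assumptions made for this example, not part of blackbox exporter.

```python
from typing import Mapping

# Hypothetical key for the resolve-phase sample; the layout is an assumption
# made for this sketch, not an exporter or client-library format.
RESOLVE_KEY = 'probe_icmp_duration_seconds{phase="resolve"}'


def classify_icmp_probe(samples: Mapping[str, float]) -> str:
    """Return 'up', 'dns_failure', or 'host_down' for one probe result.

    Assumption: a missing or zero resolve time means DNS resolution failed.
    """
    if samples.get("probe_success", 0.0) == 1.0:
        return "up"
    # The probe failed: decide whether DNS or the host itself is to blame.
    if samples.get(RESOLVE_KEY, 0.0) <= 0.0:
        return "dns_failure"
    return "host_down"


if __name__ == "__main__":
    print(classify_icmp_probe({"probe_success": 0.0, RESOLVE_KEY: 0.0}))
    print(classify_icmp_probe({"probe_success": 0.0, RESOLVE_KEY: 0.004}))
```

The same distinction could in principle be expressed in an alerting rule by combining probe_success with the resolve-phase series, but the exact query depends on your labels and is better worked out on the mailing list.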
Hi Brian, Thanks for the details. I agree that a DNS outage is a serious problem, but during a DNS outage the ICMP check fails for every host, which is effectively a false alert. And if you have thousands of servers monitored by blackbox (ICMP), it will create false alerts for all of them. Regards,
I'd suggest setting up your group_by in Alertmanager so that you get only a single notification with all the alerts firing, rather than a notification per alert.
Hi Brian, Thanks for the suggestion! My scenario is to have an alert for each host, so group_by doesn't help. How can I correlate probe_icmp_duration_seconds and probe_success? I want to create a host-down alert only if the host resolved. I was trying to create a query something like the below. Do you think it is possible? Regards,
That's a PromQL usage question best taken to the mailing list.
OK. I have raised my question on the mailing list, but I am not sure how quickly I will get a reply.
Hi,
This is regarding the ICMP check.
If I am not wrong, blackbox exporter does a DNS probe and then does the ICMP check.
If the DNS probe fails for any reason (one such reason is the limit on concurrent requests in Docker, moby/libnetwork#2601), then blackbox exporter considers the ICMP check failed for that host. But here the actual issue is on the DNS side and the hosts are reachable. This could generate a lot of alerts. Can we have a metric for DNS probe status as well?
Regards,
Siju