kube-state-metrics API scraping timeout #995
Comments
Thanks for the issue.
Seems like your use case is very specific. Have you tried disabling the pod metrics resource in kube-state-metrics? Does it make a difference?
Or potentially configure kube-state-metrics not to watch the namespaces with the problematic workload.
@lilic Thanks for the response. Yeah, agreed, our use case is special. I stood up a local kube-state-metrics with some timing instrumentation, and the pod collector takes the majority of the time, so I'm pretty sure disabling it would fix things, but I don't think that's an option for us. @brancz Thanks for the info; that might be something we can look at if there are no other solutions, but our namespaces can also be dynamic, so it won't be ideal. I'm now leaning toward exploring completely lock-free maps to enhance kube-state-metrics.
@zhengl7 Note that we have benchmark tests in place, and some of those things were done for optimisation purposes, so if performance decreases we won't accept the PR. But we are happy to see any PRs that solve bugs :)
@lilic Absolutely. I just got a version that solves our problem: it reduces the scraping call duration from tens of seconds to sub-second during those peak traffic minutes, on the same cluster. I'll try running the test scripts locally.
@lilic Just submitted the PR with the proposed change. I ran the test scripts locally.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
There are a few documented scenarios where `kube-state-metrics` will lock up (kubernetes#995, kubernetes#1028). I believe a much simpler way to ensure `kube-state-metrics` doesn't lock up and require a restart to serve `/metrics` requests is to add default read and write timeouts and to allow them to be configurable.

At Grafana, we've experienced a few scenarios where `kube-state-metrics` running in larger clusters falls behind and starts getting scraped multiple times. When this occurs, `kube-state-metrics` becomes completely unresponsive and requires a restart. This is fairly easily reproducible (I'll provide a script in an issue) and causes other critical workloads (KEDA, VPA) to fail in strange ways.

This change adds two flags:

- `server-read-timeout`
- `server-write-timeout`

It updates the metrics HTTP server to set `ReadTimeout` and `WriteTimeout` to the configured values.
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
We are using kube-state-metrics (1.6.0) from prometheus-operator. Our cluster periodically creates hundreds of pods and deletes them within a 15-minute window to handle bursts of traffic. When this happens, the kube-state-metrics target in Prometheus will often hit the API timeout (default 10 seconds). We suspect this is due to the mutex locking mechanism used for `MetricsStore`: when heavy writing of new entries into the map happens, it blocks reading for an extended time.

What you expected to happen:
A more consistent API response time that doesn't cause Prometheus scraping timeouts.
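The suspected contention pattern can be sketched as follows. This is a minimal illustration under the issue's hypothesis, not the actual kube-state-metrics `MetricsStore` implementation: every write takes the exclusive lock, so a burst of pod churn keeps scrape-time readers waiting.

```go
package main

import (
	"fmt"
	"sync"
)

// MetricsStore is a simplified stand-in: a map of object UID to rendered
// metric lines, guarded by a single RWMutex.
type MetricsStore struct {
	mu      sync.RWMutex
	metrics map[string][]string
}

func NewMetricsStore() *MetricsStore {
	return &MetricsStore{metrics: make(map[string][]string)}
}

// Add is called for every object create/update. It takes the exclusive
// write lock, which blocks all concurrent readers.
func (s *MetricsStore) Add(uid string, lines []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.metrics[uid] = lines
}

// Snapshot is what a /metrics scrape does: it must wait until no writer
// holds the lock before it can read the whole map.
func (s *MetricsStore) Snapshot() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	var out []string
	for _, lines := range s.metrics {
		out = append(out, lines...)
	}
	return out
}

func main() {
	s := NewMetricsStore()
	s.Add("pod-1", []string{`kube_pod_info{pod="pod-1"} 1`})
	fmt.Println(len(s.Snapshot())) // prints 1
}
```

During a burst of hundreds of pod creations and deletions, `Add` is called continuously, and each call briefly excludes readers; under enough write pressure, a `Snapshot` can be delayed past the scrape timeout.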
How to reproduce it (as minimally and precisely as possible):
Our cluster normally has 30k metrics in total, roughly 25k of which come from the pod collector (there seem to be 27 `pod_info_` metrics, and we have close to 1000 pods). When the burst traffic comes, the cluster creates a few hundred pods, pushing the total to around 60k metrics, most of them added as `pod_info_` metrics.
Response size doesn't seem to be an issue: with 60k metrics, the total response is roughly 10 MB. When the scraping call doesn't time out, it can respond in 100 to 200 ms.
Single kube-state-metrics pod.
Anything else we need to know?:
Tried modifying kube-state-metrics to use a `sync.Map` for `MetricsStore` in the hope of better concurrency. It seems to relieve the problem a little, but when the flood comes in it can still take seconds for kube-state-metrics to respond, as `sync.Map` is really only good for infrequently changed map entries, and in our case there is always a stream of new entries being created.

I think padding the gap in Prometheus with the previous sample should be fine. Not sure if it makes sense for kube-state-metrics to provide this behavior: something like keeping a copy of the previous response and, if reading from the collectors takes too long, simply serving the copy.
Environment:
- Kubernetes version (use `kubectl version`): 1.13
- kube-state-metrics version: 1.6.0