
Change metrics store to use sync.Map #1028

Closed
wants to merge 1 commit into from

Conversation


@zhengl7 zhengl7 commented Jan 18, 2020

What this PR does / why we need it:
In the current version, the cache map for metrics cannot be accessed concurrently, so a mutex guards map reads and writes. As a result, heavy metric writes to the map contend with reads and can cause the Prometheus scrape call to time out. Using sync.Map removes the need for the mutex.

The non-concurrent map implementation led to the issue reported in #995: Prometheus scrapes time out constantly once the cluster reaches a certain level of new-pod churn. After deploying a version of kube-state-metrics with this change to the same cluster, the timeouts are completely gone.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #995
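
For illustration, here is a minimal sketch of the idea (the names and types are simplified stand-ins, not the actual kube-state-metrics store): informer-driven writes go straight into a `sync.Map`, and the scrape path ranges over it without taking a store-wide lock.

```go
package store

import (
	"io"
	"sync"
)

// MetricsStore is a simplified stand-in for the real store: it caches the
// pre-rendered metric strings per object UID in a sync.Map instead of a
// mutex-guarded map, so writes no longer block scrape-time reads.
type MetricsStore struct {
	metrics sync.Map // map[string][]string, keyed by object UID
}

// Add is called from informer event handlers; it needs no store-wide lock.
func (s *MetricsStore) Add(uid string, families []string) {
	s.metrics.Store(uid, families)
}

// Delete drops all cached metrics for an object.
func (s *MetricsStore) Delete(uid string) {
	s.metrics.Delete(uid)
}

// WriteAll streams every cached metric to w; Range does not hold a global
// lock while the response is being written.
func (s *MetricsStore) WriteAll(w io.Writer) {
	s.metrics.Range(func(_, v interface{}) bool {
		for _, f := range v.([]string) {
			io.WriteString(w, f)
		}
		return true
	})
}
```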

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 18, 2020
@k8s-ci-robot
Contributor

Welcome @zhengl7!

It looks like this is your first PR to kubernetes/kube-state-metrics 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kube-state-metrics has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 18, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zhengl7
To complete the pull request process, please assign lilic
You can assign the PR to them by writing /assign @lilic in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zhengl7
Author

zhengl7 commented Jan 18, 2020

Results of running `make test-benchmark-compare` locally:

### Result
old=release-1.8 new=syncmap_metrics_store
benchmark                                       old ns/op       new ns/op       delta
BenchmarkKubeStateMetrics/GenerateMetrics-8     1014742988      1005576671      -0.90%
BenchmarkKubeStateMetrics/MakeRequests-8        10689884448     21356121974     +99.78%
BenchmarkPodStore-8                             28217           31236           +10.70%
BenchmarkMetricWrite/value-1-8                  565             551             -2.48%
BenchmarkMetricWrite/value-35.7-8               704             687             -2.41%

benchmark                                    old MB/s     new MB/s     speedup
BenchmarkKubeStateMetrics/MakeRequests-8     925.12       970.94       1.05x

benchmark                                       old allocs     new allocs     delta
BenchmarkKubeStateMetrics/GenerateMetrics-8     824843         864269         +4.78%
BenchmarkKubeStateMetrics/MakeRequests-8        410539         455643         +10.99%
BenchmarkPodStore-8                             488            523            +7.17%
BenchmarkMetricWrite/value-1-8                  7              7              +0.00%
BenchmarkMetricWrite/value-35.7-8               7              7              +0.00%

benchmark                                       old bytes       new bytes       delta
BenchmarkKubeStateMetrics/GenerateMetrics-8     83766392        87583568        +4.56%
BenchmarkKubeStateMetrics/MakeRequests-8        26515301456     52930885600     +99.62%
BenchmarkPodStore-8                             24304           25984           +6.91%
BenchmarkMetricWrite/value-1-8                  536             536             +0.00%
BenchmarkMetricWrite/value-35.7-8               536             536             +0.00%
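
(For context, a contention micro-benchmark contrasting the two approaches, separate from the repository's benchmark suite and purely illustrative, would look roughly like the sketch below. Note that `sync.Map` trades lock contention for interface boxing and extra allocations, which is consistent with the higher allocation counts above.)

```go
package store

import (
	"strconv"
	"sync"
	"testing"
)

// Illustrative micro-benchmarks (not part of kube-state-metrics) contrasting
// an RWMutex-guarded map with sync.Map under a mixed read/write workload.

func BenchmarkRWMutexMap(b *testing.B) {
	var mu sync.RWMutex
	m := map[string]string{}
	b.RunParallel(func(pb *testing.PB) {
		for i := 0; pb.Next(); i++ {
			key := strconv.Itoa(i % 1024)
			if i%10 == 0 { // occasional write blocks all concurrent readers
				mu.Lock()
				m[key] = key
				mu.Unlock()
			} else {
				mu.RLock()
				_ = m[key]
				mu.RUnlock()
			}
		}
	})
}

func BenchmarkSyncMap(b *testing.B) {
	var m sync.Map
	b.RunParallel(func(pb *testing.PB) {
		for i := 0; pb.Next(); i++ {
			key := strconv.Itoa(i % 1024)
			if i%10 == 0 { // writes do not block concurrent readers
				m.Store(key, key)
			} else {
				m.Load(key)
			}
		}
	})
}
```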

The cache map for metrics cannot be accessed concurrently in the current version. Heavy metric writes to the map contend with reads and can cause the Prometheus scrape call to time out. Using sync.Map removes the mutex.
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 18, 2020
@zhengl7
Author

zhengl7 commented Jan 18, 2020

Force pushing after rebasing

@zhengl7
Author

zhengl7 commented Jan 18, 2020

/assign @lilic

@zhengl7
Author

zhengl7 commented Jan 18, 2020

OK, the tests are failing; I need to understand this better. Should I close the PR and come back later when it's ready?

@zhengl7 zhengl7 closed this Jan 18, 2020
@lilic
Member

lilic commented Jan 20, 2020

Hmm, those do look like very significant performance degradations. 🤔

Pokom added a commit to grafana/kube-state-metrics that referenced this pull request Jun 5, 2024
There are a few documented scenarios where `kube-state-metrics` will
lock up (kubernetes#995, kubernetes#1028). I believe a much simpler solution to ensure
`kube-state-metrics` doesn't lock up and require a restart to serve
`/metrics` requests is to add default read and write timeouts and to
allow them to be configurable. At Grafana, we've experienced a few
scenarios where `kube-state-metrics` running in larger clusters falls
behind and starts getting scraped multiple times. When this occurs,
`kube-state-metrics` becomes completely unresponsive and requires a
reboot. This is fairly easy to reproduce (I'll provide a script in
an issue) and causes other critical workloads (KEDA, VPA) to fail in
weird ways.

Adds two flags:
- `server-read-timeout`
- `server-write-timeout`

Updates the metrics http server to set the `ReadTimeout` and
`WriteTimeout` to the configured values.
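
A minimal sketch of that wiring with the standard library, assuming the flag names described above (the actual kube-state-metrics server setup differs):

```go
package main

import (
	"flag"
	"net/http"
	"time"
)

func main() {
	// Flag names mirror the ones described above; defaults are illustrative.
	readTimeout := flag.Duration("server-read-timeout", 60*time.Second,
		"maximum duration for reading an entire request, including the body")
	writeTimeout := flag.Duration("server-write-timeout", 60*time.Second,
		"maximum duration before timing out writes of the response")
	flag.Parse()

	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		// Stand-in for the real metrics handler.
		w.Write([]byte("# metrics would be written here\n"))
	})

	srv := &http.Server{
		Addr:         ":8080",
		Handler:      mux,
		ReadTimeout:  *readTimeout,
		WriteTimeout: *writeTimeout,
	}
	// With WriteTimeout set, a scrape that stalls past the deadline is cut
	// off instead of tying up the connection indefinitely.
	srv.ListenAndServe()
}
```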
rarruda added a commit to rarruda/kube-state-metrics that referenced this pull request Sep 25, 2024
In busy/large clusters, this prevents timeouts caused by long-lived
locks/concurrency issues: writing to the map takes too long, blocking the
metrics-reading thread, and because the lock is not released in a timely
manner, the request times out.

Inspired by the previous PR at kubernetes#1028