Kafka exporter: kafka_consumergroup_lag #437

Open
robertos1232 opened this issue Apr 19, 2024 · 9 comments

Comments

@robertos1232

Kafka exporter version: 1.6
Kafka cluster: 8 brokers
Topic with 8 partitions.

Question:

  • Why does Kafka exporter report negative values for kafka_consumergroup_lag?


@JasirVoriya

The same for me.

@sky-dadan

I have the same question.

@razorness

I also see high lag spikes: the summed lag of individual consumers toggles between 500 and 0 while the topic receives only ~20 messages per minute.

@henry-ahn0

I have the same problem. 🤔

@dramosOptiply

How is this not given a priority yet? This causes issues in production and breaks Kubernetes HPA because it does not handle negative metric values when calculating the replica count.
This completely breaks the scaling algorithm.
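To illustrate why a negative lag is a problem for autoscaling, here is a minimal sketch of the basic replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The function name and the numbers are hypothetical, and the real controller also applies tolerances and min/max replica bounds, so treat this as an approximation of the arithmetic only.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float) -> int:
    """Basic HPA ratio formula (hypothetical helper, ignores tolerance and replica bounds)."""
    return math.ceil(current_replicas * (current_value / target_value))


# Assumed example: 4 replicas, target lag of 100 per replica.
normal = desired_replicas(4, 200, 100)             # positive lag scales up
negative = desired_replicas(4, -50, 100)           # negative lag yields a negative count
clamped = desired_replicas(4, max(-50, 0), 100)    # clamping the metric at zero first

print(normal, negative, clamped)
```

A negative metric value produces a replica count that has no meaning, which matches the behaviour described above. Clamping the value at zero before it reaches the autoscaler (for example with PromQL's `clamp_min`) is one possible workaround, though it hides the underlying measurement problem rather than fixing it.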

@Sway23

Sway23 commented Oct 10, 2024

> How is this not given a priority yet? This causes issues in production and breaks Kubernetes HPA because it does not handle negative metric values when calculating the replica count. This completely breaks the scaling algorithm.

How is this an issue with the Kafka Exporter? This is a metric your Kafka cluster is coming up with, in turn read by the Kafka Exporter.

@isaitgirl

Same problem here
Kafka 2.6
kafka_exporter 1.8

@robertos1232
Author

@danielqsj Could you please take a look at this?

@dSohaliya

I think the issue is that we read and store the partition offsets before we calculate the consumer group lag. By the time the lag is calculated, both the consumer group offset and the partition offset may have advanced, but for the partition offset we still use the old, stored value. The committed offset can then be larger than the stored partition offset, which makes the lag negative.

Here is the PR: #466
Please let me know if you find any issues with it.
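The ordering problem described in this comment can be sketched with a few numbers. This is only an illustration with made-up offsets and a hypothetical `lag` helper; it is not the exporter's actual code, and it does not claim to show what the PR changes.

```python
def lag(log_end_offset: int, committed_offset: int) -> int:
    """Lag computed as partition (log end) offset minus committed consumer group offset."""
    return log_end_offset - committed_offset


# Hypothetical timeline for a single partition.
# t0: the partition offset is read and stored.
stored_log_end = 1000

# Between t0 and t1, producers append 20 messages and the consumer keeps consuming.
actual_log_end_at_t1 = 1020
committed_at_t1 = 1015  # t1: the consumer group offset is read

stale_lag = lag(stored_log_end, committed_at_t1)        # uses the old partition offset
fresh_lag = lag(actual_log_end_at_t1, committed_at_t1)  # partition offset read at t1
clamped_lag = max(stale_lag, 0)                         # floor at zero as a stopgap

print(stale_lag, fresh_lag, clamped_lag)
```

With the stale partition offset the result is negative even though the consumer is actually 5 messages behind. Reading the partition offset at (or after) the time the committed offset is read avoids the negative value in this scenario; flooring at zero only masks it.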
