
Beat hangs under elastic-agent when it receives a SIGINT with add_cloud_metadata processor #34248

Open
fearful-symmetry opened this issue Jan 12, 2023 · 2 comments
Labels
8.8-candidate, bug, Team:Cloud-Monitoring (Label for the Cloud Monitoring team)

Comments

@fearful-symmetry
Contributor

This is an odd one; I'm filing it before I have the full picture, since I'm balancing a few things.

Reproduced with the following setup:

  • Current build of metricbeat and elastic-agent
  • default elastic-agent.yml
  • running on Linux
  • run in standalone mode with ./elastic-agent
  • After a few seconds, send elastic-agent a SIGINT with ^C
  • Instead of shutting down in the usual 3-5 seconds, elastic-agent will continue running for 30 seconds until the agent hard-stops the beat.

Once metricbeat gets the SIGINT, instead of shutting down, it seems to end up in some kind of loop where it continually restarts the reloader with the cloud metadata processor:

{"log.level":"debug","@timestamp":"2023-01-12T13:23:27.516-0800","message":"Generated new processors: add_cloud_metadata={}, add_fields={\"@metadata\":{\"input_id\":\"unique-system-metrics-input\"}}, add_fields={\"data_stream\":{\"dataset\":\"generic\",\"namespace\":\"default\",\"type\":\"metrics\"}}, add_fields={\"event\":{\"dataset\":\"generic\"}}, add_fields={\"elastic_agent\":{\"id\":\"5291b614-c8ba-4c73-8e81-4cc09cfdcc44\",\"snapshot\":false,\"version\":\"8.7.0\"}}, add_fields={\"agent\":{\"id\":\"5291b614-c8ba-4c73-8e81-4cc09cfdcc44\"}}","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log.logger":"processors","log.origin":{"file.line":121,"file.name":"processors/processor.go"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2023-01-12T13:23:30.519-0800","message":"read token request for getting IMDSv2 token returns empty: Put \"http://169.254.169.254/latest/api/token\": context deadline exceeded (Client.Timeout exceeded while awaiting headers). No token in the metadata request will be used.","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log.logger":"add_cloud_metadata","log.origin":{"file.line":81,"file.name":"add_cloud_metadata/provider_aws_ec2.go"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2023-01-12T13:23:30.519-0800","message":"add_cloud_metadata: starting to fetch metadata, timeout=3s","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"service.name":"metricbeat","ecs.version":"1.6.0","log.logger":"add_cloud_metadata","log.origin":{"file.line":130,"file.name":"add_cloud_metadata/providers.go"},"ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2023-01-12T13:23:33.522-0800","message":"add_cloud_metadata: received disposition for gcp after 3.002658035s. result=[provider:gcp, error=failed requesting gcp metadata: Get \"http://169.254.169.254/computeMetadata/v1/?recursive=true&alt=json\": dial tcp 169.254.169.254:80: i/o timeout, metadata={}]","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log.logger":"add_cloud_metadata","log.origin":{"file.line":167,"file.name":"add_cloud_metadata/providers.go"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2023-01-12T13:23:33.522-0800","message":"add_cloud_metadata: timed-out waiting for all responses","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"service.name":"metricbeat","ecs.version":"1.6.0","log.logger":"add_cloud_metadata","log.origin":{"file.line":174,"file.name":"add_cloud_metadata/providers.go"},"ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2023-01-12T13:23:33.522-0800","message":"add_cloud_metadata: fetchMetadata ran for 3.002800745s","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log.logger":"add_cloud_metadata","log.origin":{"file.line":133,"file.name":"add_cloud_metadata/providers.go"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2023-01-12T13:23:33.522-0800","message":"add_cloud_metadata: hosting provider type not detected.","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"service.name":"metricbeat","ecs.version":"1.6.0","log.logger":"add_cloud_metadata","log.origin":{"file.line":102,"file.name":"add_cloud_metadata/add_cloud_metadata.go"},"ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2023-01-12T13:23:33.522-0800","message":"Generated new processors: add_cloud_metadata={}, add_fields={\"@metadata\":{\"input_id\":\"unique-system-metrics-input\"}}, add_fields={\"data_stream\":{\"dataset\":\"generic\",\"namespace\":\"default\",\"type\":\"metrics\"}}, add_fields={\"event\":{\"dataset\":\"generic\"}}, add_fields={\"elastic_agent\":{\"id\":\"5291b614-c8ba-4c73-8e81-4cc09cfdcc44\",\"snapshot\":false,\"version\":\"8.7.0\"}}, add_fields={\"agent\":{\"id\":\"5291b614-c8ba-4c73-8e81-4cc09cfdcc44\"}}","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"system/metrics-default","type":"system/metrics"},"log.logger":"processors","log.origin":{"file.line":121,"file.name":"processors/processor.go"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}

[the above log lines repeat 3-4 times until the agent hard-stops the beat]
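To make the suspected behavior concrete, here's a minimal, self-contained sketch of the shape these logs suggest (assumed structure for illustration, not the actual Beats reloader code): every config update from the agent rebuilds the processor chain, constructing add_cloud_metadata blocks for its metadata-fetch timeout (3s in the logs above), and the shutdown request is only looked at once there are no more updates to apply.

```go
// Hypothetical sketch of the suspected hang, NOT the actual Beats code.
// Each "config update" rebuilds the processors; building add_cloud_metadata
// blocks for the metadata-fetch timeout (simulated with a 3s sleep), and the
// shutdown signal is only checked when no update is pending.
package main

import (
	"fmt"
	"time"
)

// buildProcessors stands in for the "Generated new processors:
// add_cloud_metadata=..." lines above; the sleep stands in for the ~3s
// metadata fetch that times out when no cloud metadata endpoint is reachable.
func buildProcessors() {
	fmt.Println("Generated new processors: add_cloud_metadata={} ...")
	time.Sleep(3 * time.Second)
	fmt.Println("add_cloud_metadata: hosting provider type not detected.")
}

func main() {
	updates := make(chan struct{})
	shutdown := make(chan struct{})

	// Mimic the agent (re)applying the unit config; in the real setup the
	// agent gives up and hard-stops the beat after ~30s.
	go func() {
		for i := 0; i < 10; i++ {
			updates <- struct{}{}
		}
	}()

	// Mimic ^C a few seconds in.
	go func() {
		time.Sleep(5 * time.Second)
		close(shutdown)
	}()

	for {
		// Pending updates are drained before the shutdown signal is checked,
		// so each one adds another full add_cloud_metadata fetch cycle.
		select {
		case <-updates:
			buildProcessors()
			continue
		default:
		}
		select {
		case <-shutdown:
			fmt.Println("stopping")
			return
		case <-updates:
			buildProcessors()
		}
	}
}
```

The repeated "Generated new processors" lines could just as well be the reloader re-registering the same config rather than a backlog of distinct updates; either way, each pass through the loop eats a full add_cloud_metadata fetch cycle before the SIGINT is acted on.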

Commenting out the processors here fixes the issue. This is also not reproducible with standalone metricbeat; the weird init-loop-after-SIGINT only seems to happen while we're running under agent.
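For comparison, the standalone case that does not reproduce looks roughly like this (a minimal metricbeat.yml fragment I'm assuming for illustration, not pulled from the affected host); with the same processor configured directly in the beat, ^C shuts it down in the usual few seconds:

```yaml
# metricbeat.yml, standalone (no agent) - stops promptly on SIGINT
processors:
  - add_cloud_metadata: ~
```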

@fearful-symmetry added the bug and Team:Elastic-Agent (Label for the Agent team) labels Jan 12, 2023
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@cmacknz added the Team:Cloud-Monitoring (Label for the Cloud Monitoring team) and 8.8-candidate labels Jan 13, 2023
@botelastic

botelastic bot commented Jan 13, 2024

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1.
Thank you for your contribution!

@botelastic (bot) added the Stalled label Jan 13, 2024
@jlind23 removed the Team:Elastic-Agent (Label for the Agent team) label Mar 14, 2024
@botelastic (bot) removed the Stalled label Mar 14, 2024