Setup monitoring of ECS nodes #104

Open
2 tasks
hellais opened this issue Sep 25, 2024 · 2 comments

Comments

@hellais
Member

hellais commented Sep 25, 2024

Currently we have no observability into the container hosts of the ECS cluster. Moreover, we can only scrape aggregate metrics from the services behind the load balancer, which means the metrics end up "flapping".

Ideally we would have a way of scraping metrics for the container hosts, and also for the per-service Docker containers.

In summary we would like to collect two classes of metrics:

  • Container host metrics (from the EC2 nodes that run Docker and that we deploy containers to), using node_exporter
  • Docker container application metrics, which are exposed using the instrumentator and which we would like to scrape independently on each container host
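
As a rough illustration of the two classes above, here is a minimal Python sketch that builds one Prometheus scrape job per class, targeting each container host directly instead of the load balancer. The job names, the host addresses, and the application metrics port (`8000`) are placeholders I made up for the example; `9100` is node_exporter's default port. The output is printed as JSON and would still need to be translated into the YAML layout of `prometheus.yml`.

```python
import json

NODE_EXPORTER_PORT = 9100  # node_exporter's default listen port
APP_METRICS_PORT = 8000    # assumption: replace with the port the app exposes metrics on


def build_scrape_configs(hosts, app_port=APP_METRICS_PORT):
    """Return two scrape jobs that target every container host directly."""
    return [
        {
            # Class 1: metrics about the EC2 container host itself.
            "job_name": "ecs-host-node-exporter",  # hypothetical job name
            "static_configs": [
                {"targets": [f"{host}:{NODE_EXPORTER_PORT}" for host in hosts]}
            ],
        },
        {
            # Class 2: application metrics, scraped once per container host
            # so that each target is a stable, separate time series.
            "job_name": "ecs-app-metrics",  # hypothetical job name
            "metrics_path": "/metrics",
            "static_configs": [
                {"targets": [f"{host}:{app_port}" for host in hosts]}
            ],
        },
    ]


if __name__ == "__main__":
    hosts = ["10.0.1.10", "10.0.2.11"]  # placeholder addresses
    print(json.dumps({"scrape_configs": build_scrape_configs(hosts)}, indent=2))
```

The static host list is only there to keep the sketch self-contained; in practice the hosts would have to be discovered dynamically, since ECS nodes come and go.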
@hellais hellais self-assigned this Sep 25, 2024
@DecFox DecFox self-assigned this Oct 7, 2024
@hellais hellais added the epic label Dec 9, 2024
@hellais hellais moved this to Backlog in Sprint Planning Jan 13, 2025
@hellais hellais assigned LDiazN and unassigned hellais and DecFox Jan 22, 2025
@LDiazN
Contributor

LDiazN commented Jan 23, 2025

I think this might be the way: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config

The problem I'm seeing right now is that the monitoring server is not in AWS, so we have some issues with the connection between that server and the EC2 instances:

  1. We have to set up IAM credentials for the server, as mentioned in the link above
  2. The Prometheus server needs a way to reach the EC2 instances directly (not through the load balancer), but they're probably not open to internet traffic (and I don't think they should be). What can we do about this?
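
To make the two points concrete, here is a minimal Python sketch of what an `ec2_sd_configs` job could look like. The field names (`region`, `port`, `filters`, `access_key`, `secret_key`) and the `__meta_ec2_instance_id` label come from the Prometheus docs linked above; the job name, region, tag key and tag value are assumptions made up for the example. Note that EC2 service discovery uses the instance's private IP as the scrape address by default, so this only works if the monitoring server has a network path into the VPC, which is exactly the open question in point 2.

```python
import json


def build_ec2_sd_job(job_name, region, port, tag_key, tag_value,
                     access_key=None, secret_key=None):
    """Build a scrape job that discovers EC2 instances by tag."""
    sd_config = {
        "region": region,
        "port": port,
        # EC2 API filter: only instances carrying this tag are discovered.
        "filters": [{"name": f"tag:{tag_key}", "values": [tag_value]}],
    }
    # Point 1: a server outside AWS has no instance role, so it needs
    # explicit IAM credentials (or another credential source).
    if access_key and secret_key:
        sd_config["access_key"] = access_key
        sd_config["secret_key"] = secret_key
    return {
        "job_name": job_name,
        "ec2_sd_configs": [sd_config],
        "relabel_configs": [
            # Keep the instance ID as a label so hosts stay distinguishable.
            {
                "source_labels": ["__meta_ec2_instance_id"],
                "target_label": "instance_id",
            }
        ],
    }


if __name__ == "__main__":
    job = build_ec2_sd_job(
        job_name="ecs-host-node-exporter",  # hypothetical job name
        region="eu-central-1",              # assumption: use the cluster's region
        port=9100,                          # node_exporter's default port
        tag_key="Cluster",                  # hypothetical tag on the ECS nodes
        tag_value="example-ecs-cluster",    # hypothetical tag value
    )
    print(json.dumps({"scrape_configs": [job]}, indent=2))
```

The printed JSON would still need to be translated into the YAML layout of `prometheus.yml`, and the credentials should come from a secret store rather than being hard-coded.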

@hellais hellais moved this from Backlog to Sprint Backlog in Sprint Planning Jan 23, 2025
Projects
Status: Sprint Backlog

3 participants