think about adding monitoring to the stats server #278

Open
alsuren opened this issue Sep 8, 2024 · 1 comment
Comments

Collaborator

alsuren commented Sep 8, 2024

I think we could use tracing + tracing-subscriber + tracing-appender for logging.

tracing is like the log crate, while tracing-appender provides a non-blocking writer implementation (to avoid blocking the web server), and tracing-subscriber provides the logging implementation.

tracing-subscriber can also print JSON to make the logs machine-readable.

Originally posted by @NobodyXu in #165 (comment)

It might be that we run with it as it is for a while and decide that it's fine as it is, but it feels like everyone is using the tracing crate these days, so I would be interested in seeing how it works for us.
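For context on why a non-blocking writer matters for a web server, here is a minimal std-only sketch of the idea. It does not use tracing-appender and makes no claim about how that crate is implemented. `NonBlockingLog` and its methods are hypothetical names made up for illustration. Request handlers only push a line onto a channel, and a background thread does the slow writes.

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;

/// Hypothetical handle for illustration only (not part of any crate).
/// Callers hand lines to a channel; a worker thread owns the real writer.
pub struct NonBlockingLog {
    sender: Option<mpsc::Sender<String>>,
    worker: Option<thread::JoinHandle<()>>,
}

impl NonBlockingLog {
    /// Spawn a worker thread that owns `sink` and performs every write.
    pub fn new<W: Write + Send + 'static>(mut sink: W) -> Self {
        let (sender, receiver) = mpsc::channel::<String>();
        let worker = thread::spawn(move || {
            // This loop ends once every sender has been dropped.
            for line in receiver {
                let _ = writeln!(sink, "{}", line);
            }
            let _ = sink.flush();
        });
        NonBlockingLog {
            sender: Some(sender),
            worker: Some(worker),
        }
    }

    /// Queue a line for writing. This never waits on the underlying sink.
    pub fn log(&self, line: &str) {
        if let Some(sender) = &self.sender {
            let _ = sender.send(line.to_string());
        }
    }
}

impl Drop for NonBlockingLog {
    fn drop(&mut self) {
        // Close the channel so the worker drains what is queued and exits,
        // then wait for it so no lines are lost on shutdown.
        drop(self.sender.take());
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

A handler would call something like `log.log("handled /record-install")` and return immediately. If we do adopt the crates above, the non-blocking writer from tracing-appender would presumably take the place of this hand-rolled version.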

Collaborator Author

alsuren commented Sep 22, 2024

Leaving a note here because it feels related, and probably doesn't need its own issue:

fly.io has pretty good monitoring out of the box (prometheus + grafana). I noticed that our response times were pretty bad (median never below 100ms, p99 up to 500ms at some timescales).
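For anyone unsure what "median" and "p99" mean here, this is a rough sketch of one common definition (nearest-rank percentiles) over a list of response times. The `percentile` function is a hypothetical helper, and this is not necessarily how the fly.io dashboards compute their numbers.

```rust
/// Nearest-rank percentile: the smallest sample such that at least
/// `p` percent of all samples are less than or equal to it.
/// Returns None for an empty slice or a `p` outside 0..=100.
pub fn percentile(samples_ms: &[u64], p: f64) -> Option<u64> {
    if samples_ms.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_unstable();
    // Rank is 1-based; clamp to 1 so that p = 0 returns the minimum.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

The median is `percentile(&samples, 50.0)` and p99 is `percentile(&samples, 99.0)`. A handful of slow requests barely moves the median but dominates p99, which is why the two numbers are quoted separately.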

I realised that the stats server is deployed in fly.io's lhr region (named after London Heathrow - they all seem to be named after the nearest airport or something?). This was fine with the old influxdb cloud instance, but the new one is in us-east-1, so requests from the stats server to the database now have to cross the Atlantic.

I ran the following commands:

fly scale count 1 --region=iad
fly scale count 0 --region=lhr

This was based on this forum post: https://community.fly.io/t/change-region-for-an-app/18888. iad is the closest region to us-east-1 - https://fly.io/docs/reference/regions/

Looking at the graphs again, this has brought things more under control (15ms to 200ms).

Screenshot 2024-09-22 at 16 24 54 -- https://fly-metrics.net/d/fly-app/fly-app?from=1727002324300&to=1727031124300&var-app=cargo-quickinstall-stats-server&orgId=179329&viewPanel=13 (shout if you want access)

I'll take another glance at it later today to make sure it stays that way.
