Support application performance management (APM) #153

Open
pbolduc opened this issue Apr 7, 2023 · 1 comment

Contributor

pbolduc commented Apr 7, 2023

Is your feature request related to a problem? Please describe.

When running COMS in OpenShift, there is no easy way to monitor the health of a COMS deployment.

  • a readiness / health check endpoint
  • metrics exposed in Prometheus format so they can be scraped by Sysdig or a similar metrics system
  • log output that can be configured to go to other destinations, such as Splunk
  • less verbose logs: every completed request is currently logged, which produces too much noise and data
  • much of the verbose log data could instead be exposed as metrics (for example, a histogram of operation elapsed time, tagged with operation name, HTTP status code, and HTTP verb)
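To make the metrics request concrete, here is a minimal sketch of a request-duration histogram rendered in the Prometheus text exposition format. It is not COMS code: the class name, bucket bounds, and label names are assumptions for illustration, and a real deployment would more likely use an existing Prometheus client library than a hand-rolled class.

```typescript
// Hypothetical names throughout; label values are assumed not to need escaping.
type Labels = { operation: string; verb: string; status: string };

interface Series {
  labels: Labels;
  counts: number[]; // cumulative count per bucket bound
  sum: number;      // total observed seconds
  count: number;    // total observations
}

class RequestHistogram {
  private readonly name: string;
  private readonly bounds: number[];
  private readonly series: { [key: string]: Series } = {};

  // Bucket upper bounds are in seconds; the defaults are assumed values.
  constructor(name: string, bounds: number[] = [0.05, 0.25, 1, 5]) {
    this.name = name;
    this.bounds = bounds.slice().sort(function (a, b) { return a - b; });
  }

  observe(labels: Labels, seconds: number): void {
    const key = labels.operation + "|" + labels.verb + "|" + labels.status;
    const existing = this.series[key];
    const s: Series = existing ? existing : {
      labels: labels,
      counts: this.bounds.map(function () { return 0; }),
      sum: 0,
      count: 0,
    };
    if (!existing) this.series[key] = s;
    // Buckets are cumulative: a value counts toward every bound it fits under.
    this.bounds.forEach(function (le, i) {
      if (seconds <= le) s.counts[i] += 1;
    });
    s.sum += seconds;
    s.count += 1;
  }

  // Produces the text a scraper would read from a metrics endpoint.
  render(): string {
    const name = this.name;
    const bounds = this.bounds;
    const series = this.series;
    const lines: string[] = ["# TYPE " + name + " histogram"];
    Object.keys(series).forEach(function (key) {
      const s = series[key];
      const base = 'operation="' + s.labels.operation + '",verb="' +
        s.labels.verb + '",status="' + s.labels.status + '"';
      bounds.forEach(function (le, i) {
        lines.push(name + "_bucket{" + base + ',le="' + le + '"} ' + s.counts[i]);
      });
      lines.push(name + "_bucket{" + base + ',le="+Inf"} ' + s.count);
      lines.push(name + "_sum{" + base + "} " + s.sum);
      lines.push(name + "_count{" + base + "} " + s.count);
    });
    return lines.join("\n") + "\n";
  }
}
```

A request handler would call `observe` once per completed request and serve the output of `render()` on a scrape path. Keeping the label set small (operation, verb, status) limits the number of time series.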

See

@pbolduc pbolduc changed the title Provide instrumentation and diagnostics Support application performance management (APM) Apr 7, 2023
@TimCsaky
Contributor

Update:
The hosted COMS service uses a GitHub Actions based pipeline with OpenShift deployment templates (managed by Helm). These are included in the COMS repo (see the .github and charts directories).
The app containers do have liveness/readiness checks configured, which call the root path. We use the BC Gov Sysdig service to log and alert when those checks fail. I documented our Sysdig set-up for our hosted APIs. But perhaps we do need a dedicated 'health' endpoint. I will raise that with the devs.
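As a rough illustration of what a dedicated health endpoint could return, the sketch below turns a set of dependency check results into an HTTP status and JSON body. The function name, the response shape, and the dependency names in the usage note are hypothetical and not taken from COMS; how each check is actually performed is left out.

```typescript
// Hypothetical shape: each entry reports whether one dependency check passed.
type CheckResults = { [dependency: string]: boolean };

interface HealthResponse {
  statusCode: number;
  body: string;
}

function buildHealthResponse(results: CheckResults): HealthResponse {
  // Collect the names of the checks that failed.
  const failed = Object.keys(results).filter(function (name) {
    return !results[name];
  });
  const healthy = failed.length === 0;
  return {
    // Kubernetes/OpenShift HTTP probes treat 200-399 as success, so 503 marks the pod as not ready.
    statusCode: healthy ? 200 : 503,
    body: JSON.stringify({ status: healthy ? "ok" : "unavailable", failed: failed }),
  };
}
```

For example, `buildHealthResponse({ database: true, objectStorage: false })` yields a 503 whose body names `objectStorage` as the failing check, which gives an alert more context than a failing root-path probe.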

We're using the Express/Winston logging middleware, which allows for different logging output levels. When we get time, we would like to include a fluent-bit container that sends application, access, and error logs to different outputs.
For now we only monitor HTTP errors using Sysdig.
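One way to cut the per-request log noise raised in the issue is a filter that decides which completed requests are worth an access-log line. The sketch below is a framework-agnostic predicate; the function name, option names, and the slow-request threshold are assumptions, and how it would be wired into the logging middleware is not shown.

```typescript
// Hypothetical options; neither field comes from the COMS configuration.
interface RequestLogOptions {
  verbose: boolean;         // log every request, e.g. while debugging
  slowThresholdMs: number;  // requests slower than this are always logged
}

function shouldLogRequest(
  statusCode: number,
  elapsedMs: number,
  options: RequestLogOptions
): boolean {
  if (options.verbose) return true;                       // everything, when explicitly requested
  if (statusCode >= 400) return true;                     // always keep client and server errors
  if (elapsedMs > options.slowThresholdMs) return true;   // keep unusually slow requests
  return false;                                           // drop routine successful requests
}
```

With `{ verbose: false, slowThresholdMs: 1000 }`, a 200 response served in 20 ms is skipped, while a 500 response or a 3-second request is still logged.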
