0l-monitoring

This repository provides guides for both monitoring providers and node operators.

Monitoring providers [MPs]: any party willing to provide monitoring services for 0L node operators by running monitoring tools such as the Prometheus stack.
0L node operators [OPs]: any party running any type of 0L node (validator/VFN or fullnode) who wants to monitor their nodes.

Prometheus Stack

Prometheus is an open-source application that scrapes real-time metrics to monitor events and can also send real-time alerts.

Grafana is an analytics and visualization tool for building interactive charts and graphs from the data and alerts collected by monitoring tools.

The 0L diem node exports a set of Prometheus metrics that we would like to collect and use to build Grafana dashboards. These are exported on ports 9101 and 9012. In addition to diem metrics, node operators can choose to expose system metrics such as CPU, memory, and storage using Prometheus Node Exporter.
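
For monitoring providers, the metrics endpoints described above could be wired into Prometheus roughly as follows. This is a minimal sketch, not this repository's actual configuration: the `render_scrape_config` helper and the job names `0l-node` and `node-exporter` are hypothetical, and it assumes the diem metrics are reachable on port 9101 and Node Exporter on its default port 9100. The helper prints a `scrape_configs` fragment for one host so it can be reviewed before being merged into `prometheus.yml`.

```shell
#!/usr/bin/env bash
# Sketch only: prints a Prometheus scrape_configs fragment for one 0L host.
# The function name and the job names are illustrative assumptions.

render_scrape_config() {
  host="$1"
  cat <<EOF
scrape_configs:
  - job_name: "0l-node"          # diem metrics (assumed port 9101)
    static_configs:
      - targets: ["${host}:9101"]
  - job_name: "node-exporter"    # system metrics (Node Exporter default port 9100)
    static_configs:
      - targets: ["${host}:9100"]
EOF
}
```

For example, `render_scrape_config 203.0.113.10 > scrape-0l.yml` writes the fragment for a placeholder host IP; substitute the address a node operator shared with you.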

Guides on how to set up Prometheus and Grafana instances can be found here:

Prometheus [MPs]
Grafana [MPs]

Node operators can follow the steps below to allow monitoring providers to collect metrics from their hosts.

Modifications to 0L hosts [OPs]

Pick your monitoring provider from the list below.
Open ports 9100-9101 to $PROMETHEUS_STATIC_IP (and probably to your own IP as well).

Depending on your host and firewall, you might need to enable this in different places: ufw, DigitalOcean Firewall, AWS Security Groups, etc.
Install Node Exporter. This assumes you are running Ubuntu:
```
sudo apt update
sudo apt install prometheus-node-exporter
```
or use the manual setup.
Confirm these endpoints are working:
- curl http://YOUR-IP:9100/metrics
- curl http://YOUR-IP:9101/metrics
Share your validator account address, host IP(s), and a Discord handle with the monitoring provider.
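
The firewall and verification steps above can be sketched as a small script. It is an illustration under assumptions, not a tested procedure from this repository: the function names are hypothetical, `ufw` is only one of the firewall options mentioned, and the rule is printed for review rather than applied. The endpoint check reuses the two `curl` URLs from the list.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Print (do not run) a ufw rule that lets the monitoring provider's
# Prometheus IP reach ports 9100-9101 over TCP.
print_ufw_rule() {
  prometheus_ip="$1"
  echo "sudo ufw allow proto tcp from ${prometheus_ip} to any port 9100:9101"
}

# Check that both metrics endpoints on a host answer successfully.
# Returns non-zero if any endpoint fails.
check_metrics() {
  host="$1"
  status=0
  for port in 9100 9101; do
    url="http://${host}:${port}/metrics"
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      status=1
    fi
  done
  return "$status"
}
```

After sourcing the script, `print_ufw_rule "$PROMETHEUS_STATIC_IP"` shows the rule to review and run yourself, and `check_metrics YOUR-IP` reports each endpoint. Cloud firewalls such as AWS Security Groups need the equivalent rule set in their own console or tooling.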

Grafana Dashboards

Example dashboards from Bᴺ 𝕊pace.

0L Move

http://grafana.openlibra.space:3000/d/0l-move/0l-move
0L Node

http://grafana.openlibra.space:3000/d/0l-node/0l-node
System Monitoring

http://grafana.openlibra.space:3000/d/rYdddlPWk/system-monitoring?orgId=1&refresh=1m

Monitoring Providers

1. Bᴺ 𝕊pace

Prometheus
Static IP: 85.215.101.127

Grafana
URL: https://grafana.openlibra.space
Auth: `viewer:viewer` (view only)

Discord: @nourspace#6652

Todo

- Add specific todos for Prometheus and Grafana setup guides
- Consider using K8s operators and/or Helm charts to run the Prometheus stack
- Use HTTPS and load balancers
- Link to and/or integrate other monitoring tools built by the 0L community
- Enable alerting on Grafana dashboards

Legacy

Some tasks and questions from the HackMD document that need to be integrated into the current todos.

https://hackmd.io/9dxv7ZwYS1yOmBVSjSV2wg

Questions (old)

Security: We want to create our own node-exporter config that sends only meaningful and safe system metrics.
Decentralization: We are running the two instances ourselves for now, but are thinking about how to move forward so that there is no single point of failure and no single entity hosting everyone's metrics.
