Docker Compose file to set up NVIDIA GPU monitoring on a single server using DCGM-Exporter, Prometheus, and Grafana.
- NVIDIA Container Toolkit
- Docker Compose. Please make sure your docker-compose supports Compose file format version 3.x.
Run the following command to launch containers:
docker-compose up -d
# or `docker compose up -d` if docker-compose is installed as a Docker CLI plugin
Then open http://localhost:3000 to access the Grafana dashboard (default username: admin, password: admin).
Run the following command to stop and remove containers:
docker-compose down
# or `docker compose down` if docker-compose is installed as a Docker CLI plugin
This command does not delete Docker volumes, so the data persists on the server until you delete the volumes manually.
The Compose file reads several environment variables so that you can customize its values:
Environment Variable | Explanation | Default Value |
---|---|---|
DCGM_EXPORTER_IMAGE_TAG | Docker tag for dcgm-exporter image | 2.4.6-2.6.10-ubuntu20.04 |
PROMETHEUS_IMAGE_TAG | Docker tag for prometheus image | v2.36.1 |
GRAFANA_IMAGE_TAG | Docker tag for grafana image | 8.5.6 |
DCGM_EXPORTER_HOST_PORT | Host port for dcgm-exporter container | 9400 |
PROMETHEUS_HOST_PORT | Host port for prometheus container | 9090 |
GRAFANA_HOST_PORT | Host port for grafana container | 3000 |
GRAFANA_ADMIN_USER | Admin username of grafana | admin |
GRAFANA_ADMIN_PASSWORD | Admin password of grafana | admin |
PROMETHEUS_STORAGE_TSDB_RETENTION_TIME | storage.tsdb.retention.time config of prometheus | 30d |
The default values of these environment variables are stored in the `.env` file, which you can edit to change the configuration. Alternatively, you can supply your own environment file by passing the `--env-file` option to the `docker-compose` command.
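
As a rough sketch, an environment file that carries the defaults from the table above could look like the following. It assumes the plain `KEY=value` format; the `.env` file shipped with the Compose file may be laid out differently, so treat this as an illustration rather than its exact contents.

```shell
# Sketch of an environment file built from the default values in the table.
# Assumption: one KEY=value pair per line, no quoting needed.
DCGM_EXPORTER_IMAGE_TAG=2.4.6-2.6.10-ubuntu20.04
PROMETHEUS_IMAGE_TAG=v2.36.1
GRAFANA_IMAGE_TAG=8.5.6
DCGM_EXPORTER_HOST_PORT=9400
PROMETHEUS_HOST_PORT=9090
GRAFANA_HOST_PORT=3000
GRAFANA_ADMIN_USER=admin
GRAFANA_ADMIN_PASSWORD=admin
PROMETHEUS_STORAGE_TSDB_RETENTION_TIME=30d
```

A copy of this file with adjusted values can be passed through the `--env-file` option, leaving the original `.env` untouched.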
Shell environment variables take priority over the environment file, so you can also set them to override these values, e.g. `GRAFANA_HOST_PORT=13000 docker-compose up -d`.
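
The precedence rule can be illustrated in plain shell. The snippet below is only a sketch that mimics the lookup order (shell value first, environment file second); it is not how Docker Compose is implemented. The `resolve` helper and the `demo.env` file name are hypothetical and exist only for this example.

```shell
# Hypothetical environment file holding one default value.
printf 'GRAFANA_HOST_PORT=3000\n' > demo.env

# resolve NAME FILE
# Print the shell value of NAME if it is set and non-empty,
# otherwise print the KEY=value entry for NAME found in FILE.
resolve() {
  name=$1
  file=$2
  # Indirect lookup of the shell variable whose name is stored in $name
  eval "shell_value=\${$name-}"
  if [ -n "$shell_value" ]; then
    printf '%s\n' "$shell_value"
  else
    # Fall back to the value recorded in the environment file
    sed -n "s/^$name=//p" "$file"
  fi
}

unset GRAFANA_HOST_PORT
resolve GRAFANA_HOST_PORT demo.env   # prints 3000 (value from the file)

GRAFANA_HOST_PORT=13000
resolve GRAFANA_HOST_PORT demo.env   # prints 13000 (shell value wins)
```

To check what Compose itself resolves, `docker-compose config` prints the Compose file with all variables substituted, so running it with and without the shell variable set should show the port change.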