A real-time lightweight NVIDIA GPU monitoring dashboard built with Docker for easy deployment and cross-platform compatibility.
Features
Prerequisites
Quick Start
Installation Prerequisites
Building gpu-monitor from Source
Configuration
Alternative Setup Method
Data Persistence
Alerts
Troubleshooting
License
- Real-time GPU metrics monitoring
- Interactive web dashboard
- Historical data tracking (15m, 30m, 1h, 6h, 12h, 24h)
- Temperature, utilization, memory, and power monitoring
- Docker-based for easy deployment
- History persists across container rebuilds
- Real-time alerts (sound and notification)
- Responsive theme for any size screen
- Toggle gauges on or off to show metrics in graph
- Docker
- NVIDIA GPU
- NVIDIA Container Toolkit
Check whether your system already meets the requirements:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
If this fails, proceed to Installation Prerequisites.
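Before running the container test above, it can help to confirm that the host-side tools are on your PATH at all. The following is a minimal sketch, not part of the project: `check_cmd` is a hypothetical helper, and finding `docker` and `nvidia-smi` on the host does not by itself prove that the NVIDIA Container Toolkit is configured.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "[+] $1: Found"
    else
        echo "[-] $1: Missing"
        return 1
    fi
}

# Check the host-side tools that the prerequisites depend on.
missing=0
for tool in docker nvidia-smi; do
    check_cmd "$tool" || missing=1
done

if [ "$missing" -eq 0 ]; then
    echo "Host tools present; run the container test to confirm GPU access."
else
    echo "Something is missing; see Installation Prerequisites."
fi
```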
docker run -d \
--name gpu-monitor \
-p 8081:8081 \
-e TZ=America/Los_Angeles \
-v /etc/localtime:/etc/localtime:ro \
-v ./history:/app/history \
-v ./logs:/app/logs \
--gpus all \
--restart unless-stopped \
bigsk1/gpu-monitor:latest
Note: Set TZ to your own timezone so that timestamps show the correct local time.
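If you are unsure which value to use for TZ, the host usually knows its own timezone name. This sketch assumes a Linux host with either systemd's `timedatectl` or a Debian-style `/etc/timezone` file; `detect_tz` is a hypothetical helper that falls back to UTC when neither is available.

```shell
#!/bin/sh
# Hypothetical helper: print the host's IANA timezone name, or UTC if unknown.
detect_tz() {
    tz=""
    # systemd hosts: ask timedatectl for the configured timezone.
    if command -v timedatectl >/dev/null 2>&1; then
        tz=$(timedatectl show -p Timezone --value 2>/dev/null)
    fi
    # Debian/Ubuntu hosts: fall back to /etc/timezone.
    if [ -z "$tz" ] && [ -r /etc/timezone ]; then
        tz=$(cat /etc/timezone)
    fi
    echo "${tz:-UTC}"
}

# Print the flag to use in place of -e TZ=America/Los_Angeles.
echo "-e TZ=$(detect_tz)"
```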
- Clone the repository:
git clone https://github.com/bigsk1/gpu-monitor.git
cd gpu-monitor
- Start the container:
docker-compose up -d
- Access the dashboard at: http://localhost:8081
Windows users: make sure you have WSL with Docker. An easy way to get both is Docker Desktop Installation for Windows.
Installing with apt: add the NVIDIA package repositories
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
The nvidia-ctk command modifies the /etc/docker/daemon.json file on the host. The file is updated so that Docker can use the NVIDIA Container Runtime.
sudo systemctl restart docker
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
For other distributions, check the official documentation.
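To confirm that `nvidia-ctk` updated `/etc/docker/daemon.json` as described above, you can look for an `nvidia` runtime entry in that file. This is a rough text match rather than a JSON parse, and `has_nvidia_runtime` is a hypothetical helper, so treat a positive result as a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given Docker daemon config
# mentions an "nvidia" runtime entry (plain text match, not a JSON parse).
has_nvidia_runtime() {
    [ -r "$1" ] && grep -q '"nvidia"' "$1"
}

config=/etc/docker/daemon.json
if has_nvidia_runtime "$config"; then
    echo "nvidia runtime entry found in $config"
else
    echo "no nvidia runtime entry in $config (or file not readable)"
fi
```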
- Clone the repository:
git clone https://github.com/bigsk1/gpu-monitor.git
cd gpu-monitor
- Build the image:
docker build -t gpu-monitor .
- Run the container:
docker run -d \
--name gpu-monitor \
-p 8081:8081 \
-e TZ=America/Los_Angeles \
-v /etc/localtime:/etc/localtime:ro \
-v ./history:/app/history \
-v ./logs:/app/logs \
--gpus all \
--restart unless-stopped \
gpu-monitor
The dashboard is accessible at http://localhost:8081 by default. To change the port, modify the docker-compose.yml file or the -p parameter in the docker run command.
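As an illustration of changing the port, only the host side of the `-p` mapping needs to change. The sketch below assumes the container keeps listening on 8081 (as the default `8081:8081` mapping suggests); `port_flag` is a hypothetical helper that validates a host port and prints the flag to pass to `docker run`.

```shell
#!/bin/sh
# Hypothetical helper: build the -p flag for a chosen host port.
# Assumption: the container side stays at 8081.
port_flag() {
    host_port=$1
    # Reject empty or non-numeric input.
    case "$host_port" in
        ''|*[!0-9]*) echo "invalid port: $host_port" >&2; return 1 ;;
    esac
    # Reject values outside the valid TCP port range.
    if [ "$host_port" -lt 1 ] || [ "$host_port" -gt 65535 ]; then
        echo "port out of range: $host_port" >&2
        return 1
    fi
    echo "-p ${host_port}:8081"
}

port_flag 9090
```

With this mapping the dashboard would be reached at http://localhost:9090 instead of http://localhost:8081.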
A setup script is provided for convenience. It checks prerequisites and manages the service:
- If you have issues, make sure setup.sh is executable
chmod +x ./setup.sh
- Check prerequisites and start the service
./setup.sh start
- Stop the service
./setup.sh stop
- Restart the service
./setup.sh restart
- Check service status
./setup.sh status
- View logs
./setup.sh logs
Example of script running
~/gpu-monitor ./setup.sh start
[+] Checking prerequisites...
[+] Docker: Found
[+] Docker Compose: Found
[+] NVIDIA Docker Runtime: Found
[+] NVIDIA GPU: Found
[+] Starting GPU Monitor...
Creating network "gpu-monitor_default" with the default driver
Creating gpu-monitor ... done
[+] GPU Monitor started successfully!
[+] Dashboard available at: http://localhost:8081
[+] To check logs: docker-compose logs -f
By default, history and logs are written to the mounted ./history and ./logs volumes, so they persist between container rebuilds. If you don't want that, remove the volume mounts from the docker run command or from docker-compose.yml:
services:
  gpu-monitor:
    # ... other settings ...
    volumes:
      - ./history:/app/history    # Remove to stop persisting historical data
      - ./logs:/app/logs          # Remove to stop persisting logs
You can enable or disable alerts in the UI and set thresholds for GPU temperature, GPU utilization (%), and power (watts). Settings are saved in your browser, so you only need to change them once. To make different defaults permanent, modify the code and rebuild the container.
The defaults are:
temperature: 80,
utilization: 100,
power: 300
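To get a feel for how the default thresholds relate to your own GPU, you can compare live `nvidia-smi` readings against them from the command line. This is a sketch and not the dashboard's actual alert logic: `check_thresholds` is a hypothetical helper, and treating "at or above the threshold" as an alert is an assumption.

```shell
#!/bin/sh
# Default thresholds taken from the list above.
TEMP_MAX=80
UTIL_MAX=100
POWER_MAX=300

# Hypothetical helper: print one line per metric at or above its threshold.
# Arguments: temperature (C), utilization (%), power draw (W).
# Assumption: "at or above" counts as an alert.
check_thresholds() {
    awk -v t="$1" -v u="$2" -v p="$3" \
        -v tmax="$TEMP_MAX" -v umax="$UTIL_MAX" -v pmax="$POWER_MAX" 'BEGIN {
        if (t + 0 >= tmax) print "ALERT temperature " t
        if (u + 0 >= umax) print "ALERT utilization " u
        if (p + 0 >= pmax) print "ALERT power " p
    }'
}

# Read live values for the first GPU when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=temperature.gpu,utilization.gpu,power.draw \
        --format=csv,noheader,nounits | head -n 1 |
    while IFS=', ' read -r temp util power; do
        check_thresholds "$temp" "$util" "$power"
    done
fi
```

No output means that none of the three readings reached its threshold.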
- NVIDIA SMI not found
  - Ensure NVIDIA drivers are installed
  - Verify NVIDIA Container Toolkit installation
  - Make sure you can run:
    sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
  If this fails, proceed to Installation Prerequisites.
- Container fails to start
  - Check Docker logs:
    docker logs gpu-monitor
  - Verify GPU access:
    nvidia-smi
  - Ensure proper permissions
- Dashboard not accessible
  - Verify container is running:
    docker ps
  - Check container logs:
    docker logs gpu-monitor
  - Ensure port 8081 is not in use
- Timestamps don't match your local time
  - Replace America/Los_Angeles with your timezone (see the List of tz database time zones)
- I don't like the alert sound
  - Replace the .mp3 in the sounds folder and name it alert.mp3
  - Getting double sounds from notifications? Disable Windows notifications, or disable them in the UI.
- Updated NVIDIA driver and UI metrics seem stuck
  - The connection to the NVIDIA runtime was interrupted; restart the container.
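For the "Dashboard not accessible" case, one way to check whether port 8081 is already taken is to look for it among the listening TCP sockets. This sketch assumes a Linux host with `ss` from iproute2; `port_listed` is a hypothetical helper that reads `ss -ltn` output on stdin. Run it while the gpu-monitor container is stopped, since a running container will itself hold the port.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given port appears as a local
# listening port in `ss -ltn` output read from stdin.
port_listed() {
    awk -v port="$1" '
        # Skip the header; column 4 is "Local Address:Port".
        NR > 1 { n = split($4, a, ":"); if (a[n] == port) found = 1 }
        END { exit !found }'
}

if command -v ss >/dev/null 2>&1; then
    if ss -ltn | port_listed 8081; then
        echo "Port 8081 is in use"
    else
        echo "Port 8081 is free"
    fi
fi
```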