GPU operator 22.9.2 installation is failing #541
Comments
Adding a few more details.
/run/nvidia/driver# ls -lhrt
ls -la /usr/local/nvidia/toolkit
ls -la /run/nvidia
@likku123 can you attach logs from
driver.log We have three nodes and here are the logs for nodes daemonset logs
Looks like
Yes, that works. Thanks for your help.
1. Quick Debug Checklist
1. Issue or feature description
I am trying to install a specific version of the GPU operator (22.9.2) via the Helm chart using Ansible. Previously I did not specify a version number and installed the latest. To be on the safe side, I have now specified the exact version to deploy.
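For readers reproducing this setup, a minimal sketch of pinning the chart version with plain Helm follows. The issue itself drives Helm through Ansible and does not show the exact invocation, so the repo alias `nvidia`, the release name and namespace `gpu-operator`, and the version string `v22.9.2` are all assumptions here. The script only prints the command so it can be reviewed before running.

```shell
#!/usr/bin/env bash
# Sketch: pin the GPU operator chart to one version instead of tracking latest.
# Assumptions (not taken from the issue): repo alias "nvidia", release name and
# namespace "gpu-operator", chart version string "v22.9.2".
set -euo pipefail

CHART_VERSION="${CHART_VERSION:-v22.9.2}"
NAMESPACE="${NAMESPACE:-gpu-operator}"

# Print the helm command rather than executing it, so it can be checked first.
build_install_cmd() {
  printf 'helm upgrade --install gpu-operator nvidia/gpu-operator --namespace %s --create-namespace --version %s --wait\n' \
    "$NAMESPACE" "$CHART_VERSION"
}

build_install_cmd
```

Running the printed command assumes the chart repository has already been added under the alias `nvidia` and refreshed with `helm repo update`. Passing `--version` is what prevents Helm from silently picking up a newer chart on the next deployment.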
I collected logs by following the instructions below.
-->curl -o must-gather.sh -L https://raw.githubusercontent.com/NVIDIA/gpu-operator/master/hack/must-gather.sh
-->chmod +x must-gather.sh
-->./must-gather.sh
gpu_operand_pod_nvidia-container-toolkit-daemonset-tx9zk.zip
[gpu_operand_pod_gpu-feature-discovery-57ds2.log](https://github.com/NVIDIA/gpu-operator/files/11784844/gpu_operand_pod_gpu-feature-discovery-57ds2.log)
gpu_operand_pod_nvidia-operator-validator-5zwwb.zip
gpu_operand_pod_nvidia-dcgm-exporter-5fl25.log
Please let me know if any more logs are required from my side.