[Issue]: Stable diffusion, Pytorch conv2d breaks in rocm 6.0 #3418
Comments
Same issue.
Yeah, I am able to reproduce this error using the installation steps from https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs#user-content-install-on-amd-and-arch-linux. Can you guys try reinstalling
@AphidGit @Kamishirasawa-keine Please try @alexxu-amd's suggestion above. Thanks!
That suggestion breaks things even more.
Unfortunately, I'm still getting the same error with ROCm 5.7 / ROCm 6.0 and Python 3.10 when attempting to use stable-diffusion-webui. Did anyone ever find a solution to this? I'm on a 7900 XTX.
Okay, well, restarting fixed the issue for me. If you haven't tried that, definitely do, even if the host system libraries haven't changed. It appears that sometimes a previous GPU compute operation doesn't end or close properly, and it generates seemingly random errors until the system is fully reset.
I can't reproduce this on a fresh install of Arch with ROCm 6.0.2 (from pacman) on a 7900XTX, but I did have to modify the installation process slightly as the instructions in https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs#user-content-install-on-amd-and-arch-linux weren't working out of the box. The changes I made:
I am also using the command-line args
@AphidGit @Kamishirasawa-keine Are you still experiencing this issue? If so, can you try switching your torch packages for these?
Problem Description
Running stable diffusion webui worked with ROCm 5.7. Version 6.0, updated Feb 15, breaks it. While I had the occasional hiccup, lockup, or reboot with v5.7, it was fairly stable and could produce images. Version 6.0 consistently crashes when trying to load any non-trivial data onto the GPU.
It reports the following stack traces to me. Somewhere in between, I can see a RuntimeError, probably from a different thread. When loading multiple models (such as when using low-rank adaptations), I get a RuntimeError for each one.
Digging further, I found that setting the environment variable AMD_LOG_LEVEL to anything higher than zero (for example, env AMD_LOG_LEVEL=1) gave me another clue.
I edited the webui code to add a small 'press any key' prompt, attached gdb, and made it break at that line. Here's a full backtrace. Involved are the following things:
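For anyone who wants to repeat the logging step without editing shell scripts, here is a minimal sketch that launches a command with AMD_LOG_LEVEL set in its environment. The helper names (`rocm_logging_env`, `run_with_rocm_logging`) are made up for illustration and are not part of webui or ROCm; the only thing taken from the report is the variable name and the fact that any value above zero was enough.

```python
import os
import subprocess
import sys


def rocm_logging_env(level=1, base_env=None):
    """Return a copy of the environment with AMD_LOG_LEVEL set (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    # The report says any value above zero produced extra output.
    env["AMD_LOG_LEVEL"] = str(level)
    return env


def run_with_rocm_logging(command, level=1):
    """Run `command` (a list of arguments) with AMD_LOG_LEVEL set; return its exit code."""
    completed = subprocess.run(command, env=rocm_logging_env(level))
    return completed.returncode


if __name__ == "__main__":
    # Assumed usage: python run_logged.py python launch.py
    # With no arguments it just runs the interpreter's --version as a harmless default.
    sys.exit(run_with_rocm_logging(sys.argv[1:] or [sys.executable, "--version"]))
```

This should behave like prefixing the command with env AMD_LOG_LEVEL=1; the extra runtime messages go wherever the child process already writes its output.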
Operating System
Arch Linux, kernel 6.7.4-arch1-1
CPU
AMD Threadripper 1950X
GPU
AMD Radeon RX 7900 XTX
ROCm Version
ROCm 6.0.0
ROCm Component
No response
Steps to Reproduce
To reproduce:
1* Create a venv and enter it.
2* Install stable diffusion webui, following https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs#user-content-install-on-amd-and-arch-linux
3* Download any SD model and place it in the models folder.
4* Run either ./webui.sh or python launch.py
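The steps above go through the whole webui. Since the title points at conv2d, a smaller check may help narrow things down. The following is only a sketch under assumptions: the function name `conv2d_smoke_test` and the tensor shapes are arbitrary choices of mine, not taken from webui or from the traces in this issue, and it may well pass on a setup where webui still fails. It relies on ROCm builds of PyTorch exposing the GPU through the `torch.cuda` API.

```python
def conv2d_smoke_test(device="cuda"):
    """Run one small conv2d on the GPU and return a short status string."""
    try:
        import torch
        import torch.nn.functional as F
    except ImportError:
        return "torch-missing"

    # ROCm builds of PyTorch report the GPU through the torch.cuda namespace.
    if not torch.cuda.is_available():
        return "no-gpu"

    try:
        # Arbitrary small shapes: batch 1, 4 input channels, 64x64 pixels.
        x = torch.randn(1, 4, 64, 64, device=device)
        w = torch.randn(8, 4, 3, 3, device=device)
        y = F.conv2d(x, w, padding=1)
        # Force the kernel to finish so asynchronous errors surface here.
        torch.cuda.synchronize()
    except RuntimeError as exc:
        print(f"conv2d failed: {exc}")
        return "failed"

    # A 3x3 kernel with padding=1 keeps the 64x64 spatial size.
    return "ok" if tuple(y.shape) == (1, 8, 64, 64) else "failed"


if __name__ == "__main__":
    print(conv2d_smoke_test())
```

Running it inside the same venv as webui, optionally with AMD_LOG_LEVEL set, keeps the torch packages identical to the failing setup.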
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
Additional Information
No response