KoboldCPP-v1.56.yr0-ROCm

Windows build does not contain the Vulkan backend yet.

NEW: Added early support for new Vulkan GPU backend by @0cc4m. You can try it out with the command --usevulkan (gpu id) or via the GUI launcher. Now included with the Windows and Linux prebuilt binaries.
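For anyone launching from the Python script rather than the GUI, the command line can be sketched roughly as below. This is only an illustration: the helper name `vulkan_launch_args` and the model filename are hypothetical, and it assumes `koboldcpp.py` sits in the current directory and accepts `--model` alongside the `--usevulkan (gpu id)` flag described above.

```python
import subprocess
import sys


def vulkan_launch_args(model_path, gpu_id=0, script="koboldcpp.py"):
    """Build the argument list for launching with the Vulkan backend.

    Hypothetical helper: the script location and model path are assumptions.
    """
    return [sys.executable, script, "--model", model_path, "--usevulkan", str(gpu_id)]


if __name__ == "__main__":
    args = vulkan_launch_args("model.gguf", gpu_id=0)  # placeholder model file
    print("Launching:", " ".join(args))
    subprocess.run(args, check=True)
```

If the machine has more than one GPU, changing `gpu_id` is the only adjustment this sketch needs.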

Updated and merged the new GGML backend rework from upstream. This update includes extensive fixes, improvements and changes spanning more than a hundred commits. Support for earlier non-gguf models has been preserved via a fossilized earlier version of the library. Please open an issue if you encounter problems. The Wiki and Readme have been updated too.

Added support for setting dynatemp_exponent, which previously defaulted to 1.0. Supported over the API and in Lite.
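As a rough illustration of the API side, the sketch below builds a generation request that sets the exponent. It assumes a local server on port 5001 and the `/api/v1/generate` endpoint; the prompt, the sampler values, and the helper names `build_payload` and `generate` are placeholders rather than anything prescribed by this release. `dynatemp_range` is included on the assumption that a nonzero range is needed for dynamic temperature to have any effect.

```python
import json
import urllib.request

# Assumed local endpoint; adjust host and port to match your own launch settings.
API_URL = "http://localhost:5001/api/v1/generate"


def build_payload(prompt, temperature=0.7, dynatemp_range=0.3, dynatemp_exponent=1.0):
    """Return a generate-request body with the dynamic temperature fields set."""
    return {
        "prompt": prompt,
        "max_length": 80,
        "temperature": temperature,
        "dynatemp_range": dynatemp_range,
        "dynatemp_exponent": dynatemp_exponent,
    }


def generate(prompt, **sampler_settings):
    """POST the payload to the server and return the decoded JSON reply."""
    body = json.dumps(build_payload(prompt, **sampler_settings)).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the request body; call generate() with a server running.
    print(json.dumps(build_payload("Once upon a time", dynatemp_exponent=1.5), indent=2))
```

Leaving `dynatemp_exponent` at 1.0 matches the earlier fixed default, so existing clients should not need to change unless they want a different curve.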

Fixed issues with Linux CUDA on Pascal, and added more flags to handle conda and colab builds correctly.

Added support for old CPU fallbacks (NoAVX2 and Failsafe modes) in build targets in the Linux prebuilt binary (and koboldcpp.sh).

Added missing 48k context option, fixed clearing file selection, better abort handling support, fixed aarch64 termux builds, various other fixes.

Updated Kobold Lite with many improvements and new features:

NEW: Added XTTS API Server support (Local AI powered text-to-speech).

Added option to let AI impersonate you for a turn in a chat.

HD image generation options.

Added popup-on-complete browser notification options.

Improved DynaTemp wizard, added options to set the exponent.

Bugfixes, padding adjustments, A1111 parameter fixes, image color fixes for invert color mode.

To use on Windows, download and run koboldcpp_rocm.exe, which is a one-file pyinstaller, OR download koboldcpp_rocm_files.zip and run python koboldcpp.py (additional Python pip modules might need to be installed, like customtkinter and tk or python-tk).
To use on Linux, clone the repo and build with make LLAMA_HIPBLAS=1 -j4 (-j4 can be adjusted to your number of CPU threads for faster build times).

For a full Linux build, make sure you have the OpenBLAS and CLBlast packages installed:
For Arch Linux: Install cblas openblas and clblast.
For Debian: Install libclblast-dev and libopenblas-dev.
Then run make LLAMA_HIPBLAS=1 LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 -j4

If you're using NVIDIA, you can try koboldcpp.exe at LostRuin's upstream repo here
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller, also at LostRuin's repo.
