LocalAI version:
6bb5622
Environment, CPU architecture, OS, and Version:
Darwin c13.localdomain 22.5.0 Darwin Kernel Version 22.5.0: Mon Apr 24 20:52:43 PDT 2023; root:xnu-8796.121.2~5/RELEASE_ARM64_T8112 arm64
M2 Macbook Air
Describe the bug
Using the Metal backend, the model fails to load properly. It appears that llama.cpp always treats the model as GGML_TYPE_F32.
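One way to sanity-check the F32 claim is to read what quantization the model file itself declares. The sketch below assumes the legacy pre-GGUF llama.cpp container ("ggjt" magic) that Q4_0 models used around this time; the header layout and ftype codes are my assumptions from that era's llama.cpp source, so verify them against your checkout.

```python
import struct
import sys

# Assumed header layout for the legacy llama.cpp "ggjt" format:
# uint32 magic, uint32 version, then seven int32 hparams, the last of
# which is ftype (0 = all F32, 2 = mostly Q4_0, ...). Verify against
# the llama.cpp version you built.
GGJT_MAGIC = 0x67676A74

FTYPE_NAMES = {
    0: "ALL_F32",
    1: "MOSTLY_F16",
    2: "MOSTLY_Q4_0",
    3: "MOSTLY_Q4_1",
}

def read_ftype(header: bytes) -> str:
    """Return the declared ftype name from the first 36 bytes of a model file."""
    magic, _version = struct.unpack_from("<II", header, 0)
    if magic != GGJT_MAGIC:
        raise ValueError(f"not a ggjt file (magic=0x{magic:08x})")
    # hparams: n_vocab, n_embd, n_mult, n_head, n_layer, n_rot, ftype
    hparams = struct.unpack_from("<7i", header, 8)
    return FTYPE_NAMES.get(hparams[6], f"unknown({hparams[6]})")

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        print(read_ftype(f.read(36)))
```

If this prints MOSTLY_Q4_0 for the file while llama.cpp still reports F32, the problem is in how the loader (or the Metal path) interprets the header rather than in the file itself.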
To Reproduce
make BUILD_TYPE=metal build
cp go-llama/llama.cpp/ggml-metal.metal .
Run local-ai with a model config that uses a Q4_0 model and sets gpu_layers: 1 in the model config, then send a curl request that uses that model.
Expected behavior
I expect the model to run using Metal and return output to the user.
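For reference, the model config from the steps above might look like the sketch below. Only gpu_layers: 1 comes from the report; the model name, backend value, and file names are illustrative assumptions, not the reporter's actual config.

```yaml
# ~/model/localai.yaml (illustrative; only gpu_layers is from the report)
name: my-q4-model            # hypothetical name used in API requests
backend: llama               # assumed backend for a llama.cpp GGML model
parameters:
  model: model-q4_0.bin      # hypothetical Q4_0 file under --models-path
gpu_layers: 1                # offload one layer to Metal, as in the report
```

A request can then target it by name via the OpenAI-compatible API, e.g. `curl http://localhost:8080/v1/completions -H "Content-Type: application/json" -d '{"model": "my-q4-model", "prompt": "Hello"}'`.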
Logs
./local-ai --config-file ~/model/localai.yaml --models-path ~/model --debug
Additional context
I also filed this issue for go-llama.cpp: go-skynet/go-llama.cpp#91
It may be related, although once I got the go-llama.cpp example to compile with Metal support, it worked fine.