Releases: withcatai/node-llama-cpp
v3.0.0-beta.20
3.0.0-beta.20 (2024-05-19)
Bug Fixes
Features
- `init` command to scaffold a new project from a template (with `node-typescript` and `electron-typescript-react` templates) (#217) (d6a0f43)
- debug mode (#217) (d6a0f43)
- load LoRA adapters (#217) (d6a0f43)
- improve Electron support (#217) (d6a0f43)
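As a quick illustration of the API-level additions, here is a minimal sketch; the `debug` option on `getLlama` and the `lora` option on `createContext` are assumptions inferred from the feature list (not copied from the docs), and the file paths are placeholders:

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

// Assumption: debug mode is enabled via an option on getLlama()
const llama = await getLlama({debug: true});

const model = await llama.loadModel({
    modelPath: "path/to/model.gguf" // placeholder path
});

// Assumption: LoRA adapters are attached when creating a context
const context = await model.createContext({
    lora: {
        adapters: [{filePath: "path/to/adapter.gguf"}] // placeholder path
    }
});

const session = new LlamaChatSession({contextSequence: context.getSequence()});
console.log(await session.prompt("Hi there"));
```

The new `init` command scaffolds a whole project like this from one of the bundled templates, so you don't have to wire it up by hand.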
Shipped with llama.cpp release b2928
To use the latest llama.cpp release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.19
3.0.0-beta.19 (2024-05-12)
Bug Fixes
Features
Shipped with llama.cpp release b2861
To use the latest llama.cpp release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.18
3.0.0-beta.18 (2024-05-09)
Bug Fixes
- more efficient max context size finding algorithm (#214) (453c162)
- make embedding-only models work correctly (#214) (453c162)
- perform context shift on the correct token index on generation (#214) (453c162)
- make context loading work for all models on Electron (#214) (453c162)
Features
- split gguf files support (#214) (453c162)
- `pull` command (#214) (453c162)
- `stopOnAbortSignal` and `customStopTriggers` on `LlamaChat` and `LlamaChatSession` (#214) (453c162)
- `checkTensors` parameter on `loadModel` (#214) (453c162)
- improve Electron support (#214) (453c162)
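A minimal sketch of the new generation-control options listed above, assuming `checkTensors` is passed to `loadModel` and that `stopOnAbortSignal` and `customStopTriggers` are accepted as `prompt()` options on `LlamaChatSession` (the model path is a placeholder):

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: "path/to/model.gguf", // placeholder path
    checkTensors: true // validate the model's tensor data while loading
});

const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

const abortController = new AbortController();

// Assumption: both options are accepted at the prompt level
const response = await session.prompt("Write a haiku about llamas", {
    signal: abortController.signal,
    stopOnAbortSignal: true,            // on abort, return what was generated so far instead of throwing
    customStopTriggers: ["\n\n", "END"] // stop generation when one of these sequences is produced
});
console.log(response);
```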
Shipped with llama.cpp release b2834
To use the latest llama.cpp release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v2.8.10
v3.0.0-beta.17
3.0.0-beta.17 (2024-04-24)
Bug Fixes
- `FunctionaryChatWrapper` bugs (#205) (ef501f9)
- function calling syntax bugs (#205) (ef501f9)
- show GPU layers in the `Model` line in CLI commands (#205) (ef501f9)
- refactor: rename `LlamaChatWrapper` to `Llama2ChatWrapper`
Features
- Llama 3 support (#205) (ef501f9)
- `--gpu` flag in generation CLI commands (#205) (ef501f9)
- `specialTokens` parameter on `model.detokenize` (#205) (ef501f9)
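A short sketch of the new `specialTokens` parameter, assuming it is a boolean second argument to `model.detokenize` and that the model's BOS token is exposed via `model.tokens.bos` (both assumptions; the model path is a placeholder):

```typescript
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path

// Assumption: model.tokens.bos holds the model's BOS token (it may be null for some models)
const tokens = [model.tokens.bos!, ...model.tokenize("Hello world")];

console.log(model.detokenize(tokens, true)); // assumption: special tokens are rendered as text, e.g. "<s>Hello world"
console.log(model.detokenize(tokens));       // special tokens are skipped by default
```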
Shipped with llama.cpp release b2717
To use the latest llama.cpp release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.16
3.0.0-beta.16 (2024-04-13)
Bug Fixes
Features
- `inspect gpu` command: print device names (#198) (5ca33c7)
- `inspect gpu` command: print env info (#202) (d332b77)
- download models using the CLI (#191) (b542b53)
- interactively select a model from CLI commands (#191) (b542b53)
- change the default log level to warn (#191) (b542b53)
- token biases (#196) (3ad4494)
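Since the default log level is now warn, verbose llama.cpp logs have to be opted into again. A minimal sketch, assuming the level is set via a `logLevel` option on `getLlama` using the exported `LlamaLogLevel` values (both assumptions; the model path is a placeholder):

```typescript
import {getLlama, LlamaLogLevel} from "node-llama-cpp";

// Assumption: override the new default ("warn") back to a more verbose level
const llama = await getLlama({
    logLevel: LlamaLogLevel.info
});

// Model loading, context creation, etc. should now emit info-level llama.cpp logs again
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path
```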
Shipped with llama.cpp release b2665
To use the latest llama.cpp release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.15
3.0.0-beta.15 (2024-04-04)
Bug Fixes
- create a context with no parameters (#188) (6267778)
- improve chat wrappers tokenization (#182) (35e6f50)
- use the new llama.cpp CUDA flag (#182) (35e6f50)
- adapt to breaking llama.cpp changes (#183) (6b012a6)
Features
- automatically adapt to current free VRAM state (#182) (35e6f50)
- `inspect gguf` command (#182) (35e6f50)
- `inspect measure` command (#182) (35e6f50)
- `readGgufFileInfo` function (#182) (35e6f50)
- GGUF file metadata info on `LlamaModel` (#182) (35e6f50)
- `JinjaTemplateChatWrapper` (#182) (35e6f50)
- use the `tokenizer.chat_template` header from the `gguf` file when available - use it to find a better specialized chat wrapper or use `JinjaTemplateChatWrapper` with it as a fallback (#182) (35e6f50)
- simplify generation CLI commands: `chat`, `complete`, `infill` (#182) (35e6f50)
- Windows on Arm prebuilt binary (#181) (f3b7f81)
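A small sketch of the new `readGgufFileInfo` function, which parses a GGUF file header without loading the model; the path is a placeholder, and the shape of the returned object (a `metadata` map of the GGUF key-value pairs) is an assumption:

```typescript
import {readGgufFileInfo} from "node-llama-cpp";

// Parse the GGUF header of a local file (placeholder path)
const ggufInfo = await readGgufFileInfo("path/to/model.gguf");

// Assumption: the parsed header exposes the GGUF metadata key-value pairs,
// including the tokenizer.chat_template header mentioned above when the file has one
console.log(ggufInfo.metadata);
```

The same information can be printed from the CLI with the new `inspect gguf` command.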
Shipped with llama.cpp release b2608
To use the latest llama.cpp release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v2.8.9
v3.0.0-beta.14
3.0.0-beta.14 (2024-03-16)
Bug Fixes
- `DisposedError` was thrown when calling `.dispose()` (#178) (315a3eb)
- adapt to breaking llama.cpp changes (#178) (315a3eb)
Features
- async model and context loading (#178) (315a3eb)
- automatically try to resolve the `Failed to detect a default CUDA architecture` CUDA compilation error (#178) (315a3eb)
- detect `cmake` binary issues and suggest fixes on detection (#178) (315a3eb)
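Model and context loading are now asynchronous, so both calls are awaited; a minimal sketch (the model path is a placeholder):

```typescript
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();

// Both of these now resolve asynchronously instead of blocking while loading
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path
const context = await model.createContext();

// ... use the context ...

await context.dispose();
await model.dispose();
```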
Shipped with llama.cpp release b2440
To use the latest llama.cpp release available, run `npx --no node-llama-cpp download --release latest`. (learn more)
v3.0.0-beta.13
3.0.0-beta.13 (2024-03-03)
Bug Fixes
- adapt to llama.cpp breaking change (#175) (5a70576)
- return user-defined llama tokens (#175) (5a70576)
Features
- gguf parser (#168) (bcaab4f)
- use the best compute layer available by default (#175) (5a70576)
- more guardrails to prevent loading an incompatible prebuilt binary (#175) (5a70576)
- `inspect` command (#175) (5a70576)
- `GemmaChatWrapper` (#175) (5a70576)
- `TemplateChatWrapper` (#175) (5a70576)
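A brief sketch of using one of the new chat wrappers explicitly instead of the auto-detected one, assuming `LlamaChatSession` accepts a `chatWrapper` option and that `GemmaChatWrapper` takes no required constructor arguments (the model path is a placeholder):

```typescript
import {getLlama, LlamaChatSession, GemmaChatWrapper} from "node-llama-cpp";

const llama = await getLlama(); // picks the best compute layer available by default
const model = await llama.loadModel({modelPath: "path/to/gemma-model.gguf"}); // placeholder path
const context = await model.createContext();

// Assumption: the chat wrapper can be passed explicitly instead of relying on auto-detection
const session = new LlamaChatSession({
    contextSequence: context.getSequence(),
    chatWrapper: new GemmaChatWrapper()
});

console.log(await session.prompt("Hello!"));
```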
Shipped with llama.cpp release b2329
To use the latest llama.cpp release available, run `npx --no node-llama-cpp download --release latest`. (learn more)