llama.cpp-ts 🦙

llama.cpp-ts is a Node.js binding for the llama.cpp library. It provides an easy-to-use interface for running language models in Node.js applications, supporting both synchronous queries and asynchronous streaming responses.

Supported Systems:

  • macOS
  • Windows (not tested yet)
  • Linux (not tested yet)

Models

You can find some models here

The examples below use Meta-Llama-3.1-8B-Instruct-Q3_K_S.gguf.

Installation

Ensure that you have CMake installed on your system:

  • On macOS: brew install cmake
  • On Windows: choco install cmake
  • On Linux: sudo apt-get install cmake

Then, install the package:

npm install llama.cpp-ts
# or
yarn add llama.cpp-ts

Usage

Basic Usage

import { Llama } from 'llama.cpp-ts';

const llama = new Llama();
const initialized = llama.initialize('./path/to/your/model.gguf');

if (initialized) {
  const response: string = llama.runQuery("Tell me a story.", 100);
  console.log(response);
} else {
  console.error("Failed to initialize the model.");
}

Streaming Responses

import { Llama, TokenStream } from 'llama.cpp-ts';

async function main() {
  const llama = new Llama();
  const initialized: boolean = llama.initialize('./path/to/your/model.gguf');

  if (initialized) {
    const tokenStream: TokenStream = llama.runQueryStream("Explain quantum computing", 200);

    while (true) {
      const token: string | null = await tokenStream.read();
      if (token === null) break;
      process.stdout.write(token);
    }
  } else {
    console.error("Failed to initialize the model.");
  }
}

main().catch(console.error);
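The read-until-null loop above can be factored into a small helper that drains any TokenStream into a single string. A minimal sketch — the `TokenStreamLike` interface and the stub stream below are illustrative stand-ins for the real package, which requires a loaded model:

```typescript
// Minimal interface matching TokenStream's documented read() method.
interface TokenStreamLike {
  read(): Promise<string | null>;
}

// Drain a token stream into one string, mirroring the while-loop above.
async function collectStream(stream: TokenStreamLike): Promise<string> {
  let out = "";
  while (true) {
    const token = await stream.read();
    if (token === null) break; // null signals the end of the stream
    out += token;
  }
  return out;
}

// Illustrative stub standing in for llama.runQueryStream(...).
function stubStream(parts: string[]): TokenStreamLike {
  let i = 0;
  return { read: async () => (i < parts.length ? parts[i++] : null) };
}

collectStream(stubStream(["Hello", ", ", "world"])).then((text) => {
  console.log(text); // "Hello, world"
});
```

With the real library you would pass `llama.runQueryStream(prompt, maxTokens)` to `collectStream` instead of the stub.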

API Reference

Llama Class

The Llama class provides methods to interact with language models loaded through llama.cpp.

Public Methods

  • constructor(): Creates a new Llama instance.
  • initialize(modelPath: string, contextSize?: number): boolean: Initializes the model with the specified path and context size.
  • runQuery(prompt: string, maxTokens?: number): string: Runs a query with the given prompt and returns the result as a string.
  • runQueryStream(prompt: string, maxTokens?: number): TokenStream: Streams the response to the given prompt, returning a TokenStream object.

TokenStream Class

The TokenStream class represents a stream of tokens generated by the language model.

Public Methods

  • read(): Promise<string | null>: Reads the next token from the stream. Returns null when the stream is finished.
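Because read() resolves to null at end of stream, a TokenStream can also be wrapped as an async generator so it works with for await...of. A sketch assuming only the read() contract documented above — the wrapper and the stub stream are illustrative, not exported by the package:

```typescript
// Minimal interface matching TokenStream's documented read() method.
interface TokenStreamLike {
  read(): Promise<string | null>;
}

// Wrap a TokenStream-like object as an async generator of tokens.
async function* tokens(stream: TokenStreamLike): AsyncGenerator<string> {
  while (true) {
    const token = await stream.read();
    if (token === null) return; // null signals end of stream
    yield token;
  }
}

// Illustrative stub in place of llama.runQueryStream(...).
function stubStream(parts: string[]): TokenStreamLike {
  let i = 0;
  return { read: async () => (i < parts.length ? parts[i++] : null) };
}

async function demo() {
  for await (const token of tokens(stubStream(["Explain ", "quantum ", "computing"]))) {
    process.stdout.write(token);
  }
}

demo();
```

This keeps the consuming code free of the explicit null check, at the cost of one small adapter.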