Fixup Readme
tjake committed Sep 22, 2024
1 parent adf7db0 commit e190ac2
Showing 1 changed file with 23 additions and 22 deletions.
45 changes: 23 additions & 22 deletions README.md
@@ -39,33 +39,18 @@ Add LLM Inference directly to your Java application.

## 🔬 Quick Start

### 🕵️‍♀️ How to use as a local client
### 🕵️‍♀️ How to use as a local client (with jbang!)
Jlama includes a command line tool that makes it easy to use.

The CLI can be run with [jbang](https://www.jbang.dev/download/).

```shell
#Install jbang (if you don't have it)
#Install jbang (or https://www.jbang.dev/download/)
curl -Ls https://sh.jbang.dev | bash -s - app setup

#Install Jlama CLI (will ask if you trust the source)
jbang app install -j 21 --name=jlama --force https://raw.githubusercontent.com/tjake/Jlama/main/jlama.java

#Run the CLI
jlama

Usage: jlama [COMMAND]
Jlama is a modern LLM inference engine for Java!
jbang app install --force jlama@tjake

Quantized models are maintained at https://hf.co/tjake

Commands:
download Downloads a HuggingFace model - use owner/name format
quantize Quantize the specified model
chat Interact with the specified model
complete Completes a prompt using the specified model
restapi Starts a openai compatible rest api for interacting with this model
cluster-coordinator Starts a distributed rest api for a model using cluster workers
cluster-worker Connects to a cluster coordinator to perform distributed inference
```

Now that you have jlama installed you can download a model from huggingface and chat with it.
@@ -77,16 +62,32 @@ jlama download tjake/TinyLlama-1.1B-Chat-v1.0-Jlama-Q4

# Run the openai chat api and UI on this model
jlama restapi models/TinyLlama-1.1B-Chat-v1.0-Jlama-Q4

#Open browser to http://localhost:8080/
open http://localhost:8080
```

open browser to http://localhost:8080/

<p align="center">
<img src="docs/demo.png" alt="Demo chat">
</p>


```shell
Usage: jlama [COMMAND]
Jlama is a modern LLM inference engine for Java!

Quantized models are maintained at https://hf.co/tjake

Commands:
download Downloads a HuggingFace model - use owner/name format
quantize Quantize the specified model
chat Interact with the specified model
complete Completes a prompt using the specified model
restapi Starts a openai compatible rest api for interacting with this model
cluster-coordinator Starts a distributed rest api for a model using cluster workers
cluster-worker Connects to a cluster coordinator to perform distributed inference
```


### 👨‍💻 How to use in your Java project
The main purpose of Jlama is to provide a simple way to use large language models in Java.
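
As a rough sketch of one way to reach a model from Java without relying on any Jlama class names, the snippet below talks to the OpenAI-compatible API that `jlama restapi` starts on http://localhost:8080. The `/chat/completions` path, the request fields, the value of the `model` field, and the names `ChatRequestSketch`, `buildBody`, and `buildRequest` are assumptions based on the usual OpenAI chat format, not something this README confirms; check the running server for the actual routes.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ChatRequestSketch {
    // Base URL taken from the README; the path is an assumed OpenAI-style route.
    static final String BASE_URL = "http://localhost:8080";
    static final String CHAT_PATH = "/chat/completions";

    /** Escapes backslashes, double quotes and newlines for use inside a JSON string. */
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    /** Builds an OpenAI-style chat body containing a single user message. */
    static String buildBody(String model, String prompt) {
        return "{\"model\":\"" + escapeJson(model) + "\","
             + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escapeJson(prompt) + "\"}]}";
    }

    /** Wraps the JSON body in a POST request aimed at the assumed chat route. */
    static HttpRequest buildRequest(String model, String prompt) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + CHAT_PATH))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(model, prompt)))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // The model name here is a guess at what the server expects.
        HttpRequest request = buildRequest("TinyLlama-1.1B-Chat-v1.0-Jlama-Q4", "Say hello");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Reached when nothing is listening, e.g. `jlama restapi` is not running.
            System.out.println("Could not reach " + BASE_URL + ": " + e.getMessage());
        }
    }
}
```

This uses only the JDK's `java.net.http` client (Java 11+), so it adds no dependencies. It needs `jlama restapi` running in another terminal.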
