Fixup Readme

tjake · Sep 22, 2024 · e190ac2
1 parent adf7db0
Showing 1 changed file with 23 additions and 22 deletions.
diff --git a/README.md b/README.md
@@ -39,33 +39,18 @@ Add LLM Inference directly to your Java application.
 
 ## 🔬 Quick Start
 
-### 🕵️‍♀️ How to use as a local client
+### 🕵️‍♀️ How to use as a local client (with jbang!)
 Jlama includes a command line tool that makes it easy to use.
+
 The CLI can be run with [jbang](https://www.jbang.dev/download/).
 
 ```shell
-#Install jbang (if you don't have it)
+#Install jbang (or https://www.jbang.dev/download/)
 curl -Ls https://sh.jbang.dev | bash -s - app setup
 
 #Install Jlama CLI (will ask if you trust the source)
-jbang app install -j 21 --name=jlama --force https://raw.githubusercontent.com/tjake/Jlama/main/jlama.java
-
-#Run the CLI
-jlama
-
-Usage: jlama [COMMAND]
-Jlama is a modern LLM inference engine for Java!
+jbang app install --force jlama@tjake
 
-Quantized models are maintained at https://hf.co/tjake
-
-Commands:
-  download             Downloads a HuggingFace model - use owner/name format
-  quantize             Quantize the specified model
-  chat                 Interact with the specified model
-  complete             Completes a prompt using the specified model
-  restapi              Starts a openai compatible rest api for interacting with this model
-  cluster-coordinator  Starts a distributed rest api for a model using cluster workers
-  cluster-worker       Connects to a cluster coordinator to perform distributed inference
 ```
 
 Now that you have jlama installed you can download a model from huggingface and chat with it.
@@ -77,16 +62,32 @@ jlama download tjake/TinyLlama-1.1B-Chat-v1.0-Jlama-Q4
 
 # Run the openai chat api and UI on this model
 jlama restapi models/TinyLlama-1.1B-Chat-v1.0-Jlama-Q4
-
-#Open browser to http://localhost:8080/
-open http://localhost:8080
 ```
+
 open browser to http://localhost:8080/
 
 <p align="center">
   <img src="docs/demo.png" alt="Demo chat">
 </p>
 
+
+```shell
+Usage: jlama [COMMAND]
+Jlama is a modern LLM inference engine for Java!
+
+Quantized models are maintained at https://hf.co/tjake
+
+Commands:
+  download             Downloads a HuggingFace model - use owner/name format
+  quantize             Quantize the specified model
+  chat                 Interact with the specified model
+  complete             Completes a prompt using the specified model
+  restapi              Starts a openai compatible rest api for interacting with this model
+  cluster-coordinator  Starts a distributed rest api for a model using cluster workers
+  cluster-worker       Connects to a cluster coordinator to perform distributed inference
+```
+
+
 ### 👨‍💻 How to use in your Java project
 The main purpose of Jlama is to provide a simple way to use large language models in Java.