Commit 8d455c4 (0 parents)

Showing 6 changed files with 1,123 additions and 0 deletions.
@@ -0,0 +1,3 @@
build/
dist/
.venv/
@@ -0,0 +1,178 @@
# NoLlama

NoLlama is a terminal-based interface for interacting with large language models (LLMs) that you can't run locally on your laptop. Inspired by [Ollama](https://ollama.com/), NoLlama provides a streamlined experience for chatting with models like GPT-4o, GPT-4o-mini, Claude 3 Haiku, Mixtral, LLaMA 70B, and more, directly from your terminal.

While Ollama offers a neat interface for running local LLMs, locally hosted models often fall short of these much larger hosted models in performance and capability. NoLlama bridges this gap by letting you interact with those powerful models through a lightweight terminal UI, complete with colorful markdown rendering, multiple model choices, and efficient memory usage.

![NoLlama](https://i.imgur.com/Py1qESW.png)

## Features

- **Multiple Model Choices:** Switch between various LLMs like GPT-4o, GPT-4o-mini, Mixtral, LLaMA 70B, Claude 3 Haiku, and more.
- **Neat Terminal UI:** Enjoy a clean and intuitive interface for your interactions.
- **Colorful Markdown Rendering:** Unlike Ollama, NoLlama supports rich text formatting in markdown.
- **Low Memory Usage:** Efficient memory management makes it lightweight compared to using a browser for similar tasks.
- **Easy Model Switching:** Simply type `model` in the chat to switch between models.
- **Clear Chat History:** Type `clear` to clear the chat history.
- **Exit Prompt:** Type `q`, `quit`, or `exit` to leave the chat.
- **Default Mode:** NoLlama runs in standard mode by default; just type `nollama` in the terminal to start.
- **Experimental Feature:** Enable live streaming of output with the `--stream` flag (unstable).
- **Anonymous and Private Usage:** Use `torsocks` to route all traffic through the Tor network for privacy.
## Installation

1. **Download the Binary:**

   Download the latest binary from the [Releases](https://github.com/spignelon/nollama/releases) page.

2. **Move the Binary to `/usr/bin/`:**

   After downloading, move the binary to `/usr/bin/` for easy access from anywhere in your terminal:

   ```bash
   sudo mv nollama /usr/bin/
   ```

3. **Run NoLlama:**

   Start NoLlama from the terminal by simply typing:

   ```bash
   nollama
   ```

   This will start NoLlama in the default mode.
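If the `nollama` command is not found afterwards, it is usually a `PATH` issue. A quick sanity check (illustrative; any equivalent will do):

```bash
# Confirm the binary is on the PATH and that it responds to --help
which nollama
nollama --help
```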
## Building from Source

If you'd like to build NoLlama from source, follow these steps:

1. **Clone the Repository:**

   ```bash
   git clone https://github.com/spignelon/nollama.git
   cd nollama
   ```

2. **Install Dependencies:**

   Create a Python virtual environment, activate it, and install the required dependencies with `pip`:

   ```bash
   virtualenv .venv
   source .venv/bin/activate
   ```

   ```bash
   pip install -r requirements.txt
   ```

3. **Compile the Script (Optional):**

   If you want to compile the script into a standalone executable, you can use PyInstaller. First set `version_check: bool = False` in `.venv/lib/python3.12/site-packages/g4f/debug.py` (one way to script this change is sketched just after this list), then run:

   ```bash
   pyinstaller --onefile --name=nollama --collect-all readchar nollama.py
   ```

4. **Move the Executable to `/usr/bin/`:**

   After compilation, move the binary to `/usr/bin/`:

   ```bash
   sudo mv dist/nollama /usr/bin/nollama
   ```

5. **Run NoLlama:**

   Start NoLlama by typing:

   ```bash
   nollama
   ```
## Usage

- **Switch Models:** Type `model` in the chat to choose a different LLM.
- **Clear Chat:** Type `clear` to clear the chat history.
- **Exit:** Type `q`, `quit`, or `exit` to leave the chat.
- **Default Mode:** Run NoLlama without any flags for standard operation:

  ```bash
  nollama
  ```
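A typical interaction might look roughly like the following (illustrative only; the answer text and the exact rendering of the model picker will differ on your machine):

```
>>> What is the capital of France?
The capital of France is Paris.

>>> model
[?] Select a model: gpt-4o-mini
 > gpt-4o-mini
   claude-3-haiku
   ...

>>> q
Exiting the prompt...
```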
## Anonymous and Private Usage

For enhanced privacy and anonymity, you can use `torsocks` to route NoLlama's traffic through the Tor network, making your requests significantly harder to trace back to you.

### Step 1: Install Tor

#### Debian/Ubuntu:

```bash
sudo apt update
sudo apt install tor
```

#### Arch Linux:

```bash
sudo pacman -S tor
```

#### Fedora:

```bash
sudo dnf install tor
```

### Step 2: Enable and Start Tor

After installation, enable and start the Tor service:

```bash
sudo systemctl enable tor
sudo systemctl start tor
```
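On systemd-based distributions you can confirm the service is actually running before going further, for example:

```bash
systemctl is-active tor    # prints "active" once the service is up
```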
### Step 3: Run NoLlama with Tor

Once Tor is running, you can use `torsocks` to run NoLlama anonymously:

```bash
torsocks nollama
```

This will ensure that all your interactions with NoLlama are routed through the Tor network, providing a layer of privacy and anonymity.
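To verify that `torsocks` is really sending traffic through Tor, one quick check (assuming `curl` is installed) is the Tor Project's check endpoint, which reports whether the request arrived over Tor:

```bash
torsocks curl -s https://check.torproject.org/api/ip
# Expected output is JSON along the lines of: {"IsTor":true,"IP":"..."}
```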
## Experimental Feature

- **Streaming Mode:**

  NoLlama includes an experimental streaming mode that allows you to see responses as they are generated. This mode is currently unstable and may cause issues. To enable streaming, use the `--stream` flag:

  ```bash
  nollama --stream
  ```

## Contribution

Contributions are welcome! If you have suggestions for new features or improvements, feel free to open an issue or submit a pull request.

## Acknowledgments

- **[g4f](https://pypi.org/project/g4f/):** Used for connecting to various LLMs.
- **[Python Rich](https://pypi.org/project/rich/):** Used for colorful markdown rendering and improved terminal UI.

## Disclaimer

NoLlama is not affiliated with Ollama. It is an independent project inspired by the concept of providing a neat terminal interface for interacting with language models, particularly those that are too large to run locally on typical consumer hardware or are not available for self-hosting.

## License

This project is licensed under the [GPL-3.0 License](LICENSE). <br>
[![GNU GPLv3 Image](https://www.gnu.org/graphics/gplv3-127x51.png)](https://www.gnu.org/licenses/gpl-3.0.en.html)
@@ -0,0 +1,142 @@
import argparse
import sys
from rich.console import Console
from rich.markdown import Markdown
from rich.text import Text
import inquirer
from g4f.client import Client
from yaspin import yaspin

# Initialize the rich console
console = Console()

# Models dictionary
models = {
    'gpt-4o-mini': 'gpt-4o-mini',
    'claude-3-haiku': 'claude-3-haiku',
    'gpt-4o': 'gpt-4o',
    'gpt-4': 'gpt-4',
    'gpt-4-turbo': 'gpt-4-turbo',
    'llama-3-70b-instruct': 'llama-3-70b-instruct',
    'llama-3.1-70b': 'llama-3.1-70b',
    'llama-3.1-70b-instruct': 'llama-3.1-70b-instruct',
    'mixtral-8x7b': 'mixtral-8x7b',
    'Nous-Hermes-2-Mixtral-8x7B-DPO': 'Nous-Hermes-2-Mixtral-8x7B-DPO',
    'Yi-1.5-34b-chat': 'Yi-1.5-34b-chat',
    'Phi-3-mini-4k-instruct': 'Phi-3-mini-4k-instruct',
    'blackbox': 'blackbox',
    'Qwen2-7b-instruct': 'Qwen2-7b-instruct',
    'command-r+': 'command-r+',
    'SparkDesk-v1.1': 'SparkDesk-v1.1',
    'glm4-9b-chat': 'glm4-9b-chat',
    'chatglm3-6b': 'chatglm3-6b',
}

# Initialize the g4f client
client = Client()

# Function to display the title and model
def display_title_and_model(selected_model):
    title = Text("nollama", style="bold red underline")
    model_text = Text(f"Model: {selected_model}", style="bold yellow")
    console.print(title, justify="center")
    console.print(model_text, justify="right")

# Function to select a model using inquirer
def select_model():
    questions = [
        inquirer.List('model',
                      message="Select a model",
                      choices=list(models.keys()))
    ]
    answer = inquirer.prompt(questions)
    return models[answer['model']]

# Function to handle asking a question
def ask_question(selected_model, stream):
    # Prompt the user for input
    question = input(">>> ").strip()

    if not question:
        console.print("[bold red]Error: Input is empty. Please type something.[/bold red]")
        return selected_model

    if question.lower() in ["quit", "exit", "q"]:
        console.print("[bold red]Exiting the prompt...[/bold red]")
        sys.exit()

    if question.lower() == "clear":
        console.clear()
        display_title_and_model(selected_model)
        return selected_model

    if question.lower() == "model":
        return select_model()

    # Initialize a variable to collect the markdown content
    markdown_content = ""

    # Show spinner while waiting for response to start streaming
    with yaspin(text="Waiting for response...", color="yellow") as spinner:
        try:
            chat_completion = client.chat.completions.create(
                model=selected_model,
                messages=[{"role": "user", "content": question}],
                stream=stream
            )

            if stream:
                # Process each chunk of the streamed response
                first_chunk_received = False
                for completion in chat_completion:
                    if not first_chunk_received:
                        spinner.stop()  # Stop the spinner once the first chunk arrives
                        first_chunk_received = True

                    # Get the new text from the stream
                    chunk = completion.choices[0].delta.content or ""

                    # Append the new chunk to the markdown content
                    markdown_content += chunk

                    # Clear the console but preserve the prompt and question
                    console.clear()
                    display_title_and_model(selected_model)  # Display the title and selected model
                    console.print(f">>> {question}")  # Reprint the prompt and question
                    console.print(Markdown(markdown_content), end="")  # Render the markdown content

            else:
                # If not streaming, get the complete response and print it at once
                markdown_content = chat_completion.choices[0].message.content or ""
                spinner.stop()
                console.print(Markdown(markdown_content))

        except Exception as e:
            spinner.fail("Error occurred!")
            console.print(f"[bold red]An error occurred: {e}[/bold red]")
            return selected_model

    # After response is complete, print a newline for separation
    console.print()
    return selected_model

# Main loop to keep asking questions
def main():
    # Parse command-line arguments
    parser = argparse.ArgumentParser(description="Run nollama with or without streaming output.")
    parser.add_argument("--stream", action="store_true", help="Enable live streaming of the output.")
    args = parser.parse_args()

    selected_model = select_model()
    console.clear()
    display_title_and_model(selected_model)

    try:
        while True:
            selected_model = ask_question(selected_model, stream=args.stream)
    except KeyboardInterrupt:
        console.print("\n[bold red]Exiting the prompt...[/bold red]")
        sys.exit()

if __name__ == "__main__":
    main()
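For development, the script above (`nollama.py`) can also be run directly from the repository, with the virtual environment from the README active, instead of building a binary; for example:

```bash
python nollama.py            # default mode
python nollama.py --stream   # experimental streaming mode
```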
@@ -0,0 +1,45 @@
# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import collect_all

datas = []
binaries = []
hiddenimports = []
tmp_ret = collect_all('readchar')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]


a = Analysis(
    ['nollama.py'],
    pathex=[],
    binaries=binaries,
    datas=datas,
    hiddenimports=hiddenimports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
    optimize=0,
)
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.datas,
    [],
    name='nollama',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=True,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
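This spec file matches the PyInstaller invocation shown in the README (a one-file build named `nollama` with all of `readchar` collected). Once it exists, the same build can be reproduced from the spec instead of re-passing the flags:

```bash
pyinstaller nollama.spec
```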