Commit v0.1

spignelon committed Aug 23, 2024 (0 parents, commit 8d455c4)
Showing 6 changed files with 1,123 additions and 0 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
build/
dist/
.venv/
674 changes: 674 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

178 changes: 178 additions & 0 deletions README.md
@@ -0,0 +1,178 @@
# NoLlama

NoLlama is a terminal-based interface for interacting with large language models (LLMs) that are too big to run locally on a laptop. Inspired by [Ollama](https://ollama.com/), NoLlama provides a streamlined experience for chatting with models like GPT-4o, GPT-4o-mini, Claude 3 Haiku, Mixtral, LLaMA 70B, and more, directly from your terminal.

While Ollama offers a neat interface for running local LLMs, the models that fit on consumer hardware often fall short of these massive hosted models in performance and capability. NoLlama bridges this gap by letting you interact with those powerful models through a lightweight terminal UI, complete with colorful markdown rendering, multiple model choices, and efficient memory usage.

![NoLlama](https://i.imgur.com/Py1qESW.png)

## Features

- **Multiple Model Choices:** Switch between various LLMs like GPT-4o, GPT-4o-mini, Mixtral, LLaMA 70B, Claude 3 Haiku, and more.
- **Neat Terminal UI:** Enjoy a clean and intuitive interface for your interactions.
- **Colorful Markdown Rendering:** Unlike Ollama, NoLlama supports rich text formatting in markdown.
- **Low Memory Usage:** Efficient memory management makes it lightweight compared to using a browser for similar tasks.
- **Easy Model Switching:** Simply type `model` in the chat to switch between models.
- **Clear Chat History:** Type `clear` to clear the chat history.
- **Exit Prompt:** Type `q`, `quit`, or `exit` to leave the chat.
- **Default Mode:** NoLlama runs in standard (non-streaming) mode by default; just type `nollama` in the terminal to start.
- **Experimental Feature:** Enable live streaming of output with the `--stream` flag (unstable).
- **Anonymous and Private Usage:** Use `torsocks` to route all traffic through the Tor network for privacy.

## Installation

1. **Download the Binary:**

Download the latest binary from the [Releases](https://github.com/spignelon/nollama/releases) page.

2. **Move the Binary to `/usr/bin/`:**

After downloading, move the binary to `/usr/bin/` for easy access from anywhere in your terminal:

```bash
sudo mv nollama /usr/bin/
```
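If running it then fails with a "Permission denied" error, the downloaded file may not be marked executable; you can fix that with:

```bash
sudo chmod +x /usr/bin/nollama
```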

3. **Run NoLlama:**

Start NoLlama from the terminal by simply typing:

```bash
nollama
```

This will start NoLlama in the default mode.

## Building from Source

If you'd like to build NoLlama from source, follow these steps:
1. **Clone the Repository:**
```bash
git clone https://github.com/spignelon/nollama.git
cd nollama
```
2. **Install Dependencies:**
Create and activate a Python virtual environment:
```bash
virtualenv .venv
source .venv/bin/activate
```
Then install the required dependencies with `pip`:
```bash
pip install -r requirements.txt
```
3. **Compile the Script (Optional):**
If you want to compile the script into a standalone executable, you can use PyInstaller.
First, set `version_check: bool = False` in `.venv/lib/python3.12/site-packages/g4f/debug.py` (adjust `python3.12` to match your Python version) so the frozen binary skips g4f's runtime version check. You can edit the file by hand or use the one-liner below.
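A quick way to make that edit is a `sed` one-liner, assuming the line in `g4f/debug.py` currently reads `version_check: bool = True`:

```bash
sed -i 's/version_check: bool = True/version_check: bool = False/' \
    .venv/lib/python3.12/site-packages/g4f/debug.py
```

Then compile: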
```bash
pyinstaller --onefile --name=nollama --collect-all readchar nollama.py
```
4. **Move the Executable to `/usr/bin/`:**
After compilation, move the binary to `/usr/bin/`:
```bash
sudo mv dist/nollama /usr/bin/nollama
```
5. **Run NoLlama:**
Start NoLlama by typing:
```bash
nollama
```
## Usage
- **Switch Models:** Type `model` in the chat to choose a different LLM.
- **Clear Chat:** Type `clear` to clear the chat history.
- **Exit:** Type `q`, `quit`, or `exit` to leave the chat.
- **Default Mode:** Run NoLlama without any flags for standard operation:
```bash
nollama
```
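For example, a short session might look like this (an illustrative transcript; the exact model list and wording depend on your g4f version):

```text
>>> model
[?] Select a model: gpt-4o-mini
>>> Hello!
Hi! How can I help you today?
>>> q
Exiting the prompt...
```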
## Anonymous and Private Usage
For enhanced privacy and anonymity, you can use `torsocks` to route NoLlama's traffic through the Tor network, so that requests to the model providers cannot be easily traced back to your IP address.

### Step 1: Install Tor

#### Debian/Ubuntu:

```bash
sudo apt update
sudo apt install tor
```

#### Arch Linux:

```bash
sudo pacman -S tor
```

#### Fedora:

```bash
sudo dnf install tor
```

### Step 2: Enable and Start Tor

After installation, you need to enable and start the Tor service:

```bash
sudo systemctl enable tor
sudo systemctl start tor
```
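You can confirm the service is running before proceeding:

```bash
systemctl status tor
```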

### Step 3: Run NoLlama with Tor

Once Tor is running, you can use `torsocks` to run NoLlama anonymously:

```bash
torsocks nollama
```

This will ensure that all your interactions with NoLlama are routed through the Tor network, providing a layer of privacy and anonymity.
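To verify the routing first, the Tor Project's connectivity checker makes a handy sanity check (a suggested extra step; the endpoint returns JSON containing `"IsTor":true` when your traffic exits through Tor):

```bash
torsocks curl https://check.torproject.org/api/ip
```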

## Experimental Feature

- **Streaming Mode:**

NoLlama includes an experimental streaming mode that lets you see responses as they are generated. This mode is currently unstable: the terminal is cleared and redrawn for every chunk, which can cause visible flicker. To enable streaming, use the `--stream` flag:

```bash
nollama --stream
```

## Contribution

Contributions are welcome! If you have suggestions for new features or improvements, feel free to open an issue or submit a pull request.

## Acknowledgments

- **[g4f](https://pypi.org/project/g4f/):** Used for connecting to various LLMs.
- **[Python Rich](https://pypi.org/project/rich/):** Used for colorful markdown rendering and improved terminal UI.

## Disclaimer

NoLlama is not affiliated with Ollama. It is an independent project inspired by the concept of providing a neat terminal interface for interacting with language models, particularly those that are too large to run locally on typical consumer hardware or are not available for self-hosting.

## License

This project is licensed under the [GPL-3.0 License](LICENSE). <br>
[![GNU GPLv3 Image](https://www.gnu.org/graphics/gplv3-127x51.png)](https://www.gnu.org/licenses/gpl-3.0.en.html)
142 changes: 142 additions & 0 deletions nollama.py
@@ -0,0 +1,142 @@
import argparse
import sys
from rich.console import Console
from rich.markdown import Markdown
from rich.text import Text
import inquirer
from g4f.client import Client
from yaspin import yaspin

# Initialize the rich console
console = Console()

# Models dictionary
models = {
    'gpt-4o-mini': 'gpt-4o-mini',
    'claude-3-haiku': 'claude-3-haiku',
    'gpt-4o': 'gpt-4o',
    'gpt-4': 'gpt-4',
    'gpt-4-turbo': 'gpt-4-turbo',
    'llama-3-70b-instruct': 'llama-3-70b-instruct',
    'llama-3.1-70b': 'llama-3.1-70b',
    'llama-3.1-70b-instruct': 'llama-3.1-70b-instruct',
    'mixtral-8x7b': 'mixtral-8x7b',
    'Nous-Hermes-2-Mixtral-8x7B-DPO': 'Nous-Hermes-2-Mixtral-8x7B-DPO',
    'Yi-1.5-34b-chat': 'Yi-1.5-34b-chat',
    'Phi-3-mini-4k-instruct': 'Phi-3-mini-4k-instruct',
    'blackbox': 'blackbox',
    'Qwen2-7b-instruct': 'Qwen2-7b-instruct',
    'command-r+': 'command-r+',
    'SparkDesk-v1.1': 'SparkDesk-v1.1',
    'glm4-9b-chat': 'glm4-9b-chat',
    'chatglm3-6b': 'chatglm3-6b',
}

# Initialize the g4f client
client = Client()

# Function to display the title and model
def display_title_and_model(selected_model):
    title = Text("nollama", style="bold red underline")
    model_text = Text(f"Model: {selected_model}", style="bold yellow")
    console.print(title, justify="center")
    console.print(model_text, justify="right")

# Function to select a model using inquirer
def select_model():
questions = [
inquirer.List('model',
message="Select a model",
choices=list(models.keys()))
]
answer = inquirer.prompt(questions)
return models[answer['model']]

# Function to handle asking a question
def ask_question(selected_model, stream):
    # Prompt the user for input
    question = input(">>> ").strip()

    if not question:
        console.print("[bold red]Error: Input is empty. Please type something.[/bold red]")
        return selected_model

    if question.lower() in ["quit", "exit", "q"]:
        console.print("[bold red]Exiting the prompt...[/bold red]")
        sys.exit()

    if question.lower() == "clear":
        console.clear()
        display_title_and_model(selected_model)
        return selected_model

    if question.lower() == "model":
        return select_model()

    # Initialize a variable to collect the markdown content
    markdown_content = ""

    # Show spinner while waiting for response to start streaming
    with yaspin(text="Waiting for response...", color="yellow") as spinner:
        try:
            chat_completion = client.chat.completions.create(
                model=selected_model,
                messages=[{"role": "user", "content": question}],
                stream=stream
            )

            if stream:
                # Process each chunk of the streamed response
                first_chunk_received = False
                for completion in chat_completion:
                    if not first_chunk_received:
                        spinner.stop()  # Stop the spinner once the first chunk arrives
                        first_chunk_received = True

                    # Get the new text from the stream
                    chunk = completion.choices[0].delta.content or ""

                    # Append the new chunk to the markdown content
                    markdown_content += chunk

                    # Clear the console but preserve the prompt and question
                    console.clear()
                    display_title_and_model(selected_model)  # Display the title and selected model
                    console.print(f">>> {question}")  # Reprint the prompt and question
                    console.print(Markdown(markdown_content), end="")  # Render the markdown content

            else:
                # If not streaming, get the complete response and print it at once
                markdown_content = chat_completion.choices[0].message.content or ""
                spinner.stop()
                console.print(Markdown(markdown_content))

        except Exception as e:
            spinner.fail("Error occurred!")
            console.print(f"[bold red]An error occurred: {e}[/bold red]")
            return selected_model

    # After response is complete, print a newline for separation
    console.print()
    return selected_model

# Main loop to keep asking questions
def main():
    # Parse command-line arguments
    parser = argparse.ArgumentParser(description="Run nollama with or without streaming output.")
    parser.add_argument("--stream", action="store_true", help="Enable live streaming of the output.")
    args = parser.parse_args()

    selected_model = select_model()
    console.clear()
    display_title_and_model(selected_model)

    try:
        while True:
            selected_model = ask_question(selected_model, stream=args.stream)
    except KeyboardInterrupt:
        console.print("\n[bold red]Exiting the prompt...[/bold red]")
        sys.exit()

if __name__ == "__main__":
main()
45 changes: 45 additions & 0 deletions nollama.spec
@@ -0,0 +1,45 @@
# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import collect_all

datas = []
binaries = []
hiddenimports = []
tmp_ret = collect_all('readchar')
datas += tmp_ret[0]; binaries += tmp_ret[1]; hiddenimports += tmp_ret[2]


a = Analysis(
    ['nollama.py'],
    pathex=[],
    binaries=binaries,
    datas=datas,
    hiddenimports=hiddenimports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
    optimize=0,
)
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    a.binaries,
    a.datas,
    [],
    name='nollama',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    upx_exclude=[],
    runtime_tmpdir=None,
    console=True,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)