Mangrove

Mangrove is the backend module of Estuary, a framework for building multimodal real-time Socially Intelligent Agents (SIAs).

Give us a Star! ⭐

If you find Estuary helpful, please give us a star! Your support means a lot! If you find any bugs or would like to request a new feature, feel free to open an issue!

Supported Endpoints

Speech-To-Text (STT/ASR)

  • Faster-Whisper

Large Language Models (LLMs)

  • ChatGPT
  • Ollama

Text-To-Speech (TTS)

  • ElevenLabs
  • XTTS-v2
  • Google gTTS
  • pyttsx3

Setup

Environment Setup

  1. [WSL Ubuntu 22.04] Mangrove is currently tested on WSL Ubuntu 22.04. To install WSL, follow the official guide from Microsoft.
  2. [Updating WSL] Run sudo apt update and sudo apt upgrade in WSL.
  3. [Installing pipx] Run sudo apt install pipx in WSL.
  4. [Installing pdm] Run pipx install pdm in WSL.
  5. [Installing Conda] Refer to the Miniconda installation guide.

Installing Dependencies

  1. Run the following command to install packages:
    sudo apt-get install libcairo2-dev pulseaudio portaudio19-dev libgirepository1.0-dev libespeak-dev sox ffmpeg gstreamer-1.0 clang
  2. Open a PowerShell window and restart WSL (some packages require a restart to finish installing):
    wsl --shutdown
  3. Clone this repository into your WSL environment and navigate into it
    git clone https://github.com/estuary-ai/mangrove.git
    cd mangrove
  4. Create a Python 3.9.19 virtual environment with Conda:
    conda create -n mangrove python=3.9.19
    conda activate mangrove
  5. Run pdm use and select the Conda environment's Python interpreter, e.g. /home/username/miniconda3/envs/mangrove/bin/python
  6. Install Python dependencies.
    pdm install -G :all

Congrats! This is the end of the initial installation for Mangrove. Please refer to the next section for running Mangrove for the first time!

Running Mangrove for the First Time

Initial Steps

  1. Navigate to the Mangrove root directory.
    cd mangrove
  2. Activate the Conda virtual environment that was previously set up.
    conda activate mangrove

Selecting an LLM

  • ChatGPT: Refer to the API Keys section below for setup if you would like to use OpenAI.
    • Flag: --bot_endpoint openai
  • Ollama: If you would like to use offline LLMs and have the VRAM to run them, consult the Ollama section for setup instructions.
    • Flag: --bot_endpoint ollama

Selecting a TTS module

  • XTTS: This is a popular offline TTS module that produces high-quality results and is performant at runtime. Refer to the XTTS section for setup instructions.
    • Flag: --tts_endpoint xtts
  • gTTS: This is a free cloud-based TTS module offered by Google.
    • Flag: --tts_endpoint gtts

Other Configurations

  • You may specify which port number to use with the --port flag.
  • You may use CPU for processing with the --cpu flag.
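The flags above can be sketched with argparse. This is a hypothetical mirror of the options this README describes, not server.py's actual parser (names and defaults are taken from this document; the real parser may differ):

```python
import argparse

# Hypothetical sketch of the command-line options this README describes;
# server.py's real parser may differ in defaults and extra flags.
parser = argparse.ArgumentParser(description="Mangrove server options (sketch)")
parser.add_argument("--bot_endpoint", choices=["openai", "ollama"], default="openai")
parser.add_argument("--tts_endpoint", default="elevenlabs")
parser.add_argument("--port", type=int, default=4000)
parser.add_argument("--cpu", action="store_true", help="run processing on CPU")

# Simulate: python server.py --bot_endpoint ollama --tts_endpoint xtts
args = parser.parse_args(["--bot_endpoint", "ollama", "--tts_endpoint", "xtts"])
print(args.bot_endpoint, args.tts_endpoint, args.port, args.cpu)
# → ollama xtts 4000 False
```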

Example Commands

  • Default run command, which uses OpenAI, ElevenLabs, and port 4000:
    python server.py
  • Example run command which uses the above flags:
    python server.py --bot_endpoint ollama --tts_endpoint xtts --port 4000

Connecting a Client

  • Python Client: This option is recommended for Python projects or for quick debugging purposes.
    • Navigate to the client/python directory.
      cd client/python/
    • Run the following command to start the client on port 4000:
      python client.py
    • You may also specify the address and port for the client to connect to with the --address and --port flags.
  • Unity Client: If you are building a Unity application, refer to the Estuary Unity SDK Documentation.

Further Setup as Required

API Keys

  • Mangrove supports the usage of cloud APIs (e.g., OpenAI), which require API keys. Create a .env file in the root directory of the project and add your API keys as follows:
    OPENAI_API_KEY=[your OpenAI API Key]
    ELEVENLABS_API_KEY=[your ElevenLabs API Key]
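The .env format is plain KEY=value lines. As a rough sketch of how such a file is read (the project itself may load it differently, e.g. via python-dotenv):

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> None:
    """Parse simple KEY=value lines and export them into os.environ."""
    env_file = Path(path)
    if not env_file.exists():
        return  # nothing to load
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            # Don't clobber variables already set in the environment
            os.environ.setdefault(key.strip(), value.strip())

load_env()  # no-op if .env is absent
```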

Ollama

  • Install Ollama inside WSL by running:
    curl -fsSL https://ollama.com/install.sh | sh
  • Install an LLM from Ollama's model library, e.g.:
    ollama run nemotron-mini
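Once a model is pulled, Ollama serves a local REST API on its default port 11434. As an illustrative sketch, this hypothetical helper builds a request body for Ollama's /api/generate endpoint; actually sending it requires the Ollama server to be running, so the network call is left out:

```python
import json

# Ollama's default local endpoint (illustrative; not part of Mangrove)
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> bytes:
    """Serialize a request body for Ollama's /api/generate endpoint."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

body = build_generate_request("nemotron-mini", "Hello!")
print(body.decode())
```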

XTTS

  • Running XTTS with DeepSpeed requires a standalone installation of the CUDA toolkit (the same version as the one reported by torch.version.cuda):
    1. Install the dkms package to avoid issues with the CUDA installation: sudo apt-get install dkms
    2. Install CUDA 12.1 from the NVIDIA website.
    3. Follow the instructions given by the installer, including installing the driver.
      sudo sh cuda_12.1.0_530.30.02_linux.run --silent --driver
    4. Add the following to your .bashrc file with any code editor, e.g. nano ~/.bashrc:
      export PATH=/usr/local/cuda-12.1/bin${PATH:+:${PATH}}
      export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    5. Add a 6–30 second voice training clip to the root of the project directory, and make sure to name it speaker.wav.
    6. Make sure to restart WSL afterwards with wsl --shutdown in Powershell.
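A quick sanity check is to compare the version reported by nvcc --version with torch.version.cuda; DeepSpeed compiles its kernels with the standalone nvcc, so the two must agree. The helper below is illustrative, not part of Mangrove:

```python
def cuda_versions_match(nvcc_version: str, torch_cuda_version: str) -> bool:
    """Return True when the major.minor components agree, e.g. '12.1' vs '12.1'."""
    return nvcc_version.split(".")[:2] == torch_cuda_version.split(".")[:2]

# Values would come from `nvcc --version` and `python -c "import torch; print(torch.version.cuda)"`
print(cuda_versions_match("12.1", "12.1"))   # True
print(cuda_versions_match("11.8", "12.1"))   # False
```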

Networked Configuration

If you are running Mangrove in WSL and would like to configure Local Area Network (LAN) communications for a remote client, WSL must be set to mirrored networking mode. You can do this with the following steps:

  1. Open Powershell and create/open the .wslconfig file in the C:\Users\[username]\ directory.
  2. Add the following to the .wslconfig file:
    [wsl2]
    networkingMode=mirrored
    [experimental]
    dnsTunneling=true
    autoProxy=true
    hostAddressLoopback=true
  3. Add an inbound network rule in Windows Security Settings > Firewall & Network Protection > Advanced Settings > Inbound Rules > New Rule...
    • Port > TCP, Specific local ports: 4000 > Allow the connection > Check: Domain, Private, Public > Name: Mangrove
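Equivalently, the same inbound rule can be created from an elevated PowerShell prompt (adjust the port if you changed Mangrove's default):

```powershell
# Allow inbound TCP connections to Mangrove's default port (4000)
New-NetFirewallRule -DisplayName "Mangrove" -Direction Inbound `
    -Protocol TCP -LocalPort 4000 -Action Allow `
    -Profile Domain,Private,Public
```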

Tips

  • Ensure that Mangrove and the client are connected to the same LAN, and that both the machine running Mangrove and the LAN itself allow device-to-device communication.
  • Try restarting after applying the above network configurations if they do not initially work.
  • [OPTIONAL] You may refer to the Microsoft WSL documentation on Mirrored Networking here.

Acknowledgements

Mangrove was built from our base code for Traveller, the digital assistant of SENVA, a prototype Augmented Reality (AR) Heads-Up Display (HUD) solution for astronauts. Thank you to Team Aegis for participating in the NASA SUITS Challenge in the following years:

  • 2023: University of Southern California (USC) with the University of California, Berkeley (UC Berkeley).

  • 2022: University of Southern California (USC) with University of Arizona (UA).

The Estuary team would also like to acknowledge the developers, authors, and creatives whose work contributed to the success of this project:

More to come soon! Stay tuned and Fight On!