speech-language-model

Here are 7 public repositories matching this topic...

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

speech-to-text speech-to-speech large-language-models multimodal-large-language-models speech-language-model speech-interaction

Updated Sep 24, 2024
Python

jishengpeng / WavTokenizer

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

semantic text-to-speech codec acoustic dac speech-representation audio-representation encodec soundstream music-representation-learning gpt4o speech-language-model

Updated Oct 23, 2024
Python

zhenye234 / xcodec

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

audio music semantic text-to-speech tokenizer speech sound codec audio-codec gpt language-model self-supervised-learning text-to-music vall-e text-to-sound speech-language-model

Updated Oct 1, 2024
Python

hhguo / SoCodec

Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

audio speech tts speech-codec speech-language-model

Updated Sep 20, 2024
Python

slp-rl / salmon

The official code for the SALMon🍣 benchmark

audio-processing acoustic-model speech-language-model

Updated Sep 15, 2024
Python

lucadellalib / audiocodecs

A collection of audio codecs with a standardized API

text-to-speech pytorch speech-synthesis codec quantization mimi dac self-supervised-learning encodec wavlm speech-coding speechtokenizer speech-language-model

Updated Nov 5, 2024
Python

jishengpeng / WavChat

A Survey of Spoken Dialogue Models (60 pages)

streaming duplex speech moshi speech-representation encodec gpt-4o speech-language-model spoken-dialogue-models modal-alignment intreaction mini-omni llama-omni wavtokenizer

Updated Nov 11, 2024

