- Ha Noi
Pinned
- Speech_project_Vin (Public)
  Multimodal speech emotion recognition, using ViT (AST) as the audio encoder and Multiscale Attention Net (MANet) as the visual encoder.
- project_NLP_final (Public)
  A group project in the Vin program: Modality Balance for Multimodal Conversational Emotion Recognition.
- LLaMA-Omni (Public, forked from ictnlp/LLaMA-Omni)
  LLaMA-Omni is a low-latency, high-quality end-to-end speech interaction model built on Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
  Python
- mini-omni (Public, forked from gpt-omni/mini-omni)
  An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
  Python