SpeechBrain emotion recognition with OpenVINO (#2064)

OpenVINO Jupyter notebook demonstrating SpeechBrain emotion recognition with OpenVINO.

Co-authored-by: Ekaterina Aidova <ekaterina.aidova@intel.com>
openvinotoolkit · Jun 12, 2024 · commit 1985983 · 1 parent c30137b

Showing 5 changed files with 501 additions and 1 deletion.
diff --git a/.ci/ignore_treon_docker.txt b/.ci/ignore_treon_docker.txt
@@ -67,3 +67,4 @@ notebooks/stable-cascade-image-generation/stable-cascade-image-generation.ipynb
 notebooks/dynamicrafter-animating-images/dynamicrafter-animating-images.ipynb
 notebooks/yolov10-optimization/yolov10-optimization.ipynb
 notebooks/whisper-subtitles-generation/whisper-subtitles-generation.ipynb
+notebooks/speechbrain-emotion-recognition/speechbrain-emotion-recognition.ipynb
diff --git a/.ci/ignore_treon_py38.txt b/.ci/ignore_treon_py38.txt
@@ -1,2 +1,3 @@
 notebooks/surya-line-level-text-detection/surya-line-level-text-detection.ipynb
-notebooks/stable-diffusion-keras-cv/stable-diffusion-keras-cv.ipynb
+notebooks/stable-diffusion-keras-cv/stable-diffusion-keras-cv.ipynb
+notebooks/speechbrain-emotion-recognition/speechbrain-emotion-recognition.ipynb
diff --git a/.ci/spellcheck/.pyspelling.wordlist.txt b/.ci/spellcheck/.pyspelling.wordlist.txt
@@ -86,9 +86,11 @@ CHW
 Cifar
 cityscape
 Cityscapes
+classname
 CLI
 cli
 CLIP's
+codebase
 codebook
 codebooks
 codec
@@ -127,6 +129,7 @@ CRNN
 CSV
 CTC
 CUDA
+CustomEncoderWav
 CVF
 CVPR
 Databricks
@@ -294,6 +297,7 @@ HWC
 hyperparameters
 ICIP
 ICPR
+IEMOCAP
 iGPU
 IdentityNet
 iGPUs
@@ -592,6 +596,7 @@ PTQ
 px
 py
 pyannote
+pymodule
 PyPI
 Pythia
 pytorch
@@ -710,6 +715,9 @@ sparsified
 sparsify
 spectrogram
 spectrograms
+SpeechBrain
+SpeechBrain's
+speechbrain
 splitters
 SPS
 SQA
@@ -848,6 +856,7 @@ VQVAE
 waveform
 waveforms
 Wav
+wav
 WavLM
 WebGL
 WebUI

diff --git a/notebooks/speechbrain-emotion-recognition/README.md b/notebooks/speechbrain-emotion-recognition/README.md
@@ -0,0 +1,23 @@
+# SpeechBrain Emotion Recognition with OpenVINO
+
+[SpeechBrain](https://github.com/speechbrain/speechbrain) is an open-source PyTorch toolkit that accelerates Conversational AI development, i.e., the technology behind speech assistants, chatbots, and large language models. It is crafted for fast and easy creation of advanced technologies for Speech and Text Processing.
+
+Learn more in the [GitHub repo](https://github.com/speechbrain/speechbrain) and the [paper](https://arxiv.org/pdf/2106.04624).
+
+## Notebook contents
+The tutorial consists of the following steps:
+- Installations
+- Imports
+- Prepare base model
+- Initialize model
+- PyTorch inference
+- Initialize model
+- SpeechBrain model optimization with Intel OpenVINO
+    - Step 1: Prepare input tensor
+    - Step 2: Convert model to OpenVINO IR
+    - Step 3: OpenVINO model inference
+
+## Installation instructions
+This is a self-contained example that relies solely on its own code.<br>
+We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
+For details, please refer to the [Installation Guide](../../README.md).