Audioneme

An AI model for detecting speech disorders in children. We fine-tune wav2vec2 for this binary classification task.

Installation

Clone the repository:

git clone https://github.com/keshavbhandari/Audioneme.git

We recommend working from a clean environment, e.g. using conda:

conda create --name audioneme python=3.9
conda activate audioneme

Install dependencies:

cd Audioneme
pip install -r requirements.txt
pip install -e .

Usage

Loading Data

The data cannot be shared yet because it is not public. However, the framework can be used with any speech dataset. Each batch from the data loader should contain the audio signal, the encoded transcription of the audio, the encoded filename (used for further analysis), and a binary target.

from scripts.data_loader import train_loader, val_loader, test_loader

# Audio Data, Transcription, Filename, Binary Target
for batch_idx, (data, target) in enumerate(train_loader):
    print(data[0].shape, data[1].shape, data[2].shape, target.shape)
    break
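
To adapt your own dataset, the transcription and filename must be turned into numeric sequences so they can be batched next to the audio. The repository's actual encoding is not documented here, so the following is only a minimal sketch in plain Python that assumes padded Unicode code points. The names `encode_text`, `decode_text`, `make_sample`, and `PAD` are hypothetical and do not come from the Audioneme code base.

```python
from typing import List, Sequence, Tuple

PAD = 0  # padding value; assumes no real character encodes to 0

# One example, laid out as ((audio, transcription, filename), target)
Sample = Tuple[Tuple[List[float], List[int], List[int]], int]


def encode_text(text: str, length: int) -> List[int]:
    """Encode a string as a fixed-length list of Unicode code points.

    Strings longer than `length` are truncated; shorter ones are padded.
    """
    codes = [ord(ch) for ch in text[:length]]
    return codes + [PAD] * (length - len(codes))


def decode_text(codes: Sequence[int]) -> str:
    """Invert encode_text by dropping padding and mapping codes to characters."""
    return "".join(chr(c) for c in codes if c != PAD)


def make_sample(audio: Sequence[float], transcription: str, filename: str,
                label: int, text_len: int = 32) -> Sample:
    """Bundle one example in the (data, target) layout the loader yields."""
    data = (
        list(audio),
        encode_text(transcription, text_len),
        encode_text(filename, text_len),
    )
    return data, int(label)


# Toy values only; real audio would be a waveform loaded from a file.
sample = make_sample([0.0, 0.1, -0.2], "the cat sat", "child_001.wav", 1)
(audio, transcription, filename), target = sample
print(decode_text(filename), target)
```

In a real loader, these lists would typically be converted to tensors before batching. Because the encoding is reversible, the original filename can be recovered after prediction for per-file analysis.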

Train Model

Train the model, which is built on wav2vec2. Check configs first to ensure that the parameters and model type are correct.

python -m scripts.train
