parsavares/wav2vec2-base-luxembourgish-STT: A Luxembourgish ASR Model

Overview

This model uses the wav2vec 2.0 architecture. It was pre-trained on 842 hours of unlabeled Luxembourgish speech from RTL.lu and then fine-tuned on 4 hours of labeled speech from the same domain. The goal is to improve automatic speech recognition (ASR) for Luxembourgish and to narrow the digital resource gap for the language, making it more usable in speech-based applications.

Model Description

The wav2vec 2.0 base model was chosen because it performs well on speech data when labeled examples are scarce. It was first pre-trained on a large corpus of unlabeled Luxembourgish speech and then fine-tuned on a smaller annotated data set for speech recognition, so that the self-supervised representations are adapted to transcribing Luxembourgish.

Performance Metrics

Metric	Dev Set	Test Set
WER	23.95%	23.09%
CER	7.97%	7.63%
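
For readers unfamiliar with the two metrics, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly defined: the edit distance between reference and hypothesis, divided by the reference length in words or characters. The function names and the example sentence are illustrative assumptions; this README does not state which tool produced the figures above, and corpus-level scores are often computed by summing edits over all utterances rather than averaging per-utterance rates.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference
                            curr[j - 1] + 1,      # insertion into the reference
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Illustrative sentence pair (not taken from the evaluation data).
reference = "ech sinn doheem"
hypothesis = "ech sin doheem"
print(f"WER: {wer(reference, hypothesis):.2%}")  # 1 wrong word out of 3
print(f"CER: {cer(reference, hypothesis):.2%}")  # 1 wrong character out of 15
```

A single dropped letter costs a full word in WER but only one character in CER, which is consistent with the CER values above being much lower than the WER values.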

Intended Uses & Limitations

The model is aimed at researchers, developers, and companies who want to integrate Luxembourgish speech recognition into their services. Its accuracy may vary with the speaker's accent, domain-specific jargon, and ambient noise in the audio input. Because both the pre-training and the fine-tuning data come from RTL.lu, performance on audio from other domains may differ from the figures reported above.
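
As a starting point for integration, the sketch below shows one way to run inference with the Hugging Face Transformers library, assuming the checkpoint is loadable with `Wav2Vec2Processor` and `Wav2Vec2ForCTC`. The function name `transcribe`, the `model_id` argument, and the 16 kHz input rate are assumptions, not details confirmed by this README; check `preprocessor_config.json` for the actual sampling rate and substitute the identifier or local path under which the weights are available.

```python
EXPECTED_SAMPLE_RATE = 16_000  # assumption: the usual wav2vec 2.0 input rate


def transcribe(waveform, sample_rate, model_id):
    """Transcribe a mono waveform (1-D sequence of floats) with greedy decoding.

    `model_id` is a Hub identifier or a local directory holding the checkpoint;
    it is a placeholder here, not a verified name.
    """
    if sample_rate != EXPECTED_SAMPLE_RATE:
        raise ValueError(
            f"expected {EXPECTED_SAMPLE_RATE} Hz audio, got {sample_rate} Hz; "
            "resample the audio first"
        )

    # Heavy dependencies are imported lazily so the check above runs without them.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(waveform, sampling_rate=sample_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: pick the most likely token per frame, then let the
    # tokenizer collapse repeats and remove blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

A call would look like `transcribe(samples, 16_000, "path/or/hub-id")`, where `samples` is audio already loaded as floating-point values. Loading the processor and model on every call is wasteful in a real service; caching them once is the obvious refinement.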

Training and Evaluation Data

Pre-training used 842 hours of unlabeled Luxembourgish speech from RTL.lu, and fine-tuning used 4 hours of labeled speech from the same domain. Further details on how the data sets were collected and split are not documented here.

Training Procedure

Hyperparameters

Hyperparameter	Value
Learning rate	7.5e-05
Batch size (train/eval)	3
Seed	42
Gradient accumulation steps	4
Total train batch size	12
Optimizer	Adam (betas=(0.9,0.999), epsilon=1e-08)
LR scheduler	Linear, with 2000 warmup steps
Epochs	50
Mixed precision training	Native AMP
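
Two of the rows above follow from the others, and a small sketch makes the arithmetic explicit. The total train batch size is the per-device batch size times the gradient accumulation steps (3 × 4 = 12, which implies a single device). The schedule function mirrors the usual linear warmup followed by linear decay to zero; `total_steps` is a hypothetical value, since the README gives only the number of epochs, and the function names are illustrative.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_warmup_lr(step, total_steps, peak_lr=7.5e-5, warmup_steps=2000):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Values from the hyperparameter table; a single device is an assumption.
print(effective_batch_size(per_device_batch=3, grad_accum_steps=4))

# total_steps is hypothetical; it depends on the size of the training set.
total_steps = 10_000
for step in (0, 1_000, 2_000, 6_000, 10_000):
    print(step, linear_warmup_lr(step, total_steps))
```

With 2000 warmup steps the learning rate reaches its peak of 7.5e-05 at step 2000 and then falls linearly for the rest of training.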

Software and Libraries

Software/Library	Version
Transformers	4.20.0.dev0
PyTorch	1.11.0+cu113
Datasets	2.2.1
Tokenizers	0.12.1

Visualization

(Graph of training loss over epochs and comparison of WER and CER on Dev vs. Test datasets to be added here)

Citation

@misc{lb-wav2vec2,
  author = {Nguyen, Le Minh and Nayak, Shekhar and Coler, Matt},
  keywords = {Luxembourgish, multilingual speech recognition, language modelling, wav2vec 2.0 XLSR-53, under-resourced language},
  title = {IMPROVING LUXEMBOURGISH SPEECH RECOGNITION WITH CROSS-LINGUAL SPEECH REPRESENTATIONS},
  year = {2022},
  copyright = {2023 IEEE}
}
