Release 0.1 - First Steps · lhotse-speech/lhotse

“The journey of a thousand miles begins with one step.” – Lao Tzu

The first official release of Lhotse! It provides a solid base for building speech research and applications by treating speech and audio data as first-class citizens in the ML world.

Lhotse will continue to evolve, so some API changes may still happen.

Highlights:

audio-specific data model with Recording, Supervision, Features, and Cut manifests
integration with PyTorch for task-specific Dataset classes and Torchaudio for feature extraction
built-in data preparation for 8 speech corpora, including LibriSpeech, Switchboard, AMI, and TED-LIUM v3
intuitive interfaces that work well with interactive environments such as Jupyter notebooks for data visualisation
on-the-fly or pre-computed feature extraction and data augmentation
