audio-dataset-converter-faster-whisper

Adds support to audio-dataset-converter for transcribing audio files (.wav, .mp3) using faster-whisper.

Installation

pip install git+https://github.com/waikato-llm/audio-dataset-converter.git
pip install git+https://github.com/waikato-llm/audio-dataset-converter-faster-whisper.git

Tools

Generating SRT subtitles

usage: adc-srt [-h] -i FILE [FILE ...] [-o DIR] [-m MODEL_SIZE] [-d DEVICE]
               [-c COMPUTE_TYPE] [-b BEAM_SIZE] [-u UPDATE_INTERVAL]
               [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]

Tool for generating SRT subtitle files from video/audio files.

optional arguments:
  -h, --help            show this help message and exit
  -i FILE [FILE ...], --input FILE [FILE ...]
                        The audio/video files to process; supports glob
                        syntax. (default: None)
  -o DIR, --output DIR  The directory to store the generated subtitle files
                        in; places them in the same locations as the input
                        files if not provided. (default: None)
  -m MODEL_SIZE, --model_size MODEL_SIZE
                        The size of the whisper model to use, e.g., 'base' or
                        'large-v3' (default: base)
  -d DEVICE, --device DEVICE
                        The device to run on, e.g., 'cuda' or 'cpu' (default:
                        cpu)
  -c COMPUTE_TYPE, --compute_type COMPUTE_TYPE
                        The compute type to use, e.g., 'float16' or 'int8'
                        (default: int8)
  -b BEAM_SIZE, --beam_size BEAM_SIZE
                        The beam size to use for decoding (default: 5)
  -u UPDATE_INTERVAL, --update_interval UPDATE_INTERVAL
                        The number of segments when to output info logging
                        messages during processing (default: 100)
  -l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --logging_level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        The logging level to use. (default: WARN)
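
As a rough illustration of how the options above combine, the following sketch assembles an adc-srt command line from Python and prints it. It uses only the flags listed in the help text; the helper names (build_adc_srt_command, run_adc_srt) and the example paths are hypothetical and not part of this package.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def build_adc_srt_command(
    inputs: Sequence[str],
    output_dir: Optional[str] = None,
    model_size: str = "base",
    device: str = "cpu",
    compute_type: str = "int8",
    beam_size: int = 5,
    logging_level: str = "INFO",
) -> List[str]:
    """Assemble an adc-srt argument list from the options in the help text."""
    if not inputs:
        raise ValueError("at least one input file or glob is required (-i)")
    cmd = ["adc-srt", "-i", *inputs]
    if output_dir is not None:
        # Without -o, the subtitle files are placed next to the input files.
        cmd += ["-o", output_dir]
    cmd += [
        "-m", model_size,
        "-d", device,
        "-c", compute_type,
        "-b", str(beam_size),
        "-l", logging_level,
    ]
    return cmd


def run_adc_srt(cmd: List[str]) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


# Hypothetical paths, for illustration only. The glob is passed through
# unexpanded, relying on the glob support described for -i.
command = build_adc_srt_command(["./recordings/*.mp3"], output_dir="./subtitles")
print(shlex.join(command))
```

Calling run_adc_srt(command) would then execute the tool, assuming adc-srt is installed and on the PATH. For GPU use, the help text suggests passing device="cuda" together with compute_type="float16".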

Plugins

See the plugins directory of the repository for an overview of all plugins.
