Releases · pszemraj/vid2cleantxt

12 Oct 00:37

pszemraj

v0.2.5

fee7375

whisper 🤫 (Latest)

Add support for OpenAI's whisper model through transformers ✔️

This makes vid2cleantxt significantly more robust and useful for day-to-day cases.
The translation ability of the OpenAI models carries over to vid2cleantxt*! For example, transcribe a Chinese video to English with model_id="openai/whisper-small".

* all testing here was done en-to-en, official testing/support for other languages to come later :)
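
A minimal sketch of how the Whisper checkpoint might be selected from Python. It assumes that transcribe_dir accepts a model_id keyword, as the note above suggests; build_whisper_kwargs and transcribe_with_whisper are hypothetical helper names used for illustration only, and the chunk_length value is borrowed from the v0.2 example rather than being a recommendation.

```python
def build_whisper_kwargs(input_dir, model_id="openai/whisper-small", chunk_length=15):
    """Collect keyword arguments for transcribe_dir.

    Assumption: transcribe_dir accepts `model_id`; check the signature
    in your installed version before relying on this.
    """
    return {
        "input_dir": input_dir,
        "model_id": model_id,
        "chunk_length": chunk_length,
    }


def transcribe_with_whisper(input_dir, **overrides):
    """Hypothetical wrapper: run vid2cleantxt with a Whisper checkpoint."""
    # Imported lazily so build_whisper_kwargs works without the package installed.
    import vid2cleantxt

    kwargs = build_whisper_kwargs(input_dir, **overrides)
    return vid2cleantxt.transcribe.transcribe_dir(**kwargs)
```

Calling transcribe_with_whisper("path/to/video/files") would then return the same pair of output directories as the high-level API, provided the assumption about model_id holds.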

What's Changed

Whisper by @pszemraj in #14

Full Changelog: v0.2...v0.2.5

Contributors

pszemraj


10 Oct 09:28

pszemraj

v0.2

f6cb118

package + python API

Finally, a python API to transcribe things instead of using a CLI or custom notebook!

See the example here on Colab.

high-level API

Install with pip:

pip install git+https://github.com/pszemraj/vid2cleantxt.git

Use in python:

import vid2cleantxt

text_output_dir, metadata_output_dir = vid2cleantxt.transcribe.transcribe_dir(
    input_dir="path/to/video/files",
    chunk_length=15,
)
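
Once transcription finishes, the returned text directory can be read back with the standard library. The sketch below assumes the transcripts are written as .txt files directly inside text_output_dir; the release notes do not specify the file layout, so adjust the glob pattern if your version differs. collect_transcripts is a hypothetical helper, not part of vid2cleantxt.

```python
from pathlib import Path


def collect_transcripts(text_output_dir):
    """Return {file stem: text} for every .txt file in the directory.

    Assumption: transcripts are plain .txt files at the top level of
    text_output_dir.
    """
    transcripts = {}
    for path in sorted(Path(text_output_dir).glob("*.txt")):
        transcripts[path.stem] = path.read_text(encoding="utf-8")
    return transcripts
```

For example, collect_transcripts(text_output_dir) on the value returned by transcribe_dir gives a dictionary that is easy to search or pass to later processing.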

What's Changed

packaging by @JonathanLehner in #9
Update v2ct_utils.py by @JonathanLehner in #10
Update transcribe.py by @JonathanLehner in #11
Add spacy workaround by @pszemraj in #12
Update docs by @pszemraj in #13

New Contributors

@JonathanLehner made their first contribution in #9

Full Changelog: v0.1.21...v0.2

Contributors

JonathanLehner and pszemraj


24 Feb 01:26

pszemraj

v0.1.21

cae85b7

v0.1.21 - neuspell workaround

This version is a bug fix. NeuSpell has a bug when loading a model via its recommended API, so vid2cleantxt now falls back to SymSpell automatically when that happens. Check the log file to confirm whether the fallback was used when you transcribe.
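
The fallback described above follows a common pattern: try the preferred loader, and if it raises, log the failure and use the alternative. The sketch below illustrates that pattern only. It is not the code from vid2cleantxt, and load_spell_checker, primary_loader, and fallback_loader are hypothetical names standing in for the NeuSpell and SymSpell setup.

```python
import logging

logger = logging.getLogger("spell_fallback_demo")


def load_spell_checker(primary_loader, fallback_loader):
    """Return (checker, source), where source is 'primary' or 'fallback'.

    Both arguments are zero-argument callables that build a checker.
    The fallback is only called if the primary loader raises.
    """
    try:
        checker = primary_loader()
        logger.info("Spell checker loaded with the primary loader")
        return checker, "primary"
    except Exception as err:
        # Record the failure so the log shows which checker was used.
        logger.warning("Primary loader failed (%s); using fallback", err)
        return fallback_loader(), "fallback"
```

Returning the source label alongside the checker makes it straightforward to write the choice to a log file, which mirrors the advice to check the log after transcribing.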

What's Changed

Add workaround for neuspell initialization bug by @pszemraj in #8

Full Changelog: v0.1.2...v0.1.21

Contributors

pszemraj


28 Jan 19:26

pszemraj

v0.1.2

309c420

v0.1.2

What's Changed

User friendly by @pszemraj in #3
Model updates by @pszemraj in #4
Add support for the Hubert CTC model by @pszemraj in #5

Next Release

Adding PDF generation from text files post-transcription.

Full Changelog: https://github.com/pszemraj/vid2cleantxt/commits/v0.1.2

Contributors

pszemraj

