GitHub - mrmanna/Nvidia_Nemo_FastPitch_TTS_Example: How to Build a High-Quality Text-to-Speech (TTS) System Locally with Nvidia NeMo FastPitch

After running the install command:

poetry install

Next, we have to update gcc to version 12, which is needed to build and install the required packages cython and youtokentome:

conda install -c conda-forge gcc_linux-64=12

pip install cython youtokentome

Then we can run the app like this:

poetry run start <yourpdffile.pdf> <youroutputfile.wav>

This code includes minimal error handling. I encourage you to enhance it by addressing any issues you encounter during use.
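
As one possible starting point for that kind of hardening, here is a minimal sketch that checks the two command-line arguments before any model is loaded. The names `validate_paths` and `main` are hypothetical and are not taken from this repository; the sketch only assumes that the app receives an input PDF path and an output WAV path, as in the command above.

```python
import sys
from pathlib import Path


def validate_paths(pdf_arg: str, wav_arg: str) -> tuple[Path, Path]:
    """Hypothetical helper: fail early with a clear message on bad arguments."""
    pdf_path = Path(pdf_arg)
    wav_path = Path(wav_arg)

    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"input must be a .pdf file, got: {pdf_path.name}")
    if not pdf_path.is_file():
        raise FileNotFoundError(f"input PDF not found: {pdf_path}")
    if wav_path.suffix.lower() != ".wav":
        raise ValueError(f"output must be a .wav file, got: {wav_path.name}")
    if not wav_path.parent.is_dir():
        raise NotADirectoryError(f"output folder not found: {wav_path.parent}")
    return pdf_path, wav_path


def main(argv: list[str]) -> int:
    """Hypothetical entry point: returns a process exit code."""
    if len(argv) != 2:
        print("usage: poetry run start <input.pdf> <output.wav>", file=sys.stderr)
        return 2
    try:
        pdf_path, wav_path = validate_paths(argv[0], argv[1])
    except (ValueError, OSError) as exc:
        # FileNotFoundError and NotADirectoryError are both subclasses of OSError.
        print(f"error: {exc}", file=sys.stderr)
        return 1
    # The existing PDF-to-audio conversion would be called here
    # with pdf_path and wav_path.
    return 0
```

A real entry point would finish with something like `sys.exit(main(sys.argv[1:]))`, so that a wrong file name produces a short message and a non-zero exit code instead of a traceback from deep inside the pipeline.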

Further, if you want to change the pitch or speed, you can do that by manipulating the spectrogram. Happy coding!
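
To make the spectrogram idea concrete, here is a rough sketch using plain NumPy. It assumes the mel spectrogram is available as a 2-D array of shape (n_mels, n_frames) before it is handed to the vocoder; the function names `change_speed` and `shift_pitch` are illustrative and not part of this repository or of NeMo. Speed is changed by resampling along the time axis. The pitch shift is deliberately crude: it moves the whole spectrum by whole mel bins, so it also alters the timbre.

```python
import numpy as np


def change_speed(mel: np.ndarray, speed: float) -> np.ndarray:
    """Resample a (n_mels, n_frames) spectrogram along the time axis.

    speed > 1.0 gives fewer frames (faster speech),
    speed < 1.0 gives more frames (slower speech).
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    n_frames = mel.shape[1]
    new_frames = max(1, int(round(n_frames / speed)))
    old_pos = np.arange(n_frames)
    new_pos = np.linspace(0, n_frames - 1, new_frames)
    # Linearly interpolate each mel band at the new frame positions.
    return np.stack([np.interp(new_pos, old_pos, band) for band in mel])


def shift_pitch(mel: np.ndarray, bins: int) -> np.ndarray:
    """Crude pitch shift: move all energy up (bins > 0) or down (bins < 0).

    Vacated bands are filled with the spectrogram's minimum value,
    which stands in for silence here.
    """
    if bins == 0:
        return mel.copy()
    shifted = np.full_like(mel, mel.min())
    if bins > 0:
        shifted[bins:] = mel[:-bins]
    else:
        shifted[:bins] = mel[-bins:]
    return shifted
```

For example, `change_speed(mel, 1.25)` returns a spectrogram with about 80% of the original frames, which the vocoder would render as faster speech. Recent NeMo versions of FastPitch also appear to accept a `pace` argument in `generate_spectrogram`, which may be a simpler route for speed changes; check the version you have installed.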

