This code is built upon the speech analysis and synthesis system described in paper [1]. The system is the so-called extended adaptive Quasi-Harmonic Model (eaQHM), and this repository is its Python implementation.
eaQHMAnalysisAndSynthesis is a function that performs extended adaptive Quasi-Harmonic Analysis and Synthesis of a speech signal. In other words, it receives a .wav file and some optional parameters and decomposes the speech into AM-FM components according to the eaQHM model: it first produces a fully deterministic component of the signal and then iteratively refines it until the reconstructed signal converges in quasi-harmonicity [1].
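To make the notion of AM-FM components more concrete, the following minimal sketch synthesizes a signal as a sum of components, each with its own time-varying amplitude and frequency. It only illustrates the general signal model and is not code from this repository; the helper name synthesize_am_fm and the toy values are assumptions made for the example.

```python
import math
from itertools import accumulate


def synthesize_am_fm(amplitudes, frequencies, fs):
    """Sum AM-FM components: x[n] = sum_k a_k[n] * cos(phi_k[n]).

    amplitudes, frequencies: one list per component, holding the
    instantaneous amplitude and frequency (in Hz) at every sample.
    The phase phi_k[n] is the running sum of 2*pi*f_k[n]/fs.
    (Hypothetical helper, not part of the repository.)
    """
    n_samples = len(amplitudes[0])
    signal = [0.0] * n_samples
    for a_k, f_k in zip(amplitudes, frequencies):
        # Integrate the instantaneous frequency to obtain the phase.
        phase = accumulate(2.0 * math.pi * f / fs for f in f_k)
        for n, (a, phi) in enumerate(zip(a_k, phase)):
            signal[n] += a * math.cos(phi)
    return signal


# Toy example: two harmonics of a 100 Hz fundamental with a linear fade-out.
fs = 8000
n = 80
fade = [1.0 - i / n for i in range(n)]
x = synthesize_am_fm(
    amplitudes=[fade, [0.5 * g for g in fade]],
    frequencies=[[100.0] * n, [200.0] * n],
    fs=fs,
)
```

The analysis stage works in the opposite direction: it estimates such amplitude and frequency tracks from a recorded signal, and the synthesis stage rebuilds the signal from them.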
Python 3.8 or later is required. It is also highly recommended to use the Spyder environment, as the whole code was tested in it. Before you run the code, make sure to install all requirements by executing:
pip install -r requirements.txt
- Preprocessing high pass filter option.
- SWIPEP pitch estimator is used for the f0 estimations, implemented in Python by Disha Garg: https://github.com/dishagarg/SWIPE
- The user may use custom pitch limits for the estimation.
- Full-band analysis.
- Either full waveform or only voiced parts are analyzed.
- Plots are viewed showing the signal in time and frequency domain before and after reconstruction.
- A basic loading screen may be displayed by setting loadingScreen=True as a parameter, which will enable a tqdm loading bar in the console.
- Prompts may be disabled for speed by setting printPrompts=False as a parameter.
A main.py file is provided, which executes eaQHMAnalysisAndSynthesis on a speech signal, whose name is given as an input on the console. The gender of the speaker may also be specified. A sample from a female speaker, named SA19.wav, is also included, but you can use any mono .wav file you want.
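If you are unsure whether your own recording is mono, a quick check along the following lines may help. This is a minimal sketch using Python's standard wave module, which handles uncompressed PCM files; the helper name is_mono_wav is an assumption and is not part of this repository.

```python
import wave


def is_mono_wav(path):
    """Return True if the WAV file at `path` has exactly one channel.

    Hypothetical helper, not part of the repository.
    """
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels() == 1
```

For example, is_mono_wav("SA19.wav") is expected to return True for the included sample, since it is described as a mono recording.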
What you have to do is:
- Open main.py.
- Run the code.
- A file dialog should open where you must select the .wav file of your choice.
- Specify the gender of the speaker ("male", "female" or other). You may also use "child" as an input.
- The program will print some prompts showing the Signal-to-Reconstruction-Error Ratio (SRER) [1] of each adaptation, and some plots will be generated.
- After the program terminates, a *filename*_reconstructed.wav file will be generated.
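For reference, the SRER printed for each adaptation compares the original signal with its reconstruction. The sketch below follows the definition commonly used in the quasi-harmonic modeling literature, 20·log10 of the ratio between the standard deviation of the signal and that of the reconstruction error. Whether the repository computes it in exactly this way is an assumption, and the function name srer_db is hypothetical.

```python
import math
import statistics


def srer_db(original, reconstructed):
    """SRER in dB: 20 * log10(std(x) / std(x - x_hat)).

    Hypothetical helper, not part of the repository.
    """
    residual = [x - y for x, y in zip(original, reconstructed)]
    residual_std = statistics.pstdev(residual)
    if residual_std == 0.0:
        # Perfect reconstruction: the ratio is unbounded.
        return math.inf
    return 20.0 * math.log10(statistics.pstdev(original) / residual_std)
```

A higher SRER means that the reconstruction is closer to the original, so the value would normally be expected to rise over successive adaptations.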
Here is an example of the output of the code running SA19.wav:
And here are the plots produced:
If you wish to use this code, please cite it using the following:
- Panagiotis Antivasis, eaQHM analysis and synthesis system, (2021), GitHub repository, https://github.com/Antibas/eaQHM-analysis-and-synthesis-in-Python
If you wish to cite the main paper on which this code has been developed, please use the following: