Whisper2Summarize is an application that uses Whisper for audio processing and GPT for summarization. It generates accurate summaries of audio transcripts, making it ideal for a variety of use cases such as note-taking, research, and content creation.
To get started with Google Colab, check out the Whisper2Summarize Notebook, which contains a modified version of the code that works in Google Colab.
Just add your API key, upload your audio file to the session storage, and select the Whisper model to use. (I don't suggest using `medium` or `large`, as they will be incredibly slow.)
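As a rough illustration, the setup in such a notebook boils down to something like the cell below. All names and defaults here are hypothetical, not the notebook's exact code:

```python
# Illustrative Colab setup cell -- the actual notebook may differ.
!pip install -q openai-whisper openai

API_KEY = "sk-..."                  # paste your OpenAI API key here (placeholder)
MODEL = "base"                      # tiny/base/small are the practical choices on Colab
AUDIO_PATH = "/content/audio.mp3"   # upload your file to session storage first
```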
To get started with this program right away, clone this repository and install the requirements:
```bash
git clone https://github.com/AndreDalwin/Whisper2Summarize.git
cd Whisper2Summarize
pip install -r requirements.txt
python w2sgui.py
```
I used Python 3.10.11 to build this application, but OpenAI's Whisper and GPT are expected to be compatible with Python 3.8-3.10. The code depends on a few Python packages, notably OpenAI's Whisper and GPT, their dependencies, a torch version that supports CUDA, and Rust. You can install all the requirements by cloning the repository and running:
```bash
pip install -r requirements.txt
```
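For reference, the packages that file pulls in correspond roughly to the steps below; an illustrative (not verbatim) requirements list might look like:

```text
# Illustrative only -- the requirements.txt in the repo is authoritative.
openai-whisper
openai
torch
python-dotenv
setuptools-rust
```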
If you have an NVIDIA GPU, follow this step; otherwise, skip it. You will want to install a version of torch built with CUDA support:
```bash
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
```
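To verify that the CUDA-enabled build actually sees your GPU, a quick check:

```python
# Quick sanity check for the CUDA-enabled torch install.
import torch

print(torch.cuda.is_available())  # True means Whisper can run on the GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce RTX 3060"
```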
Next, install OpenAI's Whisper and the OpenAI Python package:
```bash
pip install -U openai-whisper openai
```
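A quick way to confirm both packages installed correctly:

```bash
python -c "import whisper, openai; print('imports OK')"
```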
Additionally, it requires the command-line tool `ffmpeg` to be installed on your system, which is available from most package managers:
```bash
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
```
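Whichever route you take, you can confirm `ffmpeg` is on your PATH with:

```bash
ffmpeg -version
```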
NOTE: Installing Whisper may also require Rust, in case a pre-built wheel is not available for your platform:
```bash
pip install setuptools-rust
```
Lastly, you need to clone this repository:
```bash
git clone https://github.com/AndreDalwin/Whisper2Summarize.git
cd Whisper2Summarize
```
Ensure you create a `.env` file in the directory containing your OpenAI API key; you will need it to run this program.
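As a sketch, assuming the program reads the key from the conventional `OPENAI_API_KEY` variable (check the repository's code if in doubt), the file would look like:

```bash
# .env -- the variable name OPENAI_API_KEY is an assumption; verify against the code
OPENAI_API_KEY=sk-your-key-here
```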
The following command will transcribe audio files using Whisper's `medium` model:
```bash
python whisper2summarize.py audio.mp3 --model medium
```
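Under the hood, the transcription step corresponds roughly to the standard Whisper API (a sketch, not the script's exact code):

```python
# Rough equivalent of what the --model flag selects; not the script's exact code.
import whisper

model = whisper.load_model("medium")   # or tiny, small, base, large-v2
result = model.transcribe("audio.mp3")
print(result["text"])
```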
The default setting (which selects Whisper's `base` model) works well on CPU for transcribing English. I recommend using other models when trying out multilingual audio snippets.
Here is the full list of available Whisper models: `tiny`, `small`, `base`, `medium`, `large-v2`.
To see the requirements for running these different models, check out OpenAI's Whisper GitHub to learn more.
You may also start the GUI, which allows you to select the audio file, choose the model, and paste in your OpenAI API key:
```bash
python w2sgui.py
```
Running the program will output two files: `Transcript.txt`, the raw transcript of the audio recording, and `Summary.txt`, the summarized short form of the transcript.
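For a sense of what happens end to end, here is a minimal sketch of the transcribe-then-summarize flow. It assumes the `gpt-3.5-turbo` chat model, the `python-dotenv` package, and the post-1.0 `openai` client interface; the project's actual code may differ:

```python
# pipeline_sketch.py -- rough sketch only; the project's actual code may differ.
import os

import whisper
from dotenv import load_dotenv      # assumes the python-dotenv package
from openai import OpenAI           # assumes the openai>=1.0 client interface

load_dotenv()  # pick up OPENAI_API_KEY from the .env file created earlier
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# 1. Transcribe the audio with Whisper.
model = whisper.load_model("base")
transcript = model.transcribe("audio.mp3")["text"]

# 2. Summarize the transcript with GPT.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Summarize the following transcript concisely."},
        {"role": "user", "content": transcript},
    ],
)
summary = response.choices[0].message.content

# 3. Write the two output files described above.
with open("Transcript.txt", "w", encoding="utf-8") as f:
    f.write(transcript)
with open("Summary.txt", "w", encoding="utf-8") as f:
    f.write(summary)
```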
Whisper's model weights are released under the MIT License. See LICENSE for further details.
Feel free to fork this and experiment with it yourself. I actually made this for my girlfriend, since her class recordings are really long.
- Implement a "Translate" feature to translate transcripts to a different language
- Implement an option to change the OpenAI model (`gpt-3.5-turbo`, `text-davinci-003`, `gpt-4`)
- Surface errors in the GUI terminal when something goes wrong