A framework for AI WhatsApp calls using Whisper, Coqui TTS, GPT-3.5 Turbo, Virtual Audio Cable, and the WhatsApp Desktop App.
Demo.mp4
- Whisper (Speech to Text)
- OpenAI GPT-3.5 Turbo
- Coqui TTS
- Virtual Audio Cable
- WhatsApp Desktop App
Download the Visual Studio Installer (some of the Python dependencies need its C++ build tools on Windows).
Note: You need two separate virtual audio cables. I use VB-Audio Virtual Cable and Virtual Audio Cable (VAC); install both.
Download the WhatsApp Desktop App.
Clone the repository and install the dependencies:

```shell
git clone https://github.com/skshadan/WhisCall.git
cd WhisCall
pip install -r requirements.txt
```
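A quick way to confirm the install worked is to try importing each package. The module names below are assumed from the component list above, not taken from requirements.txt:

```python
# Sanity check: try importing each assumed dependency.
# Module names are guesses based on the components this project uses.
for mod in ("whisper", "TTS", "openai", "pyaudio"):
    try:
        __import__(mod)
        print(f"{mod}: ok")
    except ImportError as exc:
        print(f"{mod}: MISSING ({exc})")
```

Any line printed as MISSING points at a package that failed to install.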
Run the code below to find the device indices of your virtual audio cables for the microphone and speaker.
```python
import pyaudio

def list_audio_devices():
    """Return (index, name) pairs for every input and output device."""
    p = pyaudio.PyAudio()
    info = p.get_host_api_info_by_index(0)
    num_devices = info.get('deviceCount')

    # Lists of devices to return
    speakers = []
    microphones = []

    # Scan through devices and add each to the matching list
    for i in range(num_devices):
        device = p.get_device_info_by_index(i)
        if device.get('maxInputChannels') > 0:
            microphones.append((i, device.get('name')))
        if device.get('maxOutputChannels') > 0:
            speakers.append((i, device.get('name')))

    p.terminate()
    return microphones, speakers

microphones, speakers = list_audio_devices()

print("Microphones:")
for idx, name in microphones:
    print(f"Index: {idx}, Name: {name}")

print("\nSpeakers:")
for idx, name in speakers:
    print(f"Index: {idx}, Name: {name}")
```
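Instead of reading the index off the printed list by eye, you could pick a cable automatically by matching part of its name. This helper and the sample device names are illustrative, not part of the repo's API:

```python
def find_device_index(devices, name_fragment):
    """Return the index of the first device whose name contains name_fragment.

    devices is a list of (index, name) tuples, as returned by
    list_audio_devices(); returns None when nothing matches.
    """
    fragment = name_fragment.lower()
    for idx, name in devices:
        if fragment in name.lower():
            return idx
    return None

# Example with made-up device names as they might appear on Windows:
mics = [(0, "Microphone (Realtek Audio)"),
        (2, "CABLE Output (VB-Audio Virtual Cable)")]
print(find_device_index(mics, "CABLE Output"))  # -> 2
print(find_device_index(mics, "nonexistent"))   # -> None
```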
```python
from voice import select_microphone, transcribe_audio
from response import generate_response, text_to_speech, PlayAudio

def main():
    # Pick the virtual-cable microphone that carries the call audio
    mic_index = select_microphone()

    # Transcribe incoming speech, generate a reply, and speak it back
    for text in transcribe_audio(mic_index):
        if text:
            gpt_response = generate_response(text)
            text_to_speech(gpt_response)
            PlayAudio()

if __name__ == "__main__":
    main()
```
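The main() loop above wires the three stages together. The same listen → respond → speak flow can be exercised without audio hardware by injecting stand-ins for each stage; every name below is illustrative, not the repo's actual API:

```python
def run_pipeline(transcripts, respond, speak):
    """Drive the listen -> respond -> speak loop over an iterable of transcripts.

    transcripts stands in for transcribe_audio(), respond for
    generate_response(), and speak for text_to_speech() + PlayAudio().
    """
    spoken = []
    for text in transcripts:
        if text:  # skip empty transcriptions, as main() does
            reply = respond(text)
            speak(reply)
            spoken.append(reply)
    return spoken

# Exercise the loop with stubs instead of Whisper/GPT/Coqui:
out = run_pipeline(
    ["hello", "", "how are you"],
    respond=lambda t: f"echo: {t}",
    speak=lambda r: None,
)
print(out)  # -> ['echo: hello', 'echo: how are you']
```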
If you want different voices, you need to change the TTS model as follows:
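With Coqui TTS, switching voices comes down to loading a different model name. A minimal sketch — the model name shown is one of Coqui's stock English models, and how it plugs into response.py is assumed:

```python
from TTS.api import TTS

# Load a different stock Coqui model; run `tts --list_models` on the
# command line to see every available model name.
tts = TTS(model_name="tts_models/en/ljspeech/glow-tts")

# Synthesize speech to a wav file for playback.
tts.tts_to_file(text="Hello from a different voice.", file_path="output.wav")
```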
Download Models From Here:
Feel free to open an issue if you run into problems. Contributions are also welcome.