This project is a multimedia processing application that records video with background replacement, captures audio, and provides transcription and translation using machine learning models.
- Real-time video recording with background replacement.
- Audio recording and transcription using OpenAI's Whisper.
- Translation of transcribed text using Hugging Face's translation pipeline.
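
As a rough illustration of the background replacement feature, the sketch below composites a camera frame over a background image using a per-pixel person mask. It assumes the mask is a float array in [0, 1] with one value per pixel, which is the kind of mask MediaPipe's selfie segmentation produces. The function name `replace_background` and the 0.5 threshold are illustrative and are not taken from `main.py`.

```python
import numpy as np


def replace_background(frame, background, mask, threshold=0.5):
    """Keep frame pixels where the mask marks a person, else use the background.

    frame, background: uint8 arrays of shape (H, W, 3)
    mask: float array of shape (H, W), higher values meaning "person"
    """
    if frame.shape != background.shape:
        raise ValueError("frame and background must have the same shape")
    if mask.shape != frame.shape[:2]:
        raise ValueError("mask must match the frame height and width")
    # Boolean (H, W) map of person pixels, broadcast over the colour channels.
    person = mask > threshold
    return np.where(person[..., None], frame, background)
```

In a real capture loop the background image would first need to be resized to the frame size, for example with `cv2.resize(background, (width, height))`.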
- Clone the repository: `git clone https://github.com/bniladridas/video-recorder.git`, then `cd video-recorder`.
- Install dependencies: `pip install -r requirements.txt`
- Place your background image in the `backgrounds/` folder and update the path in `main.py` if necessary.
- Run the project: `python main.py`
- Press `q` to stop recording.
- The output video will be saved as `output.avi`.
- The transcription and translation will be saved in `transcription.txt`.
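
For orientation, here is a minimal sketch of how a transcription and translation could be produced and written to a text file. It is not the code in `main.py`. The audio path, the Whisper model size (`base`), the translation model (`Helsinki-NLP/opus-mt-en-fr`), and the file layout are all assumptions chosen for illustration.

```python
def transcribe_and_translate(audio_path,
                             whisper_model="base",
                             translation_model="Helsinki-NLP/opus-mt-en-fr"):
    """Transcribe an audio file with Whisper, then translate the text.

    The model names are assumptions; the project may use different ones.
    """
    # Imported lazily so the helpers below work without the heavy dependencies.
    import whisper
    from transformers import pipeline

    text = whisper.load_model(whisper_model).transcribe(audio_path)["text"].strip()
    translator = pipeline("translation", model=translation_model)
    translation = translator(text)[0]["translation_text"]
    return text, translation


def format_report(transcription, translation):
    """Build the text written to the output file (the layout is hypothetical)."""
    return f"Transcription:\n{transcription}\n\nTranslation:\n{translation}\n"


def save_report(path, transcription, translation):
    """Write the formatted report as UTF-8 text."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_report(transcription, translation))
```

A call such as `save_report("transcription.txt", *transcribe_and_translate("audio.wav"))` would tie the pieces together, where `audio.wav` is a placeholder name.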
graph TD;
A[video-recorder/] --> B[README.md]
A --> C[requirements.txt]
A --> D[main.py]
A --> E[backgrounds/]
E --> F[unsplash.jpg]
A --> G[.gitignore]
graph TD;
A[Start] --> B[Initialize VideoRecorder]
B --> C[Set Background Image]
C --> D[Start Recording]
D --> E[Start Audio Recording Thread]
E --> F[Capture Video Frame]
F --> G[Apply Background Replacement]
G --> H[Write Frame to Output]
H --> I{Press 'q' to Stop?}
I -->|No| F
I -->|Yes| J[Stop Recording]
J --> K[Join Audio Thread]
K --> L[Release Video Capture]
L --> M[Destroy All Windows]
M --> N[Generate Transcription]
N --> O[Save Transcription to File]
O --> P[End]
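
The flow above can be sketched in plain Python with stand-in callables in place of the camera, the segmentation step, the video writer, and the microphone. The name `record_session` and its parameters are hypothetical. The actual project wraps this logic in its `VideoRecorder` class, whose interface is not shown here.

```python
import threading


def record_session(read_frame, replace_background, write_frame,
                   stop_requested, record_audio):
    """Run the capture loop while audio is recorded on a separate thread.

    All five arguments are callables supplied by the caller (stand-ins here).
    Returns the number of frames written.
    """
    stop_event = threading.Event()
    audio_thread = threading.Thread(target=record_audio, args=(stop_event,))
    audio_thread.start()

    frames_written = 0
    try:
        while True:
            frame = read_frame()          # capture a video frame
            if frame is None:             # source ran out of frames or failed
                break
            write_frame(replace_background(frame))
            frames_written += 1
            if stop_requested():          # e.g. the user pressed 'q'
                break
    finally:
        # Signal the audio thread and wait for it, even if the loop raised.
        stop_event.set()
        audio_thread.join()
    return frames_written
```

Stopping the audio thread in a `finally` block is a design choice made for this sketch: it keeps the thread from outliving the loop if a frame read or write raises.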
- Hardware: A good CPU and GPU are recommended for smooth real-time processing.
- Frame Rate: The default frame rate is 20 FPS; adjust if necessary based on your hardware.
- Audio Quality: Audio recording quality might vary based on your microphone.
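
To judge whether your hardware keeps up with the 20 FPS default, you can time the capture loop and estimate the rate it actually achieves. The helper below is an illustrative sketch and not part of the project. It expects a list of timestamps in seconds, one per processed frame, such as values collected from `time.perf_counter()`.

```python
def estimate_fps(timestamps):
    """Estimate frames per second from per-frame timestamps in seconds."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    elapsed = timestamps[-1] - timestamps[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps span N - 1 frame intervals.
    return (len(timestamps) - 1) / elapsed
```

If the measured rate is well below 20, a video file written at a nominal 20 FPS will typically play back faster than real time, so lowering the configured frame rate to match is a reasonable adjustment.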
The script suppresses specific warnings, but it does not handle errors such as camera or microphone access failures; you need to handle those yourself.
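
One way to surface a camera access problem early is to check the capture object before entering the recording loop. The sketch below is an assumption about how you might do this and is not code from the project. It relies only on the `isOpened()` method that OpenCV's `cv2.VideoCapture` provides, and the helper name `require_open` is hypothetical.

```python
def require_open(capture, device_name="camera"):
    """Return the capture object, or raise if the device could not be opened.

    `capture` is any object with an isOpened() method,
    such as cv2.VideoCapture(0).
    """
    if not capture.isOpened():
        raise RuntimeError(
            f"Could not open {device_name}; check that it is connected "
            "and that this program has permission to use it."
        )
    return capture
```

With OpenCV installed, usage would look like `cap = require_open(cv2.VideoCapture(0))`.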
- Python 3.8+
- OpenCV
- MediaPipe
- PyAudio
- Whisper
- Transformers
- Torch
- NumPy
- Pillow (PIL)
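
Before running `main.py`, a quick check like the following can report which of the requirements above are missing from the current environment. It is an illustrative sketch: the mapping uses the usual import names for each package, and the helper names are hypothetical.

```python
import importlib.util
import sys

# Display name -> import name for each listed requirement.
REQUIRED_MODULES = {
    "OpenCV": "cv2",
    "MediaPipe": "mediapipe",
    "PyAudio": "pyaudio",
    "Whisper": "whisper",
    "Transformers": "transformers",
    "Torch": "torch",
    "NumPy": "numpy",
    "PIL": "PIL",
}


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter is Python 3.8 or newer."""
    return tuple(version_info[:2]) >= (3, 8)


def missing_modules(modules=REQUIRED_MODULES):
    """Return display names of packages whose import name cannot be found."""
    return [name for name, module in modules.items()
            if importlib.util.find_spec(module) is None]
```

Calling `missing_modules()` returns an empty list when every package is importable.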
MIT License