This project provides a user interface for object detection using YOLO models.
It supports various input methods including image upload, video upload, webcam, screen capture, and YouTube video processing.
- Support for multiple YOLO models (nano, small, medium, large, xlarge)
- Image object detection with confidence and IoU thresholds
- Video object detection with adjustable processing speed
- Real-time object detection from webcam or screen capture
- YouTube video processing (currently not working)
- Text description of detected objects
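
To illustrate how the confidence and IoU thresholds interact, here is a minimal sketch in plain Python. It is not this project's code: the `(box, score, label)` tuple layout, the function names, and the default values 0.25 and 0.45 are all assumptions chosen for illustration. Detections below the confidence threshold are dropped, and among same-label boxes that overlap by more than the IoU threshold only the highest-scoring one is kept (greedy non-maximum suppression). A small helper then produces a text description.

```python
from collections import Counter
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), hypothetical layout
Detection = Tuple[Box, float, str]        # (box, confidence score, class label)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    detections: List[Detection],
    conf_threshold: float = 0.25,   # illustrative default, not the project's
    iou_threshold: float = 0.45,    # illustrative default, not the project's
) -> List[Detection]:
    """Drop low-confidence boxes, then apply greedy per-class NMS."""
    candidates = sorted(
        (d for d in detections if d[1] >= conf_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept: List[Detection] = []
    for box, score, label in candidates:
        # Keep the box only if it does not overlap too much with an
        # already kept box of the same class.
        if all(lbl != label or iou(box, kbox) <= iou_threshold
               for kbox, _, lbl in kept):
            kept.append((box, score, label))
    return kept


def describe(detections: List[Detection]) -> str:
    """Build a short text summary of the detected classes."""
    counts = Counter(label for _, _, label in detections)
    if not counts:
        return "No objects detected."
    return "Detected: " + ", ".join(f"{n} x {label}" for label, n in counts.items())
```

Raising the confidence threshold removes more uncertain detections; lowering the IoU threshold merges overlapping boxes more aggressively.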
- Clone this repository and enter its directory:
  git clone https://github.com/rainbowkode/XYZ-Vision
  cd XYZ-Vision
- Install the required dependencies:
  pip install -r requirements.txt
  python install-dependencies.py
- Run the application:
  python app.py
- Open your web browser and go to the URL displayed in the terminal (usually http://localhost:7860).
- Use the interface to select the desired input method and adjust detection parameters.
- Click the corresponding button to start object detection.
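
If the page does not load, it can help to check whether anything is listening on the expected port. The following is a small sketch using only the Python standard library; the helper name `is_port_open` is hypothetical, and the default port 7860 is taken from the usual URL mentioned above, so adjust it if the terminal shows a different address.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 7860, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling `is_port_open()` while `python app.py` is running should return `True` if the server is bound to the default port.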
- A CUDA-capable GPU is recommended for faster processing but is optional; the application also runs on a CPU only.
- The application will download the selected model if it's not already present in the working directory.
- YouTube video processing (currently not working) requires a stable internet connection.
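
The notes above can be sketched as two small checks. This is an illustration under stated assumptions, not the application's actual logic: it assumes PyTorch is the backend (the source does not say so), and the helper names `pick_device` and `needs_download` as well as the weights filename passed in are hypothetical.

```python
from pathlib import Path


def pick_device() -> str:
    """Return "cuda" if a CUDA GPU is usable, otherwise fall back to "cpu".

    Assumes PyTorch as the backend; if it is not installed, "cpu" is returned.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def needs_download(weights_name: str, workdir: str = ".") -> bool:
    """Return True if the weights file is not yet present in the working directory."""
    return not (Path(workdir) / weights_name).is_file()
```

For example, `needs_download("model.pt")` returns `True` on a fresh checkout where no file of that name exists yet, which is when a download would be expected on first use.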
- Add object tracking and trajectory analysis
- Implement 3D coordinate estimation
- Add velocity and acceleration calculations for detected objects
- Develop a separate interface for model tuning and dataset selection
- Implement stop, pause, and play controls for video processing
Contributions are welcome! Please feel free to submit a Pull Request.
This project is open source and released for the benefit of the public under the terms of the Unlicense. The license may change at any time, so please check it in future versions of XYZ Vision.