Open source implementation of the paper "MM-Vid: Advancing Video Understanding with GPT-4V(ision)".

yongliang-wu/MM-VID

MM-Vid: Advancing Video Understanding with GPT-4V(ision)

This repository contains the open source implementation of the paper "MM-Vid: Advancing Video Understanding with GPT-4V(ision)".

Overview

This project aims to advance video understanding by leveraging the capabilities of GPT-4V(ision). The implementation follows the methodology and experiments described in the paper and provides a framework for scene detection, video clipping, speech recognition, and generation of coherent video descriptions.
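To make the data flow between these stages concrete, here is a minimal sketch of how scene cuts and speech-recognition output might be combined into per-clip text before a description is requested from a vision-language model. It is an illustration under stated assumptions, not the code in this repository: the names `Clip`, `split_into_clips`, `attach_transcript`, and `build_clip_prompt` are hypothetical, the scene-cut timestamps and transcript segments are assumed to come from whatever detector and speech recognizer are in use, and the prompt format is invented for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Clip:
    """A contiguous span of the video, in seconds (hypothetical structure)."""
    start: float
    end: float
    transcript: list = field(default_factory=list)


def split_into_clips(cut_times, duration):
    """Turn scene-cut timestamps into consecutive clips covering [0, duration]."""
    # Keep only unique cuts that fall strictly inside the video.
    inner = sorted({t for t in cut_times if 0.0 < t < duration})
    bounds = [0.0] + inner + [duration]
    return [Clip(start=a, end=b) for a, b in zip(bounds, bounds[1:])]


def attach_transcript(clips, asr_segments):
    """Assign each (start, end, text) speech segment to the clip holding its midpoint."""
    starts = [clip.start for clip in clips]
    for seg_start, seg_end, text in asr_segments:
        midpoint = (seg_start + seg_end) / 2
        # Index of the last clip whose start is <= midpoint.
        index = max(bisect_right(starts, midpoint) - 1, 0)
        clips[index].transcript.append(text)
    return clips


def build_clip_prompt(clip):
    """Format one clip as text context (assumed format, for illustration only)."""
    speech = " ".join(clip.transcript) if clip.transcript else "(no speech)"
    return f"Clip {clip.start:.1f}s-{clip.end:.1f}s | speech: {speech}"


# Example with made-up timestamps and transcript text.
clips = split_into_clips(cut_times=[4.0, 9.5], duration=12.0)
clips = attach_transcript(clips, [(0.5, 2.0, "Hello"), (3.5, 5.5, "welcome back")])
for clip in clips:
    print(build_clip_prompt(clip))
```

Assigning a speech segment by its midpoint keeps each utterance in exactly one clip even when it straddles a scene cut; splitting the text at the cut would be an alternative if word-level timestamps are available.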

Installation

To use this repository, clone it and install the required dependencies:

git clone https://github.com/yongliang-wu/MM-VID.git
cd MM-VID
pip install -r requirements.txt

Then run the main script:

python main.py

TODO

Supplying external information as input is not supported yet.
