[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Updated Oct 16, 2024 - Python
🎩 An Alfred 5 Workflow for using OpenAI Chat API to interact with GPT-4o 🤖💬 It also allows image generation 🖼️, image understanding 👀, speech-to-text conversion 🎤, and text-to-speech synthesis 🔈
WACV 2024 Papers: Discover cutting-edge research from WACV 2024, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ Support visual intelligence development!
A deep learning project to tell a story with an image or a video.
This GitHub repository shows how to integrate the OpenAI GPT-3 language model and the ChatGPT API into a Unity project. It can be a useful way to add natural language processing capabilities to your application.
Collection of open datasets in computer vision.
A large-scale curated dataset of Visual.ly infographics with metadata and additional crowdsourced annotations for research applications in computer vision and natural language processing.
Annuncio generates product advertisements from user inputs, utilizing Aria for descriptions, Allegro for promotional videos, and hashtags for social media discoverability.
2022-1 Image Understanding Assignments & Projects
🏷 This repository contains the lab sheets for the Image Understanding & Processing (SE4130) module in Year 4, Semester 1.