An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"
Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
A PyTorch implementation of state-of-the-art video captioning models from 2015-2019 on the MSVD and MSRVTT datasets.
[ACM MM 2017 & IEEE TMM 2020] This is the Theano code for the paper "Video Description with Spatial Temporal Attention"
Source code for "Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Strategy"
Source code for "Delving Deeper into the Decoder for Video Captioning"
Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*
Python implementation for extracting several visual feature representations from videos
Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*
[Pattern Recognition 2021] This is the Theano code for our paper "Enhancing the Alignment between Target Words and Corresponding Frames for Video Captioning".
Builds an attention-based encoder-decoder model for video captioning on the MSVD dataset
This project uses deep learning to generate contextually relevant captions for videos: it extracts spatial and temporal features and applies Gaussian attention to focus on important regions. Intended uses include video indexing, retrieval, and accessibility for visually impaired individuals.
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian (Bahasa Indonesia).
Video captioning with LSTM RNN and Transformer networks on MSVD and MSR-VTT using attributes and SVOS