Reading list for research topics in multimodal machine learning
Updated Aug 20, 2024
An open-source framework for training large multimodal models.
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
A curated list of Multimodal Related Research.
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
ICCV 2023 Papers: Discover cutting-edge research from ICCV 2023, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
A Comparative Framework for Multimodal Recommender Systems
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
Papers, code and datasets about deep learning and multi-modal learning for video analysis
[CVPR2023 Highlight] GRES: Generalized Referring Expression Segmentation
A collection of resources on applications of multi-modal learning in medical imaging.
Multimodal model for text and tabular data with HuggingFace transformers as building block for text data
Multi-modal learning toolkit based on PaddlePaddle and PyTorch, supporting multiple applications such as multi-modal classification, cross-modal retrieval, and image captioning.
Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
A curated list of awesome vision and language resources (still under construction... stay tuned!)
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
Multi-modality pre-training
Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!