Surveys.md

  • Summary of related single- and multi-modal pre-training surveys. SC and DC denote Single Column and Double Column, respectively; the number after each is the page count.

| Paper | Link | Year | Publication | Topic | Pages |
| --- | --- | --- | --- | --- | --- |
| [01] A short survey of pre-trained language models for conversational AI-A new age in NLP. | [Paper] | 2020 | ACSWM | NLP | DC, 4 |
| [02] A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models. | [Paper] | 2022 | arXiv | NLP | SC, 34 |
| [03] A Survey of Knowledge Enhanced Pre-trained Models. | [Paper] | 2021 | arXiv | KE | DC, 20 |
| [04] A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models. | [Paper] | 2022 | arXiv | KE | DC, 8 |
| [05] Commonsense Knowledge Reasoning and Generation with Pre-trained Language Models: A Survey. | [Paper] | 2022 | arXiv | KE | DC, 11 |
| [06] A survey on contextual embeddings. | [Paper] | 2020 | arXiv | NLP | DC, 13 |
| [07] Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. | [Paper] | 2021 | arXiv | NLP | SC, 46 |
| [08] Pre-trained Language Models in Biomedical Domain: A Systematic Survey. | [Paper] | 2021 | arXiv | NLP | SC, 46 |
| [09] Pre-trained models for natural language processing: A survey. | [Paper] | 2020 | SCTS | NLP | DC, 26 |
| [10] Pre-Trained Models: Past, Present and Future. | [Paper] | 2021 | AI Open | NLP, CV, MM | DC, 45 |
| [11] Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey. | [Paper] | 2021 | arXiv | NLP | DC, 49 |
| [12] A Survey of Vision-Language Pre-Trained Models. | [Paper] | 2022 | arXiv | MM | DC, 9 |
| [13] Survey: Transformer based video-language pre-training. | [Paper] | 2022 | AI Open | CV | DC, 13 |
| [14] Vision-Language Intelligence: Tasks, Representation Learning, and Large Models. | [Paper] | 2022 | arXiv | MM | DC, 19 |
| [15] A survey on vision transformer. | [Paper] | 2022 | TPAMI | CV | DC, 23 |
| [16] Transformers in vision: A survey. | [Paper] | 2021 | CSUR | CV | SC, 38 |
| [17] A Survey of Visual Transformers. | [Paper] | 2021 | arXiv | CV | DC, 21 |
| [18] Video Transformers: A Survey. | [Paper] | 2022 | arXiv | CV | DC, 24 |
| [19] Threats to Pre-trained Language Models: Survey and Taxonomy. | [Paper] | 2022 | arXiv | NLP | DC, 8 |
| [20] A survey on bias in deep NLP. | [Paper] | 2021 | AS | NLP | SC, 26 |
| [21] A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models. | [Paper] | 2022 | arXiv | NLP | SC, 34 |
| [22] An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-Trained Language Models. | [Paper] | 2021 | arXiv | NLP | DC, 21 |
| [23] A multi-layer bidirectional transformer encoder for pre-trained word embedding: A survey of BERT. | [Paper] | 2020 | CCDSE | NLP | DC, 5 |
| [24] Survey of Pre-trained Models for Natural Language Processing. | [Paper] | 2021 | ICEIB | NLP | DC, 4 |
| [25] A Roadmap for Big Model. | [Paper] | 2022 | arXiv | NLP, CV, MM | SC, 200 |
| [26] Vision-and-Language Pretrained Models: A Survey. | [Paper] | 2022 | IJCAI | MM | DC, 8 |
| [27] Multimodal Learning with Transformers: A Survey. | [Paper] | 2022 | arXiv | MM | DC, 23 |
| [28] MM-LLMs: Recent Advances in MultiModal Large Language Models. | [Paper] | 2024 | arXiv | MM | DC, 22 |

  • The (R)Evolution of Multimodal Large Language Models: A Survey, Davide Caffagni, Federico Cocchi, Luca Barsellotti, Nicholas Moratelli, Sara Sarto, Lorenzo Baraldi, Lorenzo Baraldi, Marcella Cornia, Rita Cucchiara [Paper]

  • Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models, Yixin Liu, Kai Zhang, Yuan Li, Zhiling Yan, Chujie Gao, Ruoxi Chen, Zhengqing Yuan, Yue Huang, Hanchi Sun, Jianfeng Gao, Lifang He, Lichao Sun [Paper]

  • On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models, Xinpeng Wang, Shitong Duan, Xiaoyuan Yi, Jing Yao, Shanlin Zhou, Zhihua Wei, Peng Zhang, Dongkuan Xu, Maosong Sun, Xing Xie [Paper]

  • [arXiv:2405.10739] Efficient Multimodal Large Language Models: A Survey, Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma [Paper] [Code]

  • The Evolution of Multimodal Model Architectures, Shakti N. Wadekar, Abhishek Chaurasia, Aman Chadha, Eugenio Culurciello [Paper]

  • [arXiv:2405.17247] An Introduction to Vision-Language Modeling, Florian Bordes et al. [Paper]

  • [arXiv:2406.09385] Towards Vision-Language Geo-Foundation Model: A Survey, Yue Zhou, Litong Feng, Yiping Ke, Xue Jiang, Junchi Yan, Xue Yang, Wayne Zhang [Paper] [Github]

  • [arXiv:2407.15017] Knowledge Mechanisms in Large Language Models: A Survey and Perspective, Mengru Wang, Yunzhi Yao, Ziwen Xu, Shuofei Qiao, Shumin Deng, Peng Wang, Xiang Chen, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang [Paper]

  • [arXiv:2408.01319] A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks, Jiaqi Wang, Hanqi Jiang, Yiheng Liu, Chong Ma, Xu Zhang, Yi Pan, Mengyuan Liu, Peiran Gu, Sichen Xia, Wenjun Li, Yutong Zhang, Zihao Wu, Zhengliang Liu, Tianyang Zhong, Bao Ge, Tuo Zhang, Ning Qiang, Xintao Hu, Xi Jiang, Xin Zhang, Wei Zhang, Dinggang Shen, Tianming Liu, Shu Zhang [Paper]