- AGI
- Activation Function
- Attention
- CNNs
- CV
- Dataset
- GANs
- Generative
- GraphNN
- Image
- LSTMs
- Large-Language-Models
- Loss Function
- NMT
- NN Initialization
- NNs
- Normalization
- Optimizers
- Other
- Pose Estimation
- Question-Answering
- Segmentation
- Siamese Network
- Tabular
- Text
- Training Method
- Transformers
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning | Pending | AGI, Dataset, Text | 2019 | AAAI | Maarten Sap, Noah A. Smith, Ronan Le Bras, Yejin Choi | link | ||
1 | COMET: Commonsense Transformers for Automatic Knowledge Graph Construction | Pending | AGI, Text , Transformers | 2019 | ACL | Antoine Bosselut, Hannah Rashkin, Yejin Choi | link | ||
2 | VisualCOMET: Reasoning about the Dynamic Context of a Still Image | Pending | AGI, Dataset, Image , Text , Transformers | 2020 | ECCV | Ali Farhadi, Chandra Bhagavatula, Jae Sung Park, Yejin Choi | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Self-Normalizing Neural Networks | Pending | Activation Function, Tabular | Optimizations, Tips & Tricks | 2017 | NIPS | Andreas Mayr, Günter Klambauer, Thomas Unterthiner | link | |
1 | A Comprehensive Guide on Activation Functions | This week | Activation Function | 2020 | Blog | Ygor Rebouças Serpa | link |
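
For quick reference next to the activation-function entries above, here is a minimal NumPy sketch of the SELU activation from "Self-Normalizing Neural Networks" (row 0). The two constants are the ones derived in the paper; the array shapes and the demo at the end are purely illustrative.

```python
import numpy as np

# Constants from the SELU derivation: with these values, zero mean and unit
# variance is a stable fixed point of the layer-to-layer activation statistics.
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit."""
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1.0))

# Toy check: standard-normal inputs keep roughly zero mean and unit variance.
x = np.random.default_rng(0).normal(size=100_000)
print(round(selu(x).mean(), 2), round(selu(x).std(), 2))
```
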
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Attention is All you Need | Read | Attention, Text , Transformers | Architecture | 2017 | NIPS | Ashish Vaswani, Illia Polosukhin, Noam Shazeer, Łukasz Kaiser | Introduces the Transformer architecture, which achieves state-of-the-art performance across a range of NLP tasks (see the attention sketch after this table). | link
1 | GPT-2 (Language Models are Unsupervised Multitask Learners) | Pending | Attention, Text , Transformers | 2019 | Alec Radford, Dario Amodei, Ilya Sutskever, Jeffrey Wu | link | |||
2 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Read | Attention, Text , Transformers | Embeddings | 2018 | NAACL | Jacob Devlin, Kenton Lee, Kristina Toutanova, Ming-Wei Chang | BERT extends the Transformer encoder with masked-word and next-sentence-prediction pretraining tasks, yielding a model that transfers to a wide variety of downstream tasks. | link
3 | SAGAN: Self-Attention Generative Adversarial Networks | Pending | Attention, GANs, Image | Architecture | 2018 | arXiv | Augustus Odena, Dimitris Metaxas, Han Zhang, Ian Goodfellow | link | |
4 | Single Headed Attention RNN: Stop Thinking With Your Head | Pending | Attention, LSTMs, Text | Optimization-No. of params | 2019 | arXiv | Stephen Merity | link | |
5 | Reformer: The Efficient Transformer | Read | Attention, Text , Transformers | Architecture, Optimization-Memory, Optimization-No. of params | 2020 | arXiv | Anselm Levskaya, Lukasz Kaiser, Nikita Kitaev | Reduces the time and memory complexity of Transformers by hashing queries and keys into buckets (LSH attention) and by using reversible residual connections. | link
6 | Language-Agnostic BERT Sentence Embedding | Read | Attention, Siamese Network, Text , Transformers | Embeddings | 2020 | arXiv | Fangxiaoyu Feng, Yinfei Yang | A BERT model that learns multilingual sentence embeddings over 112 languages and transfers zero-shot to unseen languages. | link
7 | T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | Read | Attention, Text , Transformers | 2020 | JMLR | Colin Raffel, Noam Shazeer, Peter J. Liu, Wei Liu, Yanqi Zhou | Presents a text-to-text Transformer trained with multi-task learning, simultaneously handling machine translation, document summarization, question answering, and classification. | link |
8 | GPT-f: Generative Language Modeling for Automated Theorem Proving | Pending | Attention, Transformers | 2020 | arXiv | Ilya Sutskever, Stanislas Polu | link | ||
9 | Vision Transformer: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Pending | Attention, Image , Transformers | 2021 | ICLR | Alexey Dosovitskiy, Jakob Uszkoreit, Lucas Beyer, Neil Houlsby | link |
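
As referenced in the "Attention is All you Need" row above, the core operation shared by the Transformer papers in this table is scaled dot-product attention. A minimal NumPy sketch follows; the shapes, names, and toy inputs are illustrative and not taken from any of the listed papers' code.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v) -> output of shape (n_q, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # query/key similarities, scaled by sqrt(d_k)
    weights = softmax(scores, axis=-1)         # one attention distribution per query
    return weights @ V                         # weighted sum of the values

# Toy usage with illustrative shapes.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 8))
print(scaled_dot_product_attention(Q, K, V).shape)   # (2, 8)
```
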
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | ZF Net (Visualizing and Understanding Convolutional Networks) | Read | CNNs, CV , Image | Visualization | 2014 | ECCV | Matthew D. Zeiler, Rob Fergus | Visualizes CNN filters/kernels by applying deconvolutions to their activations. | link
1 | Inception-v1 (Going Deeper With Convolutions) | Read | CNNs, CV , Image | Architecture | 2015 | CVPR | Christian Szegedy, Wei Liu | Proposes 1x1 convolutions to reduce the number of parameters in a deep, wide CNN. | link
2 | ResNet (Deep Residual Learning for Image Recognition) | Read | CNNs, CV , Image | Architecture | 2016 | CVPR | Kaiming He, Xiangyu Zhang | Introduces residual (skip) connections that allow much deeper networks to be trained (see the sketch after this table). | link
3 | MobileNet (Efficient Convolutional Neural Networks for Mobile Vision Applications) | Pending | CNNs, CV , Image | Architecture, Optimization-No. of params | 2017 | arXiv | Andrew G. Howard, Menglong Zhu | link | |
4 | Evaluation of neural network architectures for embedded systems | Read | CNNs, CV , Image | Comparison | 2017 | IEEE ISCAS | Adam Paszke, Alfredo Canziani, Eugenio Culurciello | Compare CNN classification architectures on accuracy, memory footprint, parameters, operations count, inference time and power consumption. | link |
5 | SqueezeNet | Read | CNNs, CV , Image | Architecture, Optimization-No. of params | 2016 | arXiv | Forrest N. Iandola, Song Han | Explores model compression with fire modules built primarily from 1x1 convolutions. | link
6 | Pruning Filters for Efficient ConvNets | Pending | CNNs, CV , Image | Optimization-No. of params | 2017 | arXiv | Asim Kadav, Hao Li | link | |
7 | Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet | Reading | CNNs, CV , Image | 2019 | arXiv | Matthias Bethge, Wieland Brendel | link | ||
8 | Breaking neural networks with adversarial attacks | Pending | CNNs, Image | Adversarial | 2019 | Blog | Anant Jain | link | |
9 | Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs | Pending | CNNs, Image | 2020 | arXiv | Ari S. Morcos, David J. Schwab, Jonathan Frankle | link | ||
10 | Arbitrary Style Transfer in Real-Time With Adaptive Instance Normalization | Pending | CNNs, Image | 2017 | ICCV | Serge Belongie, Xun Huang | link | ||
11 | Few-Shot Learning with Localization in Realistic Settings | Pending | CNNs, Image | Few-shot-learning | 2019 | CVPR | Bharath Hariharan, Davis Wertheimer | link | |
12 | Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition | Pending | CNNs, Image | Few-shot-learning | 2020 | CVPR | Bharath Hariharan, Davis Wertheimer, Luming Tang | link | |
13 | Occupancy Anticipation for Efficient Exploration and Navigation | Pending | CNNs, Image | Reinforcement-Learning | 2020 | ECCV | Kristen Grauman, Santhosh K. Ramakrishnan, Ziad Al-Halah | link | |
14 | VL-T5: Unifying Vision-and-Language Tasks via Text Generation | Read | CNNs, CV , Generative, Image , Large-Language-Models, Question-Answering, Text , Transformers | Architecture, Embeddings, Multimodal, Pre-Training | 2021 | arXiv | Hao Tan, Jaemin Cho, Jie Lei, Mohit Bansal | Unifies the image and text modalities in a single transformer that solves multiple tasks with one architecture, using text prefixes similar to T5. | link
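
The Inception-v1, ResNet, and SqueezeNet summaries above come down to two reusable building blocks: 1x1 "bottleneck" convolutions that cut parameters, and residual (skip) connections that make very deep networks trainable. Below is a hedged PyTorch sketch combining the two; the layer sizes and class name are illustrative, not any paper's reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BottleneckResidualBlock(nn.Module):
    """Illustrative block: 1x1 reduce -> 3x3 conv -> 1x1 expand, plus a skip connection."""
    def __init__(self, channels: int, bottleneck: int):
        super().__init__()
        self.reduce = nn.Conv2d(channels, bottleneck, kernel_size=1)             # 1x1: shrink channels
        self.conv = nn.Conv2d(bottleneck, bottleneck, kernel_size=3, padding=1)
        self.expand = nn.Conv2d(bottleneck, channels, kernel_size=1)             # 1x1: restore channels

    def forward(self, x):
        y = F.relu(self.reduce(x))
        y = F.relu(self.conv(y))
        y = self.expand(y)
        return F.relu(x + y)   # residual/skip connection: add the input back

x = torch.randn(1, 64, 32, 32)
block = BottleneckResidualBlock(channels=64, bottleneck=16)
print(block(x).shape)   # torch.Size([1, 64, 32, 32])
```
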
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | ZF Net (Visualizing and Understanding Convolutional Networks) | Read | CNNs, CV , Image | Visualization | 2014 | ECCV | Matthew D. Zeiler, Rob Fergus | Visualizes CNN filters/kernels by applying deconvolutions to their activations. | link
1 | Inception-v1 (Going Deeper With Convolutions) | Read | CNNs, CV , Image | Architecture | 2015 | CVPR | Christian Szegedy, Wei Liu | Proposes 1x1 convolutions to reduce the number of parameters in a deep, wide CNN. | link
2 | ResNet (Deep Residual Learning for Image Recognition) | Read | CNNs, CV , Image | Architecture | 2016 | CVPR | Kaiming He, Xiangyu Zhang | Introduces residual (skip) connections that allow much deeper networks to be trained. | link
3 | MobileNet (Efficient Convolutional Neural Networks for Mobile Vision Applications) | Pending | CNNs, CV , Image | Architecture, Optimization-No. of params | 2017 | arXiv | Andrew G. Howard, Menglong Zhu | link | |
4 | Evaluation of neural network architectures for embedded systems | Read | CNNs, CV , Image | Comparison | 2017 | IEEE ISCAS | Adam Paszke, Alfredo Canziani, Eugenio Culurciello | Compare CNN classification architectures on accuracy, memory footprint, parameters, operations count, inference time and power consumption. | link |
5 | SqueezeNet | Read | CNNs, CV , Image | Architecture, Optimization-No. of params | 2016 | arXiv | Forrest N. Iandola, Song Han | Explores model compression with fire modules built primarily from 1x1 convolutions. | link
6 | Pruning Filters for Efficient ConvNets | Pending | CNNs, CV , Image | Optimization-No. of params | 2017 | arXiv | Asim Kadav, Hao Li | link | |
7 | A 2019 guide to Human Pose Estimation with Deep Learning | Pending | CV , Pose Estimation | Comparison | 2019 | Blog | Sudharshan Chandra Babu | link | |
8 | A Simple yet Effective Baseline for 3D Human Pose Estimation | Pending | CV , Pose Estimation | 2017 | ICCV | James J. Little, Javier Romero, Julieta Martinez, Rayat Hossain | link | ||
9 | Bag of Tricks for Image Classification with Convolutional Neural Networks | Read | CV , Image | Optimizations, Tips & Tricks | 2018 | arXiv | Tong He, Zhi Zhang | Shows a dozen tricks (mixup, label smoothing, etc.) to improve CNN accuracy and training time. | link |
10 | Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet | Reading | CNNs, CV , Image | 2019 | arXiv | Matthias Bethge, Wieland Brendel | link | ||
11 | Capsule Networks: Dynamic Routing Between Capsules | Pending | CV , Image | Architecture | 2017 | arXiv | Geoffrey E Hinton, Nicholas Frosst, Sara Sabour | link | |
12 | Understanding Loss Functions in Computer Vision | Pending | CV , GANs, Image , Loss Function | Comparison, Tips & Tricks | 2020 | Blog | Sowmya Yellapragada | link | |
13 | VL-T5: Unifying Vision-and-Language Tasks via Text Generation | Read | CNNs, CV , Generative, Image , Large-Language-Models, Question-Answering, Text , Transformers | Architecture, Embeddings, Multimodal, Pre-Training | 2021 | arXiv | Hao Tan, Jaemin Cho, Jie Lei, Mohit Bansal | Unifies the image and text modalities in a single transformer that solves multiple tasks with one architecture, using text prefixes similar to T5. | link
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning | Pending | AGI, Dataset, Text | 2019 | AAAI | Maarten Sap, Noah A. Smith, Ronan Le Bras, Yejin Choi | link | ||
1 | VisualCOMET: Reasoning about the Dynamic Context of a Still Image | Pending | AGI, Dataset, Image , Text , Transformers | 2020 | ECCV | Ali Farhadi, Chandra Bhagavatula, Jae Sung Park, Yejin Choi | link | ||
2 | Symbolic Knowledge Distillation: from General Language Models to Commonsense Models | Pending | Dataset, Text , Transformers | Optimizations, Tips & Tricks | 2021 | arXiv | Chandra Bhagavatula, Jack Hessel, Peter West, Yejin Choi | link | |
3 | Large Language Models for Data Annotation: A Survey | This week | Dataset, Generative, Large-Language-Models | Prompting, Tips & Tricks | 2024 | arXiv | Alimohammad Beigi, Zhen Tan | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | SAGAN: Self-Attention Generative Adversarial Networks | Pending | Attention, GANs, Image | Architecture | 2018 | arXiv | Augustus Odena, Dimitris Metaxas, Han Zhang, Ian Goodfellow | link | |
1 | Pix2Pix: Image-to-Image Translation with Conditional Adversarial Nets | Read | GANs, Image | 2017 | CVPR | Alexei A. Efros, Jun-Yan Zhu, Phillip Isola, Tinghui Zhou | Image-to-image translation using conditional GANs trained on a dataset of paired images from two domains. | link |
2 | CycleGAN: Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks | Pending | GANs, Image | Architecture | 2017 | ICCV | Alexei A. Efros, Jun-Yan Zhu, Phillip Isola, Taesung Park | link | |
3 | Unsupervised Machine Translation Using Monolingual Corpora Only | Pending | GANs, NMT, Text , Transformers | Unsupervised | 2017 | arXiv | Alexis Conneau, Guillaume Lample, Ludovic Denoyer, Marc'Aurelio Ranzato, Myle Ott | link | |
4 | WGAN: Wasserstein GAN | Pending | GANs, Loss Function | 2017 | arXiv | Léon Bottou, Martin Arjovsky, Soumith Chintala | link | ||
5 | Spectral Normalization for GANs | Pending | GANs, Normalization | Optimizations | 2018 | arXiv | Masanori Koyama, Takeru Miyato, Toshiki Kataoka, Yuichi Yoshida | link | |
6 | Understanding Loss Functions in Computer Vision | Pending | CV , GANs, Image , Loss Function | Comparison, Tips & Tricks | 2020 | Blog | Sowmya Yellapragada | link | |
7 | StyleGAN: A Style-Based Generator Architecture for Generative Adversarial Networks | Pending | GANs, Image | 2019 | CVPR | Samuli Laine, Tero Karras, Timo Aila | link | ||
8 | Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? | Pending | GANs, Image | 2019 | ICCV | Peter Wonka, Rameen Abdal, Yipeng Qin | link | ||
9 | Improved Techniques for Training GANs | Pending | GANs, Image | Semi-Supervised | 2016 | NIPS | Alec Radford, Ian Goodfellow, Tim Salimans, Vicki Cheung, Wojciech Zaremba, Xi Chen | link | |
10 | AnimeGAN: Towards the Automatic Anime Characters Creation with Generative Adversarial Networks | Pending | GANs, Image | 2017 | NIPS | Jiakai Zhang, Minjun Li, Yanghua Jin | link | ||
11 | Progressive Growing of GANs for Improved Quality, Stability, and Variation | Pending | GANs, Image | Tips & Tricks | 2018 | ICLR | Jaakko Lehtinen, Samuli Laine, Tero Karras, Timo Aila | link | |
12 | BEGAN: Boundary Equilibrium Generative Adversarial Networks | Pending | GANs, Image | 2017 | arXiv | David Berthelot, Luke Metz, Thomas Schumm | link | ||
13 | StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation | Pending | GANs, Image | 2018 | CVPR | Jaegul Choo, Jung-Woo Ha, Minje Choi, Munyoung Kim, Sunghun Kim, Yunjey Choi | link | ||
14 | IMLE-GAN: Inclusive GAN: Improving Data and Minority Coverage in Generative Models | Pending | GANs | 2020 | arXiv | Jitendra Malik, Ke Li, Larry Davis, Mario Fritz, Ning Yu, Peng Zhou | link | ||
15 | TransGAN: Two Transformers Can Make One Strong GAN | Pending | GANs, Image , Transformers | Architecture | 2021 | arXiv | Shiyu Chang, Yifan Jiang, Zhangyang Wang | link |
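
Most of the GAN papers in this table modify the same two loss terms, so a compact reference helps when reading them side by side. This is a hedged PyTorch sketch of the standard non-saturating GAN objective; the toy generator, discriminator, and tensor sizes are placeholders rather than any listed paper's models.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))   # toy generator: noise -> sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))    # toy discriminator: sample -> logit
bce = nn.BCEWithLogitsLoss()

real = torch.randn(64, 2)            # stand-in for a real data batch
z = torch.randn(64, 16)              # latent noise
fake = G(z)

# Discriminator loss: push real samples toward label 1 and fakes toward 0
# (detach so this term does not update the generator).
d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))

# Non-saturating generator loss: make the discriminator label fakes as real.
g_loss = bce(D(fake), torch.ones(64, 1))

# In a real training loop, d_loss and g_loss are backpropagated alternately
# into D and G; WGAN, spectral normalization, etc. swap out or constrain these terms.
```
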
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Transforming Sequence Tagging Into A Seq2Seq Task | Pending | Generative, Text | Comparison, Tips & Tricks | 2022 | arXiv | Iftekhar Naim, Karthik Raman, Krishna Srinivasan | link | |
1 | Large Language Models are Zero-Shot Reasoners | Pending | Generative, Question-Answering, Text | Tips & Tricks, Zero-shot-learning | 2022 | arXiv | Takeshi Kojima, Yusuke Iwasawa | link | |
2 | Flan-T5: Scaling Instruction-Finetuned Language Models | Pending | Generative, Text , Transformers | Architecture, Pre-Training | 2022 | arXiv | Hyung Won Chung, Le Hou | link | |
3 | VL-T5: Unifying Vision-and-Language Tasks via Text Generation | Read | CNNs, CV , Generative, Image , Large-Language-Models, Question-Answering, Text , Transformers | Architecture, Embeddings, Multimodal, Pre-Training | 2021 | arXiv | Hao Tan, Jaemin Cho, Jie Lei, Mohit Bansal | Unifies the image and text modalities in a single transformer that solves multiple tasks with one architecture, using text prefixes similar to T5. | link
4 | Scaling Instruction-Finetuned Language Models (FLAN) | Pending | Generative, Large-Language-Models, Question-Answering, Text , Transformers | Instruction-Finetuning | 2022 | arXiv | Hyung Won Chung, Jason Wei, Jeffrey Dean, Le Hou, Quoc V. Le, Shayne Longpre | Introduces FLAN (Fine-tuned LAnguage Net), an instruction-finetuning method, and presents the results of its application. By fine-tuning the 540B PaLM model on 1836 tasks while incorporating chain-of-thought reasoning data, FLAN improves generalization, human usability, and zero-shot reasoning over the base model. The paper also details how each of these aspects was evaluated. | link
5 | ReAct: Synergizing Reasoning and Acting in Language Models | Pending | Generative, Large-Language-Models, Text | Optimizations, Tips & Tricks | 2023 | ICLR | Dian Yu, Izhak Shafran, Jeffrey Zhao, Karthik Narasimhan, Nan Du, Shunyu Yao, Yuan Cao | This paper introduces ReAct, a novel approach that leverages Large Language Models (LLMs) to interleave reasoning traces and task-specific actions. ReAct outperforms existing methods on various language and decision-making tasks, reducing hallucination and error propagation while improving human interpretability and trustworthiness. | link
6 | Training language models to follow instructions with human feedback | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning, Reinforcement-Learning, Semi-Supervised | 2022 | arXiv | Carroll L. Wainwright, Diogo Almeida, Jan Leike, Jeff Wu, Long Ouyang, Pamela Mishkin, Paul Christiano, Ryan Lowe, Xu Jiang | This paper presents InstructGPT, a model fine-tuned with human feedback to better align with user intent across various tasks. Despite having significantly fewer parameters than larger models, InstructGPT outperforms them in human evaluations, demonstrating improved truthfulness, reduced toxicity, and minimal performance regressions on public NLP datasets, highlighting the potential of fine-tuning with human feedback for enhancing language model alignment with human intent. | link |
7 | Constitutional AI: Harmlessness from AI Feedback | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning, Reinforcement-Learning, Unsupervised | 2022 | arXiv | Jared Kaplan, Yuntao Bai | The paper introduces Constitutional AI, a method for training a safe AI assistant without human-labeled data on harmful outputs. It combines supervised learning and reinforcement learning phases, enabling the AI to engage with harmful queries by explaining its objections, thus improving control, transparency, and human-judged performance with minimal human oversight. | link
8 | Self-Alignment with Instruction Backtranslation | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning | 2023 | arXiv | Jason Weston, Mike Lewis, Ping Yu, Xian Li | The paper introduces a scalable method called "instruction backtranslation" to create a high-quality instruction-following language model. This method involves self-augmentation and self-curation of training examples generated from web documents, resulting in a model that outperforms others in its category without relying on distillation data, showcasing its effective self-alignment capability. | link |
9 | Table-GPT: Table-tuned GPT for Diverse Table Tasks | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning | 2023 | arXiv | link | ||
10 | Large Language Models for Data Annotation: A Survey | This week | Dataset, Generative, Large-Language-Models | Prompting, Tips & Tricks | 2024 | arXiv | Alimohammad Beigi, Zhen Tan | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Graph Neural Network: Relational inductive biases, deep learning, and graph networks | Pending | GraphNN | Architecture | 2018 | arXiv | Jessica B. Hamrick, Oriol Vinyals, Peter W. Battaglia | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | ZF Net (Visualizing and Understanding Convolutional Networks) | Read | CNNs, CV , Image | Visualization | 2014 | ECCV | Matthew D. Zeiler, Rob Fergus | Visualizes CNN filters/kernels by applying deconvolutions to their activations. | link
1 | Inception-v1 (Going Deeper With Convolutions) | Read | CNNs, CV , Image | Architecture | 2015 | CVPR | Christian Szegedy, Wei Liu | Proposes 1x1 convolutions to reduce the number of parameters in a deep, wide CNN. | link
2 | ResNet (Deep Residual Learning for Image Recognition) | Read | CNNs, CV , Image | Architecture | 2016 | CVPR | Kaiming He, Xiangyu Zhang | Introduces residual (skip) connections that allow much deeper networks to be trained. | link
3 | MobileNet (Efficient Convolutional Neural Networks for Mobile Vision Applications) | Pending | CNNs, CV , Image | Architecture, Optimization-No. of params | 2017 | arXiv | Andrew G. Howard, Menglong Zhu | link | |
4 | Evaluation of neural network architectures for embedded systems | Read | CNNs, CV , Image | Comparison | 2017 | IEEE ISCAS | Adam Paszke, Alfredo Canziani, Eugenio Culurciello | Compare CNN classification architectures on accuracy, memory footprint, parameters, operations count, inference time and power consumption. | link |
5 | SqueezeNet | Read | CNNs, CV , Image | Architecture, Optimization-No. of params | 2016 | arXiv | Forrest N. Iandola, Song Han | Explores model compression with fire modules built primarily from 1x1 convolutions. | link
6 | Pruning Filters for Efficient ConvNets | Pending | CNNs, CV , Image | Optimization-No. of params | 2017 | arXiv | Asim Kadav, Hao Li | link | |
7 | SAGAN: Self-Attention Generative Adversarial Networks | Pending | Attention, GANs, Image | Architecture | 2018 | arXiv | Augustus Odena, Dimitris Metaxas, Han Zhang, Ian Goodfellow | link | |
8 | Bag of Tricks for Image Classification with Convolutional Neural Networks | Read | CV , Image | Optimizations, Tips & Tricks | 2018 | arXiv | Tong He, Zhi Zhang | Shows a dozen tricks (mixup, label smoothing, etc.) to improve CNN accuracy and training time. | link |
9 | Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet | Reading | CNNs, CV , Image | 2019 | arXiv | Matthias Bethge, Wieland Brendel | link | ||
10 | Breaking neural networks with adversarial attacks | Pending | CNNs, Image | Adversarial | 2019 | Blog | Anant Jain | link | |
11 | Pix2Pix: Image-to-Image Translation with Conditional Adversarial Nets | Read | GANs, Image | 2017 | CVPR | Alexei A. Efros, Jun-Yan Zhu, Phillip Isola, Tinghui Zhou | Image-to-image translation using conditional GANs trained on a dataset of paired images from two domains. | link |
12 | CycleGAN: Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks | Pending | GANs, Image | Architecture | 2017 | ICCV | Alexei A. Efros, Jun-Yan Zhu, Phillip Isola, Taesung Park | link | |
13 | Capsule Networks: Dynamic Routing Between Capsules | Pending | CV , Image | Architecture | 2017 | arXiv | Geoffrey E Hinton, Nicholas Frosst, Sara Sabour | link | |
14 | Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs | Pending | CNNs, Image | 2020 | arXiv | Ari S. Morcos, David J. Schwab, Jonathan Frankle | link | ||
15 | Arbitrary Style Transfer in Real-Time With Adaptive Instance Normalization | Pending | CNNs, Image | 2017 | ICCV | Serge Belongie, Xun Huang | link | ||
16 | One-shot Text Field Labeling using Attention and Belief Propagation for Structure Information Extraction | Pending | Image , Text | 2020 | arXiv | Jun Huang, Mengli Cheng, Minghui Qiu, Wei Lin, Xing Shi | link | ||
17 | Topological Loss: Beyond the Pixel-Wise Loss for Topology-Aware Delineation | Pending | Image , Loss Function, Segmentation | 2018 | CVPR | Agata Mosinska, Mateusz Koziński, Pablo Márquez-Neila, Pascal Fua | link | ||
18 | Understanding Loss Functions in Computer Vision | Pending | CV , GANs, Image , Loss Function | Comparison, Tips & Tricks | 2020 | Blog | Sowmya Yellapragada | link | |
19 | StyleGAN: A Style-Based Generator Architecture for Generative Adversarial Networks | Pending | GANs, Image | 2019 | CVPR | Samuli Laine, Tero Karras, Timo Aila | link | ||
20 | Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? | Pending | GANs, Image | 2019 | ICCV | Peter Wonka, Rameen Abdal, Yipeng Qin | link | ||
21 | Improved Techniques for Training GANs | Pending | GANs, Image | Semi-Supervised | 2016 | NIPS | Alec Radford, Ian Goodfellow, Tim Salimans, Vicki Cheung, Wojciech Zaremba, Xi Chen | link | |
22 | AnimeGAN: Towards the Automatic Anime Characters Creation with Generative Adversarial Networks | Pending | GANs, Image | 2017 | NIPS | Jiakai Zhang, Minjun Li, Yanghua Jin | link | ||
23 | Progressive Growing of GANs for Improved Quality, Stability, and Variation | Pending | GANs, Image | Tips & Tricks | 2018 | ICLR | Jaakko Lehtinen, Samuli Laine, Tero Karras, Timo Aila | link | |
24 | BEGAN: Boundary Equilibrium Generative Adversarial Networks | Pending | GANs, Image | 2017 | arXiv | David Berthelot, Luke Metz, Thomas Schumm | link | ||
25 | StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation | Pending | GANs, Image | 2018 | CVPR | Jaegul Choo, Jung-Woo Ha, Minje Choi, Munyoung Kim, Sunghun Kim, Yunjey Choi | link | ||
26 | Few-Shot Learning with Localization in Realistic Settings | Pending | CNNs, Image | Few-shot-learning | 2019 | CVPR | Bharath Hariharan, Davis Wertheimer | link | |
27 | Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition | Pending | CNNs, Image | Few-shot-learning | 2020 | CVPR | Bharath Hariharan, Davis Wertheimer, Luming Tang | link | |
28 | VisualCOMET: Reasoning about the Dynamic Context of a Still Image | Pending | AGI, Dataset, Image , Text , Transformers | 2020 | ECCV | Ali Farhadi, Chandra Bhagavatula, Jae Sung Park, Yejin Choi | link | ||
29 | Occupancy Anticipation for Efficient Exploration and Navigation | Pending | CNNs, Image | Reinforcement-Learning | 2020 | ECCV | Kristen Grauman, Santhosh K. Ramakrishnan, Ziad Al-Halah | link | |
30 | Vision Transformer: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Pending | Attention, Image , Transformers | 2021 | ICLR | Alexey Dosovitskiy, Jakob Uszkoreit, Lucas Beyer, Neil Houlsby | link | ||
31 | DALL·E: Creating Images from Text | Pending | Image , Text , Transformers | 2021 | Blog | Aditya Ramesh, Gabriel Goh, Ilya Sutskever, Mikhail Pavlov, Scott Gray | link | ||
32 | CLIP: Connecting Text and Images | Pending | Image , Text , Transformers | Multimodal, Pre-Training | 2021 | arXiv | Alec Radford, Ilya Sutskever, Jong Wook Kim | link | |
33 | Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision | This week | Image , Text , Transformers | Multimodal | 2020 | EMNLP | Hao Tan, Mohit Bansal | link | |
34 | TransGAN: Two Transformers Can Make One Strong GAN | Pending | GANs, Image , Transformers | Architecture | 2021 | arXiv | Shiyu Chang, Yifan Jiang, Zhangyang Wang | link | |
35 | VL-T5: Unifying Vision-and-Language Tasks via Text Generation | Read | CNNs, CV , Generative, Image , Large-Language-Models, Question-Answering, Text , Transformers | Architecture, Embeddings, Multimodal, Pre-Training | 2021 | arXiv | Hao Tan, Jaemin Cho, Jie Lei, Mohit Bansal | Unifies the image and text modalities in a single transformer that solves multiple tasks with one architecture, using text prefixes similar to T5. | link
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Single Headed Attention RNN: Stop Thinking With Your Head | Pending | Attention, LSTMs, Text | Optimization-No. of params | 2019 | arXiv | Stephen Merity | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Training Compute-Optimal Large Language Models | Pending | Large-Language-Models, Transformers | Architecture, Optimization-No. of params, Pre-Training, Tips & Tricks | 2022 | arXiv | Jordan Hoffmann, Laurent Sifre, Oriol Vinyals, Sebastian Borgeaud | link | |
1 | VL-T5: Unifying Vision-and-Language Tasks via Text Generation | Read | CNNs, CV , Generative, Image , Large-Language-Models, Question-Answering, Text , Transformers | Architecture, Embeddings, Multimodal, Pre-Training | 2021 | arXiv | Hao Tan, Jaemin Cho, Jie Lei, Mohit Bansal | Unifies the image and text modalities in a single transformer that solves multiple tasks with one architecture, using text prefixes similar to T5. | link
2 | Scaling Instruction-Finetuned Language Models (FLAN) | Pending | Generative, Large-Language-Models, Question-Answering, Text , Transformers | Instruction-Finetuning | 2022 | arXiv | Hyung Won Chung, Jason Wei, Jeffrey Dean, Le Hou, Quoc V. Le, Shayne Longpre | Introduces FLAN (Fine-tuned LAnguage Net), an instruction-finetuning method, and presents the results of its application. By fine-tuning the 540B PaLM model on 1836 tasks while incorporating chain-of-thought reasoning data, FLAN improves generalization, human usability, and zero-shot reasoning over the base model. The paper also details how each of these aspects was evaluated. | link
3 | ReAct: Synergizing Reasoning and Acting in Language Models | Pending | Generative, Large-Language-Models, Text | Optimizations, Tips & Tricks | 2023 | ICLR | Dian Yu, Izhak Shafran, Jeffrey Zhao, Karthik Narasimhan, Nan Du, Shunyu Yao, Yuan Cao | This paper introduces ReAct, a novel approach that leverages Large Language Models (LLMs) to interleave reasoning traces and task-specific actions. ReAct outperforms existing methods on various language and decision-making tasks, reducing hallucination and error propagation while improving human interpretability and trustworthiness. | link
4 | Training language models to follow instructions with human feedback | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning, Reinforcement-Learning, Semi-Supervised | 2022 | arXiv | Carroll L. Wainwright, Diogo Almeida, Jan Leike, Jeff Wu, Long Ouyang, Pamela Mishkin, Paul Christiano, Ryan Lowe, Xu Jiang | This paper presents InstructGPT, a model fine-tuned with human feedback to better align with user intent across various tasks. Despite having significantly fewer parameters than larger models, InstructGPT outperforms them in human evaluations, demonstrating improved truthfulness, reduced toxicity, and minimal performance regressions on public NLP datasets, highlighting the potential of fine-tuning with human feedback for enhancing language model alignment with human intent. | link |
5 | Constitutional AI: Harmlessness from AI Feedback | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning, Reinforcement-Learning, Unsupervised | 2022 | arXiv | Jared Kaplan, Yuntao Bai | The paper introduces Constitutional AI, a method for training a safe AI assistant without human-labeled data on harmful outputs. It combines supervised learning and reinforcement learning phases, enabling the AI to engage with harmful queries by explaining its objections, thus improving control, transparency, and human-judged performance with minimal human oversight. | link
6 | Self-Alignment with Instruction Backtranslation | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning | 2023 | arXiv | Jason Weston, Mike Lewis, Ping Yu, Xian Li | The paper introduces a scalable method called "instruction backtranslation" to create a high-quality instruction-following language model. This method involves self-augmentation and self-curation of training examples generated from web documents, resulting in a model that outperforms others in its category without relying on distillation data, showcasing its effective self-alignment capability. | link |
7 | Table-GPT: Table-tuned GPT for Diverse Table Tasks | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning | 2023 | arXiv | link | ||
8 | Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering | Pending | Large-Language-Models | Prompting, Tips & Tricks | 2024 | arXiv | Dedy Kredo, Itamar Friedman, Tal Ridnik | This paper introduces AlphaCodium, a novel test-based, multi-stage, code-oriented iterative approach for improving the performance of Large Language Models (LLMs) on code generation tasks. | link
9 | Large Language Models for Data Annotation: A Survey | This week | Dataset, Generative, Large-Language-Models | Prompting, Tips & Tricks | 2024 | arXiv | Alimohammad Beigi, Zhen Tan | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Class-Balanced Loss Based on Effective Number of Samples | Pending | Loss Function | Tips & Tricks | 2019 | CVPR | Menglin Jia, Yin Cui | link | |
1 | WGAN: Wasserstein GAN | Pending | GANs, Loss Function | 2017 | arXiv | Léon Bottou, Martin Arjovsky, Soumith Chintala | link | ||
2 | Perceptual Losses for Real-Time Style Transfer and Super-Resolution | Pending | Loss Function, NNs | 2016 | ECCV | Alexandre Alahi, Justin Johnson, Li Fei-Fei | link | ||
3 | Topological Loss: Beyond the Pixel-Wise Loss for Topology-Aware Delineation | Pending | Image , Loss Function, Segmentation | 2018 | CVPR | Agata Mosinska, Mateusz Koziński, Pablo Márquez-Neila, Pascal Fua | link | ||
4 | Understanding Loss Functions in Computer Vision | Pending | CV , GANs, Image , Loss Function | Comparison, Tips & Tricks | 2020 | Blog | Sowmya Yellapragada | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Phrase-Based & Neural Unsupervised Machine Translation | Pending | NMT, Text , Transformers | Unsupervised | 2018 | arXiv | Alexis Conneau, Guillaume Lample, Ludovic Denoyer, Marc'Aurelio Ranzato, Myle Ott | link | |
1 | Unsupervised Machine Translation Using Monolingual Corpora Only | Pending | GANs, NMT, Text , Transformers | Unsupervised | 2017 | arXiv | Alexis Conneau, Guillaume Lample, Ludovic Denoyer, Marc'Aurelio Ranzato, Myle Ott | link | |
2 | Cross-lingual Language Model Pretraining | Pending | NMT, Text , Transformers | Unsupervised | 2019 | arXiv | Alexis Conneau, Guillaume Lample | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks | Read | NN Initialization, NNs | Optimization-No. of params, Tips & Tricks | 2019 | ICLR | Jonathan Frankle, Michael Carbin | Lottery ticket hypothesis: dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations (see the pruning sketch after this table). | link
1 | All you need is a good init | Pending | NN Initialization | Tips & Tricks | 2015 | arXiv | Dmytro Mishkin, Jiri Matas | link | |
2 | Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask | Read | NN Initialization, NNs | Comparison, Optimization-No. of params, Tips & Tricks | 2019 | NeurIPS | Hattie Zhou, Janice Lan, Jason Yosinski, Rosanne Liu | Follow up on Lottery Ticket Hypothesis exploring the effects of different Masking criteria as well as Mask-1 and Mask-0 actions. | link |
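
As noted in the Lottery Ticket row above, the underlying procedure is iterative magnitude pruning with a rewind to the original initialization. A minimal NumPy sketch of that loop follows; the `train` step is a placeholder, and the pruning fraction and layer size are illustrative.

```python
import numpy as np

def prune_and_rewind(weights, init_weights, mask, prune_frac=0.2):
    """Drop the smallest-magnitude surviving weights, rewind the rest to their init values."""
    threshold = np.quantile(np.abs(weights[mask]), prune_frac)   # cutoff for the bottom fraction
    new_mask = mask & (np.abs(weights) > threshold)
    return init_weights * new_mask, new_mask                     # candidate "winning ticket"

rng = np.random.default_rng(0)
init = rng.normal(size=(256, 256))        # the original random initialization
weights, mask = init.copy(), np.ones_like(init, dtype=bool)

for round_id in range(3):
    # weights = train(weights, mask)      # placeholder: ordinary training with the mask applied
    weights, mask = prune_and_rewind(weights, init, mask)
    print(f"round {round_id}: {mask.mean():.1%} of weights remain")
```
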
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks | Read | NN Initialization, NNs | Optimization-No. of params, Tips & Tricks | 2019 | ICLR | Jonathan Frankle, Michael Carbin | Lottery ticket hypothesis: dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations. | link
1 | How Does Batch Normalization Help Optimization? | Pending | NNs, Normalization | Optimizations | 2018 | arXiv | Aleksander Madry, Andrew Ilyas, Dimitris Tsipras, Shibani Santurkar | link | |
2 | Group Normalization | Pending | NNs, Normalization | Optimizations | 2018 | arXiv | Kaiming He, Yuxin Wu | link | |
3 | Perceptual Losses for Real-Time Style Transfer and Super-Resolution | Pending | Loss Function, NNs | 2016 | ECCV | Alexandre Alahi, Justin Johnson, Li Fei-Fei | link | ||
4 | NADAM: Incorporating Nesterov Momentum into Adam | Pending | NNs, Optimizers | Comparison | 2016 | Timothy Dozat | link | ||
5 | Deep Double Descent: Where Bigger Models and More Data Hurt | Pending | NNs | 2019 | arXiv | Boaz Barak, Gal Kaplun, Ilya Sutskever, Preetum Nakkiran, Tristan Yang, Yamini Bansal | link | ||
6 | Adam: A Method for Stochastic Optimization | Pending | NNs, Optimizers | 2015 | ICLR | Diederik P. Kingma, Jimmy Ba | link | ||
7 | Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask | Read | NN Initialization, NNs | Comparison, Optimization-No. of params, Tips & Tricks | 2019 | NeurIPS | Hattie Zhou, Janice Lan, Jason Yosinski, Rosanne Liu | Follow up on Lottery Ticket Hypothesis exploring the effects of different Masking criteria as well as Mask-1 and Mask-0 actions. | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | How Does Batch Normalization Help Optimization? | Pending | NNs, Normalization | Optimizations | 2018 | arXiv | Aleksander Madry, Andrew Ilyas, Dimitris Tsipras, Shibani Santurkar | link | |
1 | Group Normalization | Pending | NNs, Normalization | Optimizations | 2018 | arXiv | Kaiming He, Yuxin Wu | link | |
2 | Spectral Normalization for GANs | Pending | GANs, Normalization | Optimizations | 2018 | arXiv | Masanori Koyama, Takeru Miyato, Toshiki Kataoka, Yuichi Yoshida | link |
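
The normalization papers above differ mainly in which axes the mean and variance are computed over. A hedged NumPy sketch of that difference for BatchNorm versus GroupNorm (the learned affine parameters are omitted, and the shapes are illustrative):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Statistics over the batch and spatial dims, separately for each channel."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def group_norm(x, groups=8, eps=1e-5):
    """Statistics over channel groups and spatial dims, separately for each sample."""
    N, C, H, W = x.shape
    g = x.reshape(N, groups, C // groups, H, W)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(N, C, H, W)

x = np.random.default_rng(0).normal(size=(4, 32, 8, 8))   # (N, C, H, W)
print(batch_norm(x).shape, group_norm(x).shape)
```
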
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | NADAM: Incorporating Nesterov Momentum into Adam | Pending | NNs, Optimizers | Comparison | 2016 | Timothy Dozat | link | ||
1 | Adam: A Method for Stochastic Optimization | Pending | NNs, Optimizers | 2015 | ICLR | Diederik P. Kingma, Jimmy Ba | link |
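
For the optimizer entries above, the essential Adam update is easiest to see in code; NAdam modifies it by applying a Nesterov-style lookahead to the first-moment term. This is a minimal NumPy sketch with an illustrative learning rate and toy objective, not a drop-in replacement for a library optimizer.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad          # first moment: running mean of gradients
    v = b2 * v + (1 - b2) * grad ** 2     # second moment: running mean of squared gradients
    m_hat = m / (1 - b1 ** t)             # bias corrections for the warm-up phase
    v_hat = v / (1 - b2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

target = np.array([1.0, -2.0, 0.5])
w, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
for t in range(1, 201):
    grad = 2 * (w - target)               # gradient of the toy objective ||w - target||^2
    w, m, v = adam_step(w, grad, m, v, t)
print(w)                                  # moves from the origin toward [1.0, -2.0, 0.5]
```
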
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | MuZero: Mastering Go, chess, shogi and Atari without rules | Pending | Other | Reinforcement-Learning | 2020 | Nature | David Silver, Demis Hassabis, Ioannis Antonoglou, Julian Schrittwieser | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | A 2019 guide to Human Pose Estimation with Deep Learning | Pending | CV , Pose Estimation | Comparison | 2019 | Blog | Sudharshan Chandra Babu | link | |
1 | A Simple yet Effective Baseline for 3D Human Pose Estimation | Pending | CV , Pose Estimation | 2017 | ICCV | James J. Little, Javier Romero, Julieta Martinez, Rayat Hossain | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | SpanBERT: Improving Pre-training by Representing and Predicting Spans | Read | Question-Answering, Text , Transformers | Pre-Training | 2020 | TACL | Danqi Chen, Mandar Joshi | A different pre-training strategy for BERT that improves performance on question-answering tasks. | link
1 | Learning to Extract Attribute Value from Product via Question Answering: A Multi-task Approach | Read | Question-Answering, Text , Transformers | Zero-shot-learning | 2020 | KDD | Li Yang, Qifan Wang | A question-answering BERT model used to extract attribute values from products; further introduces a no-answer loss and distillation to promote zero-shot learning. | link
2 | Chain of Thought Prompting Elicits Reasoning in Large Language Models | Pending | Question-Answering, Text , Transformers | 2022 | arXiv | Denny Zhou, Jason Wei, Xuezhi Wang | link | ||
3 | Large Language Models are Zero-Shot Reasoners | Pending | Generative, Question-Answering, Text | Tips & Tricks, Zero-shot-learning | 2022 | arXiv | Takeshi Kojima, Yusuke Iwasawa | link | |
4 | VL-T5: Unifying Vision-and-Language Tasks via Text Generation | Read | CNNs, CV , Generative, Image , Large-Language-Models, Question-Answering, Text , Transformers | Architecture, Embeddings, Multimodal, Pre-Training | 2021 | arXiv | Hao Tan, Jaemin Cho, Jie Lei, Mohit Bansal | Unifies the image and text modalities in a single transformer that solves multiple tasks with one architecture, using text prefixes similar to T5. | link
5 | Scaling Instruction-Finetuned Language Models (FLAN) | Pending | Generative, Large-Language-Models, Question-Answering, Text , Transformers | Instruction-Finetuning | 2022 | arXiv | Hyung Won Chung, Jason Wei, Jeffrey Dean, Le Hou, Quoc V. Le, Shayne Longpre | Introduces FLAN (Fine-tuned LAnguage Net), an instruction-finetuning method, and presents the results of its application. By fine-tuning the 540B PaLM model on 1836 tasks while incorporating chain-of-thought reasoning data, FLAN improves generalization, human usability, and zero-shot reasoning over the base model. The paper also details how each of these aspects was evaluated. | link
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Topological Loss: Beyond the Pixel-Wise Loss for Topology-Aware Delineation | Pending | Image , Loss Function, Segmentation | 2018 | CVPR | Agata Mosinska, Mateusz Koziński, Pablo Márquez-Neila, Pascal Fua | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Language-Agnostic BERT Sentence Embedding | Read | Attention, Siamese Network, Text , Transformers | Embeddings | 2020 | arXiv | Fangxiaoyu Feng, Yinfei Yang | A BERT model that learns multilingual sentence embeddings over 112 languages and transfers zero-shot to unseen languages. | link
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Self-Normalizing Neural Networks | Pending | Activation Function, Tabular | Optimizations, Tips & Tricks | 2017 | NIPS | Andreas Mayr, Günter Klambauer, Thomas Unterthiner | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Attention is All you Need | Read | Attention, Text , Transformers | Architecture | 2017 | NIPS | Ashish Vaswani, Illia Polosukhin, Noam Shazeer, Łukasz Kaiser | Introduces the Transformer architecture, which achieves state-of-the-art performance across a range of NLP tasks. | link
1 | GPT-2 (Language Models are Unsupervised Multitask Learners) | Pending | Attention, Text , Transformers | 2019 | Alec Radford, Dario Amodei, Ilya Sutskever, Jeffrey Wu | link | |||
2 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Read | Attention, Text , Transformers | Embeddings | 2018 | NAACL | Jacob Devlin, Kenton Lee, Kristina Toutanova, Ming-Wei Chang | BERT extends the Transformer encoder with masked-word and next-sentence-prediction pretraining tasks, yielding a model that transfers to a wide variety of downstream tasks. | link
3 | Single Headed Attention RNN: Stop Thinking With Your Head | Pending | Attention, LSTMs, Text | Optimization-No. of params | 2019 | arXiv | Stephen Merity | link | |
4 | Reformer: The Efficient Transformer | Read | Attention, Text , Transformers | Architecture, Optimization-Memory, Optimization-No. of params | 2020 | arXiv | Anselm Levskaya, Lukasz Kaiser, Nikita Kitaev | Reduces the time and memory complexity of Transformers by hashing queries and keys into buckets (LSH attention) and by using reversible residual connections. | link
5 | Language-Agnostic BERT Sentence Embedding | Read | Attention, Siamese Network, Text , Transformers | Embeddings | 2020 | arXiv | Fangxiaoyu Feng, Yinfei Yang | A BERT model that learns multilingual sentence embeddings over 112 languages and transfers zero-shot to unseen languages. | link
6 | Phrase-Based & Neural Unsupervised Machine Translation | Pending | NMT, Text , Transformers | Unsupervised | 2018 | arXiv | Alexis Conneau, Guillaume Lample, Ludovic Denoyer, Marc'Aurelio Ranzato, Myle Ott | link | |
7 | Unsupervised Machine Translation Using Monolingual Corpora Only | Pending | GANs, NMT, Text , Transformers | Unsupervised | 2017 | arXiv | Alexis Conneau, Guillaume Lample, Ludovic Denoyer, Marc'Aurelio Ranzato, Myle Ott | link | |
8 | Cross-lingual Language Model Pretraining | Pending | NMT, Text , Transformers | Unsupervised | 2019 | arXiv | Alexis Conneau, Guillaume Lample | link | |
9 | Word2Vec: Efficient Estimation of Word Representations in Vector Space | Pending | Text | Embeddings, Tips & Tricks | 2013 | arXiv | Greg Corrado, Jeffrey Dean, Kai Chen, Tomas Mikolov | link | |
10 | One-shot Text Field Labeling using Attention and Belief Propagation for Structure Information Extraction | Pending | Image , Text | 2020 | arXiv | Jun Huang, Mengli Cheng, Minghui Qiu, Wei Lin, Xing Shi | link | ||
11 | ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning | Pending | AGI, Dataset, Text | 2019 | AAAI | Maarten Sap, Noah A. Smith, Ronan Le Bras, Yejin Choi | link | ||
12 | COMET: Commonsense Transformers for Automatic Knowledge Graph Construction | Pending | AGI, Text , Transformers | 2019 | ACL | Antoine Bosselut, Hannah Rashkin, Yejin Choi | link | ||
13 | VisualCOMET: Reasoning about the Dynamic Context of a Still Image | Pending | AGI, Dataset, Image , Text , Transformers | 2020 | ECCV | Ali Farhadi, Chandra Bhagavatula, Jae Sung Park, Yejin Choi | link | ||
14 | T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | Read | Attention, Text , Transformers | 2020 | JMLR | Colin Raffel, Noam Shazeer, Peter J. Liu, Wei Liu, Yanqi Zhou | Presents a text-to-text Transformer trained with multi-task learning, simultaneously handling machine translation, document summarization, question answering, and classification. | link |
15 | DALL·E: Creating Images from Text | Pending | Image , Text , Transformers | 2021 | Blog | Aditya Ramesh, Gabriel Goh, Ilya Sutskever, Mikhail Pavlov, Scott Gray | link | ||
16 | CLIP: Connecting Text and Images | Pending | Image , Text , Transformers | Multimodal, Pre-Training | 2021 | arXiv | Alec Radford, Ilya Sutskever, Jong Wook Kim | link | |
17 | Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision | This week | Image , Text , Transformers | Multimodal | 2020 | EMNLP | Hao Tan, Mohit Bansal | link | |
18 | SpanBERT: Improving Pre-training by Representing and Predicting Spans | Read | Question-Answering, Text , Transformers | Pre-Training | 2020 | TACL | Danqi Chen, Mandar Joshi | A different pre-training strategy for BERT that improves performance on question-answering tasks. | link
19 | Learning to Extract Attribute Value from Product via Question Answering: A Multi-task Approach | Read | Question-Answering, Text , Transformers | Zero-shot-learning | 2020 | KDD | Li Yang, Qifan Wang | A question-answering BERT model used to extract attribute values from products; further introduces a no-answer loss and distillation to promote zero-shot learning. | link
20 | Interpreting Deep Learning Models in Natural Language Processing: A Review | Pending | Text | Comparison, Visualization | 2021 | arXiv | Diyi Yang, Xiaofei Sun | link | |
21 | Symbolic Knowledge Distillation: from General Language Models to Commonsense Models | Pending | Dataset, Text , Transformers | Optimizations, Tips & Tricks | 2021 | arXiv | Chandra Bhagavatula, Jack Hessel, Peter West, Yejin Choi | link | |
22 | Chain of Thought Prompting Elicits Reasoning in Large Language Models | Pending | Question-Answering, Text , Transformers | 2022 | arXiv | Denny Zhou, Jason Wei, Xuezhi Wang | link | ||
23 | Transforming Sequence Tagging Into A Seq2Seq Task | Pending | Generative, Text | Comparison, Tips & Tricks | 2022 | arXiv | Iftekhar Naim, Karthik Raman, Krishna Srinivasan | link | |
24 | Large Language Models are Zero-Shot Reasoners | Pending | Generative, Question-Answering, Text | Tips & Tricks, Zero-shot-learning | 2022 | arXiv | Takeshi Kojima, Yusuke Iwasawa | link | |
25 | Flan-T5: Scaling Instruction-Finetuned Language Models | Pending | Generative, Text , Transformers | Architecture, Pre-Training | 2022 | arXiv | Hyung Won Chung, Le Hou | link | |
26 | Decoding a Neural Retriever’s Latent Space for Query Suggestion | Pending | Text | Embeddings, Latent space | 2022 | arXiv | Christian Buck, Leonard Adolphs, Michelle Chen Huebscher | link | |
27 | VL-T5: Unifying Vision-and-Language Tasks via Text Generation | Read | CNNs, CV , Generative, Image , Large-Language-Models, Question-Answering, Text , Transformers | Architecture, Embeddings, Multimodal, Pre-Training | 2021 | arXiv | Hao Tan, Jaemin Cho, Jie Lei, Mohit Bansal | Unifies the image and text modalities in a single transformer that solves multiple tasks with one architecture, using text prefixes similar to T5. | link
28 | Scaling Instruction-Finetuned Language Models (FLAN) | Pending | Generative, Large-Language-Models, Question-Answering, Text , Transformers | Instruction-Finetuning | 2022 | arXiv | Hyung Won Chung, Jason Wei, Jeffrey Dean, Le Hou, Quoc V. Le, Shayne Longpre | Introduces FLAN (Fine-tuned LAnguage Net), an instruction-finetuning method, and presents the results of its application. By fine-tuning the 540B PaLM model on 1836 tasks while incorporating chain-of-thought reasoning data, FLAN improves generalization, human usability, and zero-shot reasoning over the base model. The paper also details how each of these aspects was evaluated. | link
29 | ReAct: Synergizing Reasoning and Acting in Language Models | Pending | Generative, Large-Language-Models, Text | Optimizations, Tips & Tricks | 2023 | ICLR | Dian Yu, Izhak Shafran, Jeffrey Zhao, Karthik Narasimhan, Nan Du, Shunyu Yao, Yuan Cao | This paper introduces ReAct, a novel approach that leverages Large Language Models (LLMs) to interleave reasoning traces and task-specific actions. ReAct outperforms existing methods on various language and decision-making tasks, reducing hallucination and error propagation while improving human interpretability and trustworthiness. | link
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Training language models to follow instructions with human feedback | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning, Reinforcement-Learning, Semi-Supervised | 2022 | arXiv | Carroll L. Wainwright, Diogo Almeida, Jan Leike, Jeff Wu, Long Ouyang, Pamela Mishkin, Paul Christiano, Ryan Lowe, Xu Jiang | This paper presents InstructGPT, a model fine-tuned with human feedback to better align with user intent across various tasks. Despite having significantly fewer parameters than larger models, InstructGPT outperforms them in human evaluations, demonstrating improved truthfulness, reduced toxicity, and minimal performance regressions on public NLP datasets, highlighting the potential of fine-tuning with human feedback for enhancing language model alignment with human intent. | link |
1 | Constitutional AI: Harmlessness from AI Feedback | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning, Reinforcement-Learning, Unsupervised | 2022 | arXiv | Jared Kaplan, Yuntao Bai | The paper introduces Constitutional AI, a method for training a safe AI assistant without human-labeled data on harmful outputs. It combines supervised learning and reinforcement learning phases, enabling the AI to engage with harmful queries by explaining its objections, thus improving control, transparency, and human-judged performance with minimal human oversight. | link
2 | Self-Alignment with Instruction Backtranslation | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning | 2023 | arXiv | Jason Weston, Mike Lewis, Ping Yu, Xian Li | The paper introduces a scalable method called "instruction backtranslation" to create a high-quality instruction-following language model. This method involves self-augmentation and self-curation of training examples generated from web documents, resulting in a model that outperforms others in its category without relying on distillation data, showcasing its effective self-alignment capability. | link |
3 | Table-GPT: Table-tuned GPT for Diverse Table Tasks | Pending | Generative, Large-Language-Models, Training Method | Instruction-Finetuning | 2023 | arXiv | link |
# | Paper Name | Status | Topic | Category | Year | Conference | Author | Summary | Link
---|---|---|---|---|---|---|---|---|---
0 | Attention is All you Need | Read | Attention, Text , Transformers | Architecture | 2017 | NIPS | Ashish Vaswani, Illia Polosukhin, Noam Shazeer, Łukasz Kaiser | Introduces the Transformer architecture, which achieves state-of-the-art performance across a range of NLP tasks. | link
1 | GPT-2 (Language Models are Unsupervised Multitask Learners) | Pending | Attention, Text , Transformers | 2019 | Alec Radford, Dario Amodei, Ilya Sutskever, Jeffrey Wu | link | |||
2 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Read | Attention, Text , Transformers | Embeddings | 2018 | NAACL | Jacob Devlin, Kenton Lee, Kristina Toutanova, Ming-Wei Chang | BERT extends the Transformer encoder with masked-word and next-sentence-prediction pretraining tasks, yielding a model that transfers to a wide variety of downstream tasks. | link
3 | Reformer: The Efficient Transformer | Read | Attention, Text , Transformers | Architecture, Optimization-Memory, Optimization-No. of params | 2020 | arXiv | Anselm Levskaya, Lukasz Kaiser, Nikita Kitaev | Reduces the time and memory complexity of Transformers by hashing queries and keys into buckets (LSH attention) and by using reversible residual connections. | link
4 | Language-Agnostic BERT Sentence Embedding | Read | Attention, Siamese Network, Text , Transformers | Embeddings | 2020 | arXiv | Fangxiaoyu Feng, Yinfei Yang | A BERT model that learns multilingual sentence embeddings over 112 languages and transfers zero-shot to unseen languages. | link
5 | Phrase-Based & Neural Unsupervised Machine Translation | Pending | NMT, Text , Transformers | Unsupervised | 2018 | arXiv | Alexis Conneau, Guillaume Lample, Ludovic Denoyer, Marc'Aurelio Ranzato, Myle Ott | link | |
6 | Unsupervised Machine Translation Using Monolingual Corpora Only | Pending | GANs, NMT, Text , Transformers | Unsupervised | 2017 | arXiv | Alexis Conneau, Guillaume Lample, Ludovic Denoyer, Marc'Aurelio Ranzato, Myle Ott | link | |
7 | Cross-lingual Language Model Pretraining | Pending | NMT, Text , Transformers | Unsupervised | 2019 | arXiv | Alexis Conneau, Guillaume Lample | link | |
8 | COMET: Commonsense Transformers for Automatic Knowledge Graph Construction | Pending | AGI, Text , Transformers | 2019 | ACL | Antoine Bosselut, Hannah Rashkin, Yejin Choi | link | ||
9 | VisualCOMET: Reasoning about the Dynamic Context of a Still Image | Pending | AGI, Dataset, Image , Text , Transformers | 2020 | ECCV | Ali Farhadi, Chandra Bhagavatula, Jae Sung Park, Yejin Choi | link | ||
10 | T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | Read | Attention, Text , Transformers | 2020 | JMLR | Colin Raffel, Noam Shazeer, Peter J. Liu, Wei Liu, Yanqi Zhou | Presents a text-to-text Transformer trained with multi-task learning, simultaneously handling machine translation, document summarization, question answering, and classification. | link |
11 | GPT-f: Generative Language Modeling for Automated Theorem Proving | Pending | Attention, Transformers | 2020 | arXiv | Ilya Sutskever, Stanislas Polu | link | ||
12 | Vision Transformer: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Pending | Attention, Image , Transformers | 2021 | ICLR | Alexey Dosovitskiy, Jakob Uszkoreit, Lucas Beyer, Neil Houlsby | link | ||
13 | DALL·E: Creating Images from Text | Pending | Image , Text , Transformers | 2021 | Blog | Aditya Ramesh, Gabriel Goh, Ilya Sutskever, Mikhail Pavlov, Scott Gray | link | ||
14 | CLIP: Connecting Text and Images | Pending | Image , Text , Transformers | Multimodal, Pre-Training | 2021 | arXiv | Alec Radford, Ilya Sutskever, Jong Wook Kim | link | |
15 | Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision | This week | Image , Text , Transformers | Multimodal | 2020 | EMNLP | Hao Tan, Mohit Bansal | link | |
16 | SpanBERT: Improving Pre-training by Representing and Predicting Spans | Read | Question-Answering, Text , Transformers | Pre-Training | 2020 | TACL | Danqi Chen, Mandar Joshi | A different pre-training strategy for BERT that improves performance on question-answering tasks. | link
17 | Learning to Extract Attribute Value from Product via Question Answering: A Multi-task Approach | Read | Question-Answering, Text , Transformers | Zero-shot-learning | 2020 | KDD | Li Yang, Qifan Wang | A question-answering BERT model used to extract attribute values from products; further introduces a no-answer loss and distillation to promote zero-shot learning. | link
18 | TransGAN: Two Transformers Can Make One Strong GAN | Pending | GANs, Image , Transformers | Architecture | 2021 | arXiv | Shiyu Chang, Yifan Jiang, Zhangyang Wang | link | |
19 | Symbolic Knowledge Distillation: from General Language Models to Commonsense Models | Pending | Dataset, Text , Transformers | Optimizations, Tips & Tricks | 2021 | arXiv | Chandra Bhagavatula, Jack Hessel, Peter West, Yejin Choi | link | |
20 | Chain of Thought Prompting Elicits Reasoning in Large Language Models | Pending | Question-Answering, Text , Transformers | 2022 | arXiv | Denny Zhou, Jason Wei, Xuezhi Wang | link | ||
21 | Flan-T5: Scaling Instruction-Finetuned Language Models | Pending | Generative, Text , Transformers | Architecture, Pre-Training | 2022 | arXiv | Hyung Won Chung, Le Hou | link | |
22 | Training Compute-Optimal Large Language Models | Pending | Large-Language-Models, Transformers | Architecture, Optimization-No. of params, Pre-Training, Tips & Tricks | 2022 | arXiv | Jordan Hoffmann, Laurent Sifre, Oriol Vinyals, Sebastian Borgeaud | link | |
23 | VL-T5: Unifying Vision-and-Language Tasks via Text Generation | Read | CNNs, CV , Generative, Image , Large-Language-Models, Question-Answering, Text , Transformers | Architecture, Embeddings, Multimodal, Pre-Training | 2021 | arXiv | Hao Tan, Jaemin Cho, Jie Lei, Mohit Bansal | Unifies the image and text modalities in a single transformer that solves multiple tasks with one architecture, using text prefixes similar to T5. | link
24 | Scaling Instruction-Finetuned Language Models (FLAN) | Pending | Generative, Large-Language-Models, Question-Answering, Text , Transformers | Instruction-Finetuning | 2022 | arXiv | Hyung Won Chung, Jason Wei, Jeffrey Dean, Le Hou, Quoc V. Le, Shayne Longpre | Introduces FLAN (Fine-tuned LAnguage Net), an instruction-finetuning method, and presents the results of its application. By fine-tuning the 540B PaLM model on 1836 tasks while incorporating chain-of-thought reasoning data, FLAN improves generalization, human usability, and zero-shot reasoning over the base model. The paper also details how each of these aspects was evaluated. | link