Status: Pending
Author: Hyung Won Chung, Jason Wei, Jeffrey Dean, Le Hou, Quoc V. Le, Shayne Longpre, et al.
Topic: Generative, Large-Language-Models, Question-Answering, Text, Transformers
Category: Instruction-Finetuning
Conference: arXiv
Year: 2022
Link: https://arxiv.org/abs/2210.11416
Summary: This paper studies instruction finetuning (Flan, for Finetuned LAnguage Net) at scale and presents the results of its application. By finetuning the 540B PaLM model on 1,836 tasks and mixing in chain-of-thought data, the resulting Flan-PaLM improves generalization to unseen tasks, human usability, and zero-shot reasoning over the base model. The paper also details how each of these aspects was evaluated.
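
The paper publicly releases instruction-finetuned Flan-T5 checkpoints. Below is a minimal sketch, not from the paper, of querying one of them zero-shot with the Hugging Face `transformers` library; the checkpoint name `google/flan-t5-base` and the example question are illustrative choices, and the "Let's think step by step" trigger phrase follows the paper's zero-shot chain-of-thought evaluation setup:

```python
# Minimal sketch: zero-shot use of a Flan-T5 checkpoint released with the paper.
# Assumes the Hugging Face `transformers` library is installed; the example
# question is illustrative, not taken from the paper's benchmarks.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-base"  # larger sizes (large, xl, xxl) also exist
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Zero-shot chain-of-thought style prompt: the trailing "Let's think step by
# step." is the trigger phrase used in the paper's zero-shot CoT evaluation.
prompt = (
    "Q: A juggler has 16 balls. Half of the balls are golf balls, and half of "
    "the golf balls are blue. How many blue golf balls are there? "
    "A: Let's think step by step."
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the instruction-finetuned checkpoint has seen tasks phrased as natural-language instructions, no task-specific prompt engineering or few-shot exemplars are required for this kind of query, which is the usability gain the paper reports over the base model.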