Original Flan (2021) | The Flan Collection (2022) | Flan 2021 Citation | License
This repository contains code to generate instruction-tuning dataset collections. The first is the original Flan 2021, documented in Finetuned Language Models are Zero-Shot Learners. The second is the expanded version, the Flan Collection, described in The Flan Collection: Designing Data and Methods for Effective Instruction Tuning and used to produce Flan-T5 and Flan-PaLM.
To generate the Flan 2021 data as SeqIO mixtures, first install the dependencies listed in the relevant requirements.txt, then use mixtures.py.
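As a rough illustration of what "use mixtures.py" can look like in practice, here is a minimal sketch that reads a few examples from a registered SeqIO mixture. It assumes that importing the mixtures module registers the mixtures as a side effect. The module path `flan.mixtures`, the helper name `load_mixture_examples`, and the sequence lengths are assumptions made for illustration, not values taken from this repository; check mixtures.py for the actual mixture names.

```python
import importlib

# Assumed feature lengths for illustration only; adjust to your model.
DEFAULT_SEQUENCE_LENGTH = {"inputs": 512, "targets": 128}


def load_mixture_examples(mixture_name,
                          registry_module="flan.mixtures",  # assumed module path
                          split="train",
                          num_examples=3,
                          sequence_length=None):
    """Return the first few examples of a registered SeqIO mixture.

    `mixture_name` must be a name that mixtures.py actually registers.
    """
    # Imported lazily so this file can be loaded without SeqIO installed.
    import seqio

    # Importing the module is assumed to register tasks and mixtures.
    importlib.import_module(registry_module)

    mixture = seqio.get_mixture_or_task(mixture_name)
    dataset = mixture.get_dataset(
        sequence_length=sequence_length or DEFAULT_SEQUENCE_LENGTH,
        split=split,
        shuffle=False,
    )
    # The result is a tf.data.Dataset; materialize a handful of examples.
    return list(dataset.take(num_examples).as_numpy_iterator())
```

Calling `load_mixture_examples("<mixture name from mixtures.py>")` should return a list of feature dictionaries, provided the requirements are installed and the underlying datasets are available locally or can be downloaded.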
Please cite the following if you found Flan 2021 useful in your research.
@inproceedings{weifinetuned,
title={Finetuned Language Models are Zero-Shot Learners},
author={Wei, Jason and Bosma, Maarten and Zhao, Vincent and Guu, Kelvin and Yu, Adams Wei and Lester, Brian and Du, Nan and Dai, Andrew M and Le, Quoc V},
booktitle={International Conference on Learning Representations}
}
The code in this repository is licensed according to the LICENSE file.
To contact us, feel free to open an issue in this repository or email the respective authors who contributed to this code base: Jason Wei for the Flan 2021 paper, Le Hou for the Scaling Flan paper, and Shayne Longpre for the Flan Collection.