Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)

Example usage

First download and preprocess the data following the main language modeling README.

Then to train a convolutional LM using the fconv_lm_dauphin_wikitext103 architecture:

fairseq-train --task language_modeling \
    data-bin/wikitext-103 \
    --save-dir checkpoints/fconv_wikitext-103 \
    --arch fconv_lm_dauphin_wikitext103 \
    --max-epoch 35 \
    --optimizer nag \
    --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 \
    --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion adaptive_loss \
    --adaptive-softmax-cutoff 10000,20000,200000 --max-tokens 1024 --tokens-per-sample 1024 \
    --ddp-backend=no_c10d

And evaluate with:

fairseq-eval-lm data-bin/wikitext-103 --path checkpoints/fconv_wikitext-103/checkpoint_best.pt
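
Evaluation needs the checkpoint written to the directory passed to --save-dir during training. The following is a minimal sketch, not part of fairseq itself, that keeps that directory in one place and checks for the checkpoint before evaluating. The variable names SAVE_DIR and CHECKPOINT are hypothetical; the paths are taken from the training command above.

```shell
# Hypothetical helper variables; only the paths come from the training command.
SAVE_DIR=checkpoints/fconv_wikitext-103
CHECKPOINT="$SAVE_DIR/checkpoint_best.pt"

# Evaluate only if training has already produced the best checkpoint.
if [ -f "$CHECKPOINT" ]; then
    fairseq-eval-lm data-bin/wikitext-103 --path "$CHECKPOINT"
else
    echo "No checkpoint found at $CHECKPOINT; run training first." >&2
fi
```

Reusing "$SAVE_DIR" for --save-dir in the training command as well would keep the two paths from drifting apart.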

Citation

@inproceedings{dauphin2017language,
  title={Language Modeling with Gated Convolutional Networks},
  author={Dauphin, Yann N and Fan, Angela and Auli, Michael and Grangier, David},
  booktitle={Proceedings of the 34th International Conference on Machine Learning-Volume 70},
  pages={933--941},
  year={2017},
  organization={JMLR}
}
