Augmenting character-level Transformers with causal dilated Conv1D layers
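For readers unfamiliar with the term, the sketch below illustrates what a causal dilated 1-D convolution computes. It is a plain-Python illustration, not the code in charformer_model.py: the function names causal_dilated_conv1d and receptive_field are hypothetical, and a real layer would use a deep learning framework with learned, multi-channel kernels.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Single-channel causal dilated convolution over a list of numbers.

    Output position t only sees x[t], x[t - dilation], x[t - 2*dilation], ...
    so no information flows from the future. Positions before the start of
    the sequence are treated as zeros (implicit left padding).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - j * dilation  # kernel[j] looks j*dilation steps back
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input positions visible to one output of a stacked network."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    seq = [1, 2, 3, 4, 5]
    # Each output is x[t] + x[t - 2].
    print(causal_dilated_conv1d(seq, kernel=[1, 1], dilation=2))
    # Stacking dilations 1, 2, 4 with kernel size 2 covers 8 time steps.
    print(receptive_field(2, [1, 2, 4]))
```

Doubling the dilation from layer to layer makes the receptive field grow exponentially with depth, while the causal indexing keeps the layer usable for next-character prediction.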
Before running anything, download a parquet file and either rename it to 2013.parquet or change the file name in make_vocab.py and charformer_main.py.
make_vocab.py - create the vocabulary (run this first)
charformer_main.py - train the model
run_charformer.py - load the trained model and run a demo
charformer_model.py - model definition and utilities
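As a rough picture of the vocabulary step, the sketch below builds a character-level vocabulary from a list of strings. It is an assumption about what make_vocab.py does, not a copy of it: the names build_char_vocab, encode, and decode are hypothetical, and reading the text out of the parquet file is left out because the column layout is not described here.

```python
def build_char_vocab(texts):
    """Map every distinct character in texts to an integer id and back."""
    chars = sorted(set("".join(texts)))
    char_to_id = {ch: i for i, ch in enumerate(chars)}
    id_to_char = {i: ch for ch, i in char_to_id.items()}
    return char_to_id, id_to_char


def encode(text, char_to_id):
    """Convert a string to a list of ids (raises KeyError on unseen characters)."""
    return [char_to_id[ch] for ch in text]


def decode(ids, id_to_char):
    """Convert a list of ids back to a string."""
    return "".join(id_to_char[i] for i in ids)


if __name__ == "__main__":
    char_to_id, id_to_char = build_char_vocab(["hello", "world"])
    ids = encode("hold", char_to_id)
    print(ids, decode(ids, id_to_char))
```

A real vocabulary would usually also reserve ids for padding and unknown characters.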
Note: this project is unrelated to earlier architectures published under the same name.