This repository contains my sample code for natural-language generation models aimed at machine translation, usually from Chinese to English. This is an interesting topic given the debate between MLE and GAN training for NLP in Caccia+18.
I use the Transformer of Vaswani+17 as my model, trained with a maximum likelihood estimation (MLE) objective.
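As a rough illustration, here is a minimal PyTorch sketch of one MLE (teacher-forcing) training step. The `nn.Transformer` module, the sizes, and the `PAD` id are assumptions for the sketch, not necessarily what the notebooks use.

```python
import torch
import torch.nn as nn

VOCAB, D_MODEL, PAD = 8000, 512, 0          # hypothetical sizes

embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
model = nn.Transformer(d_model=D_MODEL, batch_first=True)
proj = nn.Linear(D_MODEL, VOCAB)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

def mle_step(src, tgt):
    """One MLE step: minimize token-level cross-entropy of the
    reference translation under teacher forcing."""
    tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]            # shift for next-token prediction
    mask = model.generate_square_subsequent_mask(tgt_in.size(1))
    hidden = model(embed(src), embed(tgt_in), tgt_mask=mask)
    logits = proj(hidden)                                # (batch, len, vocab)
    return loss_fn(logits.reshape(-1, VOCAB), tgt_out.reshape(-1))
```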
Before training starts, every sentence is encoded as a sequence of one-hot vectors, truncated or padded to a fixed maximum length.
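For example, an encoder along these lines (the `vocab` dict and the reserved `pad`/`unk` ids are hypothetical) could be:

```python
import numpy as np

def encode(tokens, vocab, max_len, pad_id=0, unk_id=1):
    """Map tokens to indices, truncate/pad to max_len, then one-hot encode."""
    ids = [vocab.get(t, unk_id) for t in tokens][:max_len]
    ids += [pad_id] * (max_len - len(ids))               # pad up to the length cap
    return np.eye(len(vocab), dtype=np.float32)[ids]     # (max_len, |vocab|)
```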
I use self-supervised learning with an MLE objective for pretraining. Self-supervised learning can make use of monolingual sentences, which are usually much cheaper to collect than bilingual sentence pairs, so more training data is affordable.
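The exact self-supervised task is not spelled out here, so the following is only one plausible instantiation: denoising MLE pretraining on monolingual text, reusing the modules from the MLE sketch above.

```python
def corrupt(x, drop_p=0.15, pad_id=PAD):
    """Illustrative noise: randomly blank out tokens with PAD."""
    noise = torch.rand_like(x, dtype=torch.float) < drop_p
    return x.masked_fill(noise, pad_id)

def denoise_step(mono):
    """Self-supervised MLE: reconstruct a monolingual sentence from a
    corrupted copy (reuses embed/model/proj/loss_fn defined above)."""
    src = corrupt(mono)
    tgt_in, tgt_out = mono[:, :-1], mono[:, 1:]
    mask = model.generate_square_subsequent_mask(tgt_in.size(1))
    logits = proj(model(embed(src), embed(tgt_in), tgt_mask=mask))
    return loss_fn(logits.reshape(-1, VOCAB), tgt_out.reshape(-1))
```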
I also implement several other algorithms; each notebook is named after the technique it adds, and illustrative sketches of the variants follow the list.
- In the notebook named 'LSTM', I replace the Transformer with a Long Short-Term Memory (LSTM) RNN.
- In the notebook named 'seqGAN', I combine MLE with Sequential Generative Adversarial Networks (SeqGAN, Yu+16).
- In the notebook named 'TayPO', I use Taylor expansions to generalize Proximal Policy Optimization (Tang+20) and plug the result into my seqGAN training.
- In the notebook named 'radam', I replace the Adam optimizer with Rectified Adam (RAdam, Liu+19).
- In the notebook named 'slowMLE', I train with a two-stage learning-rate schedule, 10^-3 then 10^-5.
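A minimal sketch of the LSTM variant, reusing `VOCAB`, `D_MODEL`, and `PAD` from above; the notebook's actual encoder-decoder may add layers or attention.

```python
class LSTMSeq2Seq(nn.Module):
    """LSTM encoder-decoder standing in for the Transformer (sketch)."""
    def __init__(self, vocab=VOCAB, d_model=D_MODEL):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model, padding_idx=PAD)
        self.encoder = nn.LSTM(d_model, d_model, batch_first=True)
        self.decoder = nn.LSTM(d_model, d_model, batch_first=True)
        self.proj = nn.Linear(d_model, vocab)

    def forward(self, src, tgt_in):
        _, state = self.encoder(self.embed(src))      # final (h, c) summarizes src
        hidden, _ = self.decoder(self.embed(tgt_in), state)
        return self.proj(hidden)                      # (batch, len, vocab) logits
```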
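For the seqGAN variant, a simplified generator update: the discriminator's score of a sampled translation serves as the reward, and the generator follows the REINFORCE gradient. The full SeqGAN of Yu+16 uses Monte Carlo rollouts for per-token rewards; this sketch uses a single sequence-level reward, and `disc` is a hypothetical discriminator returning P(real) per sequence.

```python
def seqgan_generator_step(sampled_tgt, log_probs, disc):
    """Policy-gradient (REINFORCE) update for the generator:
    maximize E[reward * sum_t log pi(y_t | y_<t, x)].
    log_probs: (batch, len) log-probs of the sampled tokens."""
    with torch.no_grad():
        reward = disc(sampled_tgt)                    # (batch,), no grad through D
    return -(reward.unsqueeze(1) * log_probs).sum(1).mean()
```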
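For the TayPO variant, Tang+20 derive a family of surrogate objectives from Taylor expansions of the RL objective, with PPO-style clipping as a familiar low-order member. Only that standard clipped surrogate is sketched here; the notebook's higher-order variant is not reproduced.

```python
def ppo_clip_loss(logp_new, logp_old, advantage, eps=0.2):
    """PPO clipped surrogate (the low-order case that TayPO generalizes)."""
    ratio = torch.exp(logp_new - logp_old)            # importance weight
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * advantage
    return -torch.min(ratio * advantage, clipped).mean()
```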
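Swapping Adam for RAdam is a one-line change if `torch.optim.RAdam` is available (built into PyTorch since 1.10); otherwise the reference implementation from Liu+19 can be installed separately.

```python
from torch.optim import RAdam          # PyTorch >= 1.10

optimizer = RAdam(model.parameters(), lr=1e-3)
```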
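The two-stage schedule of the slowMLE variant can be expressed by resetting the optimizer's learning rate between stages; when exactly to switch is the notebook's choice and is only guessed at here.

```python
from torch.optim import Adam

optimizer = Adam(model.parameters(), lr=1e-3)   # stage 1: lr = 10^-3

def set_lr(opt, lr):
    for group in opt.param_groups:
        group["lr"] = lr

# ...train stage 1 until the loss plateaus (a guess at the criterion)...
set_lr(optimizer, 1e-5)                         # stage 2: lr = 10^-5
```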