This project implements the Transformer architecture from "Attention Is All You Need" (Vaswani et al., 2017). The Transformer has significantly impacted sequence-to-sequence learning by replacing traditional recurrent and convolutional structures with an attention-based mechanism, and it has proven remarkably effective across tasks such as machine translation, language modeling, and text generation.
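At the core of the architecture is scaled dot-product attention: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V. The snippet below is a minimal, self-contained PyTorch sketch of that computation for illustration; it is not code from this repository, and the function name and tensor shapes are assumptions.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k) -- illustrative shapes.
    d_k = q.size(-1)
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Block masked positions (e.g. padding or future tokens).
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = scores.softmax(dim=-1)
    return weights @ v
```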
To install the necessary dependencies, run:
pip install -r requirements.txt
To train the model, execute the following command:
python train.py
During training, the script prints sample predictions from the model after each epoch:
Processing epoch 27: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 158/158 [00:51<00:00, 3.10it/s, loss=5.873]
--------------------------------------------------------------------------------
SOURCE: I'm sure I don't want to stay in here any longer!'
TARGET: Eu só sei que não quero mais ficar aqui!'
PREDICTED: Eu só sei que não quero mais ficar aqui!'
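These per-epoch samples come from running inference on a validation sentence. A common way to produce them is greedy decoding, sketched below under an assumed model interface; the names `model.encode`, `model.decode`, and `model.project` are hypothetical and may not match this repository's API.

```python
import torch

def greedy_decode(model, src, src_mask, sos_id, eos_id, max_len, device):
    # Hypothetical interface: encode the source once, then decode token by token.
    memory = model.encode(src, src_mask)
    ys = torch.tensor([[sos_id]], dtype=torch.long, device=device)
    for _ in range(max_len - 1):
        out = model.decode(memory, src_mask, ys, tgt_mask=None)
        # Project the last decoder state to vocabulary logits and take the argmax.
        next_id = model.project(out[:, -1]).argmax(dim=-1, keepdim=True)
        ys = torch.cat([ys, next_id], dim=1)
        if next_id.item() == eos_id:
            break
    return ys.squeeze(0)
```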
This implementation uses the OPUS Books dataset, a collection of copyright-free books aligned by Andras Farkas. The default translation direction is English to Italian, but you can change the source and target languages in config.py to any pair available in the dataset.
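The exact contents of config.py are specific to this repository; as a rough sketch, assuming the configuration is a plain dictionary with hypothetical `lang_src` / `lang_tgt` keys, switching the language pair might look like this:

```python
# config.py -- a minimal sketch; the actual file in this repo may differ.
def get_config():
    return {
        "lang_src": "en",   # source language code in OPUS Books
        "lang_tgt": "it",   # target language code; change to e.g. "pt"
        "batch_size": 8,
        "num_epochs": 30,
        "lr": 1e-4,
        "seq_len": 350,
        "d_model": 512,     # model dimension from the original paper
    }
```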
If you use this code in your research or find it helpful, please consider citing the original paper:
@article{vaswani2017attention,
  title   = {Attention is all you need},
  author  = {Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, Lukasz and Polosukhin, Illia},
  journal = {Advances in Neural Information Processing Systems},
  volume  = {30},
  year    = {2017}
}