Experiments with some transformer architectures:

- Vanilla Transformer Encoder Classifier
- GPT-2 Generator
- Mixtral - Mixture of Experts Generator