Building & training GPT from scratch, based on Andrej Karpathy's tutorial "Let's build GPT: from scratch, in code, spelled out".
Dataset: tiny-shakespeare (the original with slight modifications).
- basic_bigramLM.py : built a basic bigram language model with a generate method to get things rolling.
- tutorial.ipynb : worked through the basic attention mechanism using tril, masked_fill, and softmax, plus notes on attention (a sketch of the pattern follows this section).
- LMwithAttention.py : extended the model with a single attention head, token embeddings, and positional embeddings.
- AttentionBlock.py : built a single attention head.
- LM_multihead_attention_ffwd.ipynb : extended the model to multiple attention heads concatenated together, with a separate feed-forward layer before the lm_head.
- tutorialGPT.ipynb : created the transformer block: layering, residual connections, better loss evaluation, dropout, and layernorm.
Used a character-level tokenizer. Trained two versions with different configurations to better understand the impact of hyperparameters such as n_embed and num_heads.
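For reference, a minimal sketch of the causal self-attention head (tril mask, masked_fill, softmax) and the transformer block with residual connections and pre-layernorm, as covered in the notebooks above. The hyperparameter values and the class names Head and Block are illustrative and may not match the files here exactly.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# illustrative hyperparameters, not necessarily the values used in the notebooks
n_embed, head_size, block_size, dropout = 384, 64, 256, 0.1

class Head(nn.Module):
    """One head of causal self-attention: tril mask -> masked_fill -> softmax."""
    def __init__(self):
        super().__init__()
        self.key = nn.Linear(n_embed, head_size, bias=False)
        self.query = nn.Linear(n_embed, head_size, bias=False)
        self.value = nn.Linear(n_embed, head_size, bias=False)
        # lower-triangular mask so each position only attends to earlier positions
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        B, T, C = x.shape
        k, q, v = self.key(x), self.query(x), self.value(x)
        wei = q @ k.transpose(-2, -1) * head_size ** -0.5        # (B, T, T) scaled scores
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)                             # attention weights
        wei = self.dropout(wei)
        return wei @ v                                           # (B, T, head_size)

class Block(nn.Module):
    """Transformer block: multi-head attention + feed-forward,
    each wrapped in a pre-layernorm and a residual connection."""
    def __init__(self, n_heads=6):
        super().__init__()
        self.heads = nn.ModuleList([Head() for _ in range(n_heads)])
        self.proj = nn.Linear(n_heads * head_size, n_embed)
        self.ffwd = nn.Sequential(
            nn.Linear(n_embed, 4 * n_embed),
            nn.ReLU(),
            nn.Linear(4 * n_embed, n_embed),
            nn.Dropout(dropout),
        )
        self.ln1 = nn.LayerNorm(n_embed)
        self.ln2 = nn.LayerNorm(n_embed)

    def forward(self, x):
        sa = torch.cat([h(self.ln1(x)) for h in self.heads], dim=-1)
        x = x + self.proj(sa)           # residual around attention
        x = x + self.ffwd(self.ln2(x))  # residual around feed-forward
        return x
```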
Used a byte-pair encoding (BPE) tokenizer for the full model below:
- gpt.py : the full GPT model.
- dataset.py : torch Dataset.
- build_tokenizer.py : BPE tokenizer built from scratch with huggingface tokenizers, similar to GPT-2, saved at tokenizer (a sketch follows this list).
- train.py : training script containing the optimizer, config, loss function, train loop, validation loop, and model saving.
- generate.py : generates text by loading the model on CPU (see the sketch after the results table below).
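A rough sketch of how a GPT-2-style byte-level BPE tokenizer can be trained and saved with the huggingface tokenizers library. The corpus file name, vocab size, and output directory are assumptions for illustration and may not match build_tokenizer.py.

```python
import os
from tokenizers import ByteLevelBPETokenizer

# assumed paths and settings; build_tokenizer.py may differ
corpus_file = "input.txt"        # tiny-shakespeare text
out_dir = "tokenizer"            # directory the vocab/merges are saved to
os.makedirs(out_dir, exist_ok=True)

# byte-level BPE (as in GPT-2), so no character ever falls out of vocabulary
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=[corpus_file],
    vocab_size=8000,             # assumed; GPT-2 itself uses ~50k
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)
tokenizer.save_model(out_dir)    # writes vocab.json and merges.txt

# quick round-trip check
ids = tokenizer.encode("To be, or not to be").ids
print(ids, tokenizer.decode(ids))
```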
Results from two training runs of the full model:

| Version | n_embed | n_heads | head_size | n_layers | lr   | attn_dropout | block_dropout | Train Loss | Valid Loss |
|---------|---------|---------|-----------|----------|------|--------------|---------------|------------|------------|
| V1      | 384     | 12      | 32        | 4        | 6e-4 | 0.1          | 0.1           | 4.0204     | 6.2131     |
| V2      | 384     | 6       | 64        | 3        | 5e-4 | 0.2          | 0.2           | 3.9331     | 5.9705     |
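For completeness, a hedged sketch of loading the trained model onto CPU and sampling from it. The checkpoint path, the GPT class and its constructor, the assumption that the model returns (logits, loss), and the sampling parameters are all illustrative and may differ from generate.py.

```python
import torch
from tokenizers import ByteLevelBPETokenizer
from gpt import GPT                      # assumed class name from gpt.py

device = torch.device("cpu")
model = GPT()                            # assumes config defaults are baked into the class
state = torch.load("model.pt", map_location=device)   # assumed checkpoint path
model.load_state_dict(state)
model.eval()

tok = ByteLevelBPETokenizer("tokenizer/vocab.json", "tokenizer/merges.txt")
idx = torch.tensor([tok.encode("ROMEO:").ids], dtype=torch.long, device=device)

block_size = 256                         # assumed context length
with torch.no_grad():
    for _ in range(200):                 # generate 200 new tokens
        logits, _ = model(idx[:, -block_size:])   # assumes model returns (logits, loss)
        probs = torch.softmax(logits[:, -1, :], dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)

print(tok.decode(idx[0].tolist()))
```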
As always, an incredible tutorial by Andrej!