LLMs

LLMs: what are they, and how do they work?

This repo contains my deep dive into LLMs. The notebooks are based on some of the big trends in the 2022-2023 AI/LLM boom.

I try to cover:

Basics: A background on LLMs
Math: Some of the basic background and useful functions in AI
The transformer: The base architecture of modern LLMs
Transformer Models: Introduction to some popular transformer models and a deep dive into GPT
Finetuning: Improve your models on your data (hopefully)
Inference: You've trained your model, now what?
Retrieval Augmented Generation (RAG): Context engineering
Agents: Models on a loop

If you want to run the notebooks, create a models directory and download the models you want to use. I will be using a variety of models (usually the lower-parameter ones).
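
The setup step above could be sketched roughly as follows. This is a minimal sketch and not the repo's own tooling: it assumes the models directory sits at the repo root, and it assumes models are pulled from the Hugging Face Hub via the huggingface_hub package, which the README does not specify. The helper names (ensure_models_dir, download_model, list_local_models) and the example model ID are hypothetical.

```python
from pathlib import Path


def ensure_models_dir(root: Path = Path(".")) -> Path:
    """Create <root>/models if it does not exist yet and return its path."""
    models_dir = root / "models"
    models_dir.mkdir(parents=True, exist_ok=True)
    return models_dir


def download_model(repo_id: str, models_dir: Path) -> Path:
    """Download one model snapshot into its own subdirectory of models_dir.

    Assumption: the model is hosted on the Hugging Face Hub. The import is
    done lazily so the other helpers work without huggingface_hub installed.
    """
    from huggingface_hub import snapshot_download

    # Flatten "org/name" into a single folder name (a hypothetical convention).
    target = models_dir / repo_id.replace("/", "__")
    snapshot_download(repo_id=repo_id, local_dir=target)
    return target


def list_local_models(models_dir: Path) -> list[str]:
    """Return the names of the model folders already present, sorted."""
    return sorted(p.name for p in models_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    models = ensure_models_dir()
    # Example only: swap in whichever small model a notebook asks for.
    # download_model("gpt2", models)
    print(list_local_models(models))
```

The download call is left commented out so the script can be run first to create the directory and check what is already there.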

TODO: parts on history, neurosymbolic models, and other extras
