LLMs: what are they? How do they work?
This repo contains my dive into LLMs. The notebooks are based on some of the big trends in the 2022-2023 AI/LLM boom.
I try to cover:
- Basics: A background on LLMs
- Math: Some of the basic background and useful functions in AI
- The transformer: The base architecture of modern LLMs
- Transformer Models: Introduction to some popular transformer models and a deep dive into GPT
- Finetuning: Improve your models on your data (hopefully)
- Inference: You've trained your model, now what?
- Retrieval Augmented Generation (RAG): Context engineering
- Agents: Models on a loop
If you want to run the notebooks, create a models directory
and download the models you want into it.
I will be using a variety of models, usually ones with lower parameter counts.
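
As a rough sketch of the setup step above: the snippet below creates a models directory and pulls one model into it. It assumes the models come from the Hugging Face Hub and that the huggingface_hub package is installed; the repo does not say where its models are hosted, so treat that as an assumption. The helper names (model_dir, download) and the example model id are illustrative only and are not part of this repo.

```python
from pathlib import Path

# Assumed location: a "models" folder next to the notebooks.
MODELS_DIR = Path("models")


def model_dir(repo_id: str, root: Path = MODELS_DIR) -> Path:
    """Create and return the local folder for a model (hypothetical helper).

    "org/name" maps to root/name, so each model gets its own subfolder.
    """
    target = root / repo_id.split("/")[-1]
    target.mkdir(parents=True, exist_ok=True)
    return target


def download(repo_id: str, root: Path = MODELS_DIR) -> Path:
    """Download a model snapshot into root (hypothetical helper).

    Assumes the model is hosted on the Hugging Face Hub.
    """
    # Imported lazily so model_dir works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = model_dir(repo_id, root)
    snapshot_download(repo_id=repo_id, local_dir=target)
    return target


if __name__ == "__main__":
    # Example model id only; swap in whichever model a notebook needs.
    print(download("openai-community/gpt2"))
```

Keeping one subfolder per model makes it easier to point a notebook at a specific local path, though any layout works as long as the notebooks agree with it.
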
TODO: parts on history, neurosymbolic models, other extras