Image Courtesy: Some LLM
In the anatomy of any modern deep learning framework, you'll find a few major components:
- A library for neural network layers
- A set of data loading utilities
- A shim to support different accelerated backends
- An Automatic Differentiation Engine to tweak your network params and minimize your losses (sketched in code below)
I found plenty of resources around the first three, but while trying to understand automatic differentiation tools, I struggled to find a curated set of resources in one place. So this repository is meant to be the resource I wish I had when I started.
Disclaimer: This repo might be a little biased towards Python-related content since that is my language of choice. Julia also seems to have a strong community around automatic differentiation tools; I'll add more content around it soon.
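To make that last component concrete before diving into the resources: below is a toy, scalar-only reverse-mode AD engine in pure Python, in the spirit of minimal libraries like micrograd. The names here (`Value`, `grad_fn`) are my own illustrative choices, not any particular library's API.

```python
# A toy reverse-mode AD engine: each Value records how it was computed,
# and backward() walks the graph in reverse, accumulating
# d(output)/d(input) via the chain rule.
class Value:
    def __init__(self, data, parents=()):
        self.data = data        # the scalar this node holds
        self.grad = 0.0         # d(output)/d(self), filled in by backward()
        self.parents = parents  # nodes this one was computed from
        self.grad_fn = None     # propagates self.grad to the parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad  # d(a+b)/db = 1
        out.grad_fn = grad_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out.grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v.parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0  # d(output)/d(output) = 1
        for v in reversed(order):
            if v.grad_fn:
                v.grad_fn()

x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0  (dz/dx = y + 1, dz/dy = x)
```

The frameworks covered below implement essentially this same idea, just vectorized, optimized, and hardened for real workloads.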
- PyCon US: Colin Carroll - Getting started with automatic differentiation - One of the very few talks on this topic at PyCon. Colin Carroll gives a very nice high-level overview for beginners and provides a few examples in TensorFlow, PyTorch, and JAX.
- What is Automatic Differentiation? - This is the video I highly recommend beginning with to understand autodiff. Ari does a brilliant job of breaking down the different types of differentiation and then goes deeper into automatic differentiation. It's a 3Blue1Brown-style animated video, with a lot of good reference links in the description.
- DLSys Course from CMU - An amazing course by CMU that takes you through building an entire deep learning framework from scratch in Python. Lectures 4 and 5 are focussed on automatic differentiation.
- How to Differentiate with a Computer - This article by the American Mathematical Society goes a little deeper into the maths behind autodiff.
- Tutorial on Automatic Differentiation - Prof. Matt Yedlin takes the above AMS article and explains it in detail in this YouTube video.
- PyTorch focussed tutorials - Autograd is PyTorch's underlying automatic differentiation engine. These tutorials/articles are either solely focussed on implementation using PyTorch or are from the official documentation. (A minimal usage sketch follows this list.)
  - PyTorch Autograd Explained - In-depth Tutorial - This video is featured in the official PyTorch docs. Elliot does an amazing job of visually explaining the workings of autograd in detail.
  - The Fundamentals of Autograd - Official docs covering the basic aspects of autograd.
  - Automatic Differentiation with torch.autograd - Another beginner-level official tutorial on autograd.
  - A Gentle Introduction to torch.autograd - Yet another beginner-level official tutorial on autograd.
  - Autograd mechanics - Another official doc that goes deeper into the inner workings of the autograd engine.
  - Simple Grad - Colab notebook from the official PyTorch documentation showcasing a reverse-mode AD implementation.
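As a quick taste of what these tutorials cover, here is a minimal example of the torch.autograd workflow (standard PyTorch API; the toy function is my own):

```python
import torch

# Create leaf tensors that autograd should track.
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

# Ordinary tensor ops build the computation graph as a side effect.
z = x * y + x

# Reverse-mode AD: populate .grad on every leaf with requires_grad=True.
z.backward()

print(x.grad)  # tensor(4.)  dz/dx = y + 1
print(y.grad)  # tensor(2.)  dz/dy = x
```

Note how this mirrors the toy engine sketched at the top of this README: ordinary ops record the graph, and backward() runs the reverse pass.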
- TensorFlow focussed tutorials - TensorFlow uses GradientTape as its automatic differentiation engine. (A short usage sketch follows this list.)
  - Introduction to gradients and automatic differentiation - Basic official tutorial about using GradientTape.
  - Advanced automatic differentiation - This official tutorial covers advanced and less common features of the GradientTape API.
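For comparison with the PyTorch snippet above, here is a minimal GradientTape sketch differentiating the same toy function (standard TensorFlow API):

```python
import tensorflow as tf

x = tf.Variable(2.0)
y = tf.Variable(3.0)

# Ops executed inside the tape's context are recorded for differentiation.
with tf.GradientTape() as tape:
    z = x * y + x

# Reverse-mode AD over the recorded operations.
dz_dx, dz_dy = tape.gradient(z, [x, y])
print(dz_dx.numpy())  # 4.0  dz/dx = y + 1
print(dz_dy.numpy())  # 2.0  dz/dy = x
```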
- JAX - JAX is a comparatively new Python library by Google for accelerator-oriented array computation and program transformation, designed for high-performance numerical computing and large-scale machine learning. It also provides automatic differentiation capabilities out of the box. (A short example follows this list.)
  - Automatic differentiation - Beginner official JAX tutorial on automatic differentiation.
  - Advanced automatic differentiation - Advanced tutorial covering higher-order derivatives, Hessians, Jacobians, etc.
  - The Autodiff Cookbook - As the title suggests, this provides small code samples for different applications.
- Automatic Differentiation in Machine Learning: a Survey - A survey paper published in JMLR 2018, one of the top peer-reviewed journals in machine learning. It covers what AD is and what it is not, the different AD modes, and how AD made its way into the field of machine learning.
- Tangent: Automatic Differentiation Using Source Code Transformation in Python - Published in 2017, Tangent is a library by Google that performs automatic differentiation using source-code transformation. This work compares Tangent with TensorFlow and HIPS Autograd, which use tracing for AD.
- Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming - Another paper on Tangent by Google Brain, published at NeurIPS 2018, a highly rated conference in machine learning and AI.
- Automatic Functional Differentiation in JAX - Published in ICLR 2024 by SEA AI Lab, this work extends JAX with the capability to automatically differentiate higher-order functions.
- Automatic differentiation in PyTorch - Published in NIPS 2017 by Facebook AI Research, the University of Oxford, and the University of Warsaw, this is a short paper covering the design and implementation of PyTorch's autograd module.
- A Benchmark of Selected Algorithmic Differentiation Tools on Some Problems in Computer Vision and Machine Learning - This paper covers the benchmarking of various AD tools written in C++, Python, Julia, and MATLAB.
- Publications on autodiff.org - A set of conference papers and journal articles.
I love slow media, the slower the better. Even though tech changes rapidly every day, tech books have always had a special place in my life. This section describes some of the books I've found around AD.
- Architecture of Advanced Numerical Analysis Systems - "Designing a Scientific Computing System using OCaml" is in the title of the book. Although not entirely focussed on automatic differentiation, the author of Owl (an OCaml-based numerical computing library) takes readers through building such a system, including writing an algorithmic differentiation engine, performance accelerators, compiler backends for it, and so on.
- Books listing on autodiff.org - This website hosts a list of published papers and books around automatic differentiation.
- EnzymeCon - EnzymeAD is an automatic differentiation tool that can take code as LLVM IR and differentiate it. EnzymeCon is an annual conference around it.