The "Large Scale Deep Learning Models" course focuses on the methodologies and techniques used to train large models on extensive datasets across various data domains, including images, text, and audio. The course provides in-depth coverage of self-supervised learning approaches, which have become crucial for leveraging vast amounts of unlabeled data. Topics include data preprocessing and augmentation for different modalities, architectural considerations for scaling deep learning models, and strategies for distributed and parallel training.
Instructors: Alexander Shabalin, Ildus Sadrtdinov, Dmitry Kropotov
Classes: Mondays, offline, in classroom EH-4, time slots 14:15 - 15:30 and 15:45 - 17:00
Telegram chat for questions and discussion: link
Practical assignments: all assignments are given and checked in the corresponding Teams space. If you don't have access to the Teams space, please write directly to one of the instructors or in the course chat.
Assessment Component 1: written examination, Duration: 60 min, Weight: 50 %
Assessment Component 2: programming assignments, Weight: 50 %
Completion: To pass this module, each assessment component must be passed with at least 45 %.
Date | Number | Topic | Materials |
---|---|---|---|
09.09.24 | 01 | Introduction to the course. Large models, large datasets and self-supervised learning. What to do with a pretrained model? Linear probing, Fine-tuning, in-distribution (ID) and out-of-distribution (OOD) performance. CLIP model, Zero-shot and WiSE-FT (robust weights ensemble). | Fine-tuning distorts features, Comparing pre-training algorithms, CLIP, WiSE-FT, Do ImageNet Classifiers Generalize to ImageNet? |
23.09.24 | 02 | Classical pretext tasks for images: inpainting, colorization, jigsaw puzzles | Exemplar, Context Prediction, Inpainting, Jigsaw Puzzles, Colorization, Rotations, Damaged Jigsaw Puzzles, Task Ensemble |
30.09.24 | 03 | Modern architectures for images: ViT, DeiT, MLP Mixer, Swin, ConvNeXt, Neighborhood Attention Transformer (NAT). Efficient training & inference: Automatic Mixed Precision (AMP), Data-Parallel and Model-Parallel training | Big Transfer, ViT, DeiT, MLP Mixer, Swin, ConvNeXt, NAT |
07.10.24 | 04 | Contrastive learning for images. Mutual information, SimCLR, MoCo, BYOL, SimSiam, DeepCluster, SwAV. Deriving contrastive loss | SimCLR, MoCo, BYOL, SimSiam, DeepCluster, SwAV |
14.10.24 | 05 | Self-supervised learning for ViT. Masked image modeling. DINO, BEiT, MAE, MaskFeat. Different approaches for improving contrastive learning. | DINO, BEiT, MAE, MaskFeat, Dense CL, Supervised CL, DiLo, LooC |
21.10.24 | 06 | Mode connectivity and Linear mode connectivity (LMC). Ensembling: Deep Ensemble (DE), SSE, FGE, cSGLD, KFAC-Laplace, SWAG, SPRO, StarSSE. Model averaging: SWA, model soups. Weight averaging in optimization: Lookahead, Lookaround, WATT | LMC, LMC in transfer learning, DE, DE and loss landscape, DE and distribution shifts, SSE, FGE, cSGLD, KFAC-Laplace, SWAG, SPRO, DE Equivalent, StarSSE, SWA, model soups, Lookahead, Lookaround, WATT |
28.10.24 | 07 | Pruning, Quantization, Distillation | |
04.11.24 | 08 | Modern architectures for texts. Recap of transformers. Modern architectures. Transformer training tricks. What do LLMs learn? | |
11.11.24 | 09 | Parameter-Efficient Fine-tuning. GPT zero-shot, Prompt Tuning, Adapters, LoRA, QLoRA | |
18.11.24 | 10 | Retrieval Augmented Generation | |
25.11.24 | 11 | Reinforcement Learning with Human Feedback | |
02.12.24 | 12 | Self-supervised learning for audio. Introduction to audio processing. CPC, Wav2Vec 2.0, HUBERT, Multi-format contrastive learning, BYOL-A. | |
07.12.24 | 13 | Self-supervised learning for graphs. Intro to representation learning on graphs. Review of approaches. | |
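Several of the topics above (WiSE-FT in lecture 01, SWA and model soups in lecture 06) boil down to elementwise averaging of checkpoints that share one architecture. A minimal sketch of the idea, using plain NumPy arrays as stand-ins for model state dicts (the function and variable names here are illustrative, not from any of the cited papers' code):

```python
import numpy as np

def wise_ft(zero_shot, fine_tuned, alpha=0.5):
    """WiSE-FT-style interpolation of two checkpoints with identical keys.

    alpha=0 recovers the zero-shot weights, alpha=1 the fine-tuned ones;
    intermediate values trade ID accuracy against OOD robustness.
    """
    return {k: (1 - alpha) * zero_shot[k] + alpha * fine_tuned[k]
            for k in zero_shot}

def uniform_soup(checkpoints):
    """Uniform model soup: elementwise mean over a list of checkpoints."""
    return {k: np.mean([c[k] for c in checkpoints], axis=0)
            for k in checkpoints[0]}

# Toy usage with two single-tensor "checkpoints"
theta_zs = {"w": np.array([0.0, 2.0])}
theta_ft = {"w": np.array([2.0, 4.0])}
merged = wise_ft(theta_zs, theta_ft, alpha=0.5)   # {"w": [1.0, 3.0]}
soup = uniform_soup([theta_zs, theta_ft])         # same result for 2 models
```

In practice the same loop runs over a PyTorch `state_dict()`, and the soup variant can be made greedy by adding checkpoints only when validation accuracy improves.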
Number | Release date | Deadline | Topic |
---|---|---|---|
01 | 15.09.24 | 01.10.24 23:59 | Robust fine-tuning of CLIP |
02 | 01.10.24 | 18.10.24 23:59 | Classical pre-text tasks |
03 | 18.10.24 | 06.11.24 23:59 | Contrastive learning |