Intro to ML Safety course notes

This repository contains notes for the Intro to ML Safety course.

The notes are not yet complete, and we are looking for volunteers to help finish them. Ideally, the notes will present the material from lectures and readings in a different way, giving students multiple angles on the same material. They should go beyond a summary of the lectures and, ideally, include citations to papers.

If you would like to contribute to the course notes, feel free to make a pull request! We will credit you here and in the course notes.

Preliminary notes already exist for some of the topics, but they aren't complete.

Lecture	Status	Contributor(s)
Introduction	Not started
Deep Learning Review	Ready for Review	Nathaniel Li
Risk Decomposition	Ready for Review	Cody Rushing
Accident Models	Not started
Black Swans	Not started
Adversarial Robustness	Needs revision	Oliver Zhang
Black Swan Robustness	Needs revision	Oliver Zhang
Anomaly Detection	Needs revision	Oliver Zhang
Interpretable Uncertainty	Needs revision	Oliver Zhang
Transparency	Ready for Review	Cody Rushing
Trojans	Ready for Review	Ethan Gutierrez
Detecting Emergent Behaviour	Ready for Review	Bilal Chughtai
Honest Models	Not started
Intrasystem Goals or Power Aversion	Not started
Machine Ethics	Not started
ML for Improved Decision-Making	Ready for Review	Nathaniel Li
ML for Cyberdefense	Not started
Cooperative AI	Ready for Review	Bilal Chughtai
X-Risk	Not started
Possible Existential Hazards	Not started
Safety-Capabilities Balance	Not started
Review and Conclusion	Not started

centerforaisafety/Intro_to_ML_Safety

Folders and files

Latest commit

History

Repository files navigation

Intro to ML Safety course notes

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 11

Packages