Reading List

A repository of reading lists covering AI safety, safe reinforcement learning (Safe RL), and the topics underlying them, with selected paper summaries and links.

Risk, Transparency, Explainability

A Comprehensive Survey on Safe Reinforcement Learning [Paper] [Summary]
- Javier Garcia, Fernando Fernandez, JMLR, 2015
Should Robots be Obedient? [Paper]
- Smitha Milli, Dylan Hadfield-Menell, Anca Dragan, Stuart Russell
Enabling Robots to Communicate their Objectives [Paper]
- Sandy H. Huang, David Held, Pieter Abbeel, Anca D. Dragan
Safe Model-based Reinforcement Learning with Stability Guarantees [Paper]
- Felix Berkenkamp, Matteo Turchetta, Angela Schoellig, Andreas Krause, NIPS 2017
On ensuring that machines are well behaved [Paper] [Summary]
- Philip S. Thomas, Bruno Castro da Silva, Andrew G. Barto, Emma Brunskill
Safe Exploration in Markov Decision Processes [Paper]
- Teodor Mihai Moldovan, Pieter Abbeel, ICML, 2012
The Off-Switch Game [Paper]
- Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, Stuart Russell
Concrete Problems in AI Safety [Paper]
- Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané
Constrained Policy Optimization [Paper]
- Joshua Achiam, David Held, Aviv Tamar, Pieter Abbeel
Probabilistically Safe Policy Transfer [Paper]
- David Held, Zoe McCarthy, Michael Zhang, Fred Shentu, Pieter Abbeel
Robust Covariate Shift Regression [Paper]
- Xiangli Chen, Mathew Monfort, Anqi Liu, Brian D. Ziebart
