A repository of reading lists giving an overview of AI Safety, Safe RL, and the topics underlying them, along with selected paper summaries and corresponding links.
- A Comprehensive Survey on Safe Reinforcement Learning [Paper] [Summary]
- Javier Garcia, Fernando Fernandez, JMLR, 2015
- Should Robots be Obedient? [Paper]
- Smitha Milli, Dylan Hadfield-Menell, Anca Dragan, Stuart Russell
- Enabling Robots to Communicate their Objectives [Paper]
- Sandy H. Huang, David Held, Pieter Abbeel, Anca D. Dragan
- Safe Model-based Reinforcement Learning with Stability Guarantees [Paper]
- Felix Berkenkamp, Matteo Turchetta, Angela Schoellig, Andreas Krause, NIPS 2017
- On ensuring that machines are well behaved [Paper] [Summary]
- Philip S. Thomas, Bruno Castro da Silva, Andrew G. Barto, and Emma Brunskill
- Safe Exploration in Markov Decision Processes [Paper]
- Teodor Mihai Moldovan, Pieter Abbeel, ICML, 2012
- The Off-Switch Game [Paper]
- Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, Stuart Russell
- Concrete Problems in AI Safety [Paper]
- Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dan Mané
- Constrained Policy Optimization [Paper]
- Joshua Achiam, David Held, Aviv Tamar, Pieter Abbeel
- Probabilistically Safe Policy Transfer [Paper]
- David Held, Zoe McCarthy, Michael Zhang, Fred Shentu, Pieter Abbeel
- Robust Covariate Shift Regression [Paper]
- Xiangli Chen, Mathew Monfort, Anqi Liu, Brian D. Ziebart