Code for Sam Hodges's 2021 REU Internship Project. The IMPALA code is edited from https://github.com/deepmind/dm-haiku/tree/main/examples/impala and was used to test various off-policy actor-critic algorithms within the IMPALA structure.
The MDP code is the start of the next step in the off-policy actor-critic testing. Because it is difficult to control when IMPALA is off-policy, the hope was that an MDP would allow greater control, and therefore greater clarity. This code is still a work in progress.
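To illustrate what "greater control" could mean, here is a minimal sketch, not taken from this repository's MDP code. It assumes a single state with two actions and a behaviour policy formed by mixing the target policy with a uniform policy. The names `mix_with_uniform`, `importance_ratios`, and `epsilon` are hypothetical. Setting `epsilon` to 0 gives the on-policy case, and larger values make the data more off-policy.

```python
def mix_with_uniform(target_probs, epsilon):
    """Hypothetical behaviour policy: (1 - epsilon) * target + epsilon * uniform."""
    n = len(target_probs)
    return [(1.0 - epsilon) * p + epsilon / n for p in target_probs]


def importance_ratios(target_probs, behaviour_probs):
    """Per-action importance ratios pi(a) / mu(a)."""
    return [p / b for p, b in zip(target_probs, behaviour_probs)]


# Assumed target policy over two actions in a single state.
target = [0.9, 0.1]

# epsilon = 0: behaviour equals target, so every ratio is 1 (on-policy).
on_policy_ratios = importance_ratios(target, mix_with_uniform(target, 0.0))

# epsilon = 0.5: behaviour drifts toward uniform, so ratios move away from 1.
off_policy_ratios = importance_ratios(target, mix_with_uniform(target, 0.5))

print(on_policy_ratios)
print(off_policy_ratios)
```

In a setup like this the degree of off-policyness is a single knob, whereas in IMPALA it depends on how far the actors lag behind the learner.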