HLRL (High Level Reinforcement Learning) is a library that implements many state-of-the-art algorithms and makes implementing your own a breeze. It supports any generic backend library.
Installation is done by cloning the git repository and installing with `setup.py`:

```
git clone https://github.com/Chainso/HLRL
cd HLRL
pip install .
```
`hlrl.core` contains common modules that are agnostic to any particular framework, while `hlrl.torch` contains modules implemented with the PyTorch backend.
- `hlrl.*.agents` contains agents that interact with the environment and train models.
- `hlrl.*.algos` contains the logic for inference and training of reinforcement learning algorithms.
- `hlrl.*.experience_replay` contains the storage components for off-policy algorithms.
- `hlrl.*.distributed` holds the architecture for distributed training of algorithms.
- `hlrl.core.envs` contains the base environment and wrappers for common environment types.
- `hlrl.core.logger` contains loggers for algorithms and agents.
The `hlrl.torch.policies` package contains multi-layer generalizations of single layers and common networks such as Gaussians. It lets you quickly spin up a model without needing to subclass `nn.Module` yourself.
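As a rough sketch of the idea (the `mlp` helper below is hypothetical, not the package's actual API), a multi-layer generalization of `nn.Linear` lets you assemble a network in a single call:

```python
import torch.nn as nn

def mlp(sizes, activation=nn.ReLU):
    """Hypothetical helper: stacks Linear layers with activations in between,
    analogous to a multi-layer generalization of a single layer."""
    layers = []
    for in_size, out_size in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(in_size, out_size), activation()]
    return nn.Sequential(*layers[:-1])  # drop the trailing activation on the output layer

# Build a 2-hidden-layer network without subclassing nn.Module.
policy = mlp([8, 256, 256, 2])
```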
The base wrapper is implemented in `hlrl.core.common.wrappers`. Wrappers are used to add functionality to existing classes or to change existing functionality. Functionally, wrapping a class creates prototypal inheritance, allowing wrappers to work on any class. This creates a very flexible container that lets you swap out and modify algorithms and agents simply by wrapping them with your desired class.
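A minimal sketch of this delegation pattern, assuming a `__getattr__`-based forwarding wrapper (the class names below are illustrative, not the exact classes in `hlrl.core.common.wrappers`):

```python
class MethodWrapper:
    """Illustrative wrapper: defers any attribute it does not define
    to the wrapped object, so it can wrap any class."""

    def __init__(self, obj):
        self.obj = obj

    def __getattr__(self, name):
        # Called only when the attribute is not found on the wrapper itself,
        # giving prototype-style lookup through to the wrapped object.
        return getattr(self.obj, name)


class ClippedReward(MethodWrapper):
    """Example wrapper that overrides one method and defers the rest."""

    def reward(self):
        return max(min(self.obj.reward(), 1.0), -1.0)


class _Env:
    """Stand-in for some wrapped object."""
    def reward(self):
        return 5.0
    def reset(self):
        return "initial state"


env = ClippedReward(_Env())
print(env.reward())  # 1.0 -> overridden by the wrapper
print(env.reset())   # "initial state" -> delegated to the wrapped object
```

Because unknown attributes are deferred to the wrapped object, wrapping an already-wrapped object composes naturally, which is how multiple wrappers can be stacked on a single algorithm or agent.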
Experiences are passed between modules as dictionaries. This allows you to add additional values to experiences without affecting existing functionality. Combined with wrappers, you can build more functionality on top of base algorithms.
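For example, a wrapper might attach extra values to an experience without disturbing modules that only read the original keys (the key names below are assumptions, not the library's exact schema):

```python
# Illustrative experience dictionary; the keys are assumptions for the example.
experience = {
    "state": [0.0, 1.0],
    "action": 1,
    "reward": 0.5,
    "next_state": [0.1, 0.9],
}

# A wrapper can attach extra values (e.g. an intrinsic reward or a hidden
# state); modules that only read the original keys keep working unchanged.
experience["intrinsic_reward"] = 0.02
experience["hidden_state"] = None

print(experience["reward"], experience["intrinsic_reward"])
```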
Examples are in the `examples` directory. They take command line arguments to configure the algorithm and log results using TensorBoard.
Flexible algorithms can be used with any base algorithm that supports them. Wrappers can be used with any algorithm and in combination with any number of other wrappers.
| Algorithm | Flexible | Wrapper | Recurrent | Description |
|---|---|---|---|---|
| SAC | ❌ | N/A | ✔️ | SAC with automatic temperature tuning and optional twin Q-networks; recurrent with R2D2 |
| DQN | ❌ | N/A | ✔️ | DQN with Rainbow features excluding noisy networks, dueling architecture, and C51; recurrent with R2D2 |
| IQN | ❌ | N/A | ✔️ | IQN with Rainbow features excluding noisy networks; recurrent with R2D2 |
| RND | ✔️ | ✔️ | N/A | RND excluding state normalization |
| MunchausenRL | ✔️ | ✔️ | N/A | MunchausenRL as described in the literature |
| Ape-X | ✔️ | ❌ | N/A | Ape-X for multi-core machines with a single model shared across agents |
| R2D2 | ✔️ | ❌ | N/A | R2D2 with hidden state storing and burn-in |