HLRL

HLRL (High Level Reinforcement Learning) is a library that implements many state-of-the-art algorithms and makes it easy to implement your own. It is designed to support any backend library.



Installation

To install, clone the Git repository and install the package with pip, which uses the repository's setup.py.

git clone https://github.com/Chainso/HLRL
cd HLRL
pip install .

Code Structure

hlrl.core contains common modules that are agnostic to any particular framework. hlrl.torch is for modules that are implemented using the PyTorch backend.

Agents

hlrl.*.agents packages contain agents that interact with the environment and train models.

Algorithms

hlrl.*.algos contains the logic for the inference and training of reinforcement learning algorithms.

Experience Replay

hlrl.*.experience_replay contains the storage components for off-policy algorithms.

Distributed

hlrl.*.distributed holds architecture for distributed training of algorithms.

Environments

hlrl.core.envs contains the base environment and wrappers for common environment types.

Loggers

hlrl.core.logger contains loggers for algorithms and agents.

Policies

The hlrl.torch.policies package contains multi-layer generalizations of single layers and common networks such as Gaussians. These are used to quickly spin up a model without needing to subclass nn.Module yourself.


Concepts

Wrappers

The base wrapper is implemented in hlrl.core.common.wrappers. Wrappers add functionality to existing classes or change their existing behavior. Functionally, wrapping a class creates prototypal inheritance, which lets a wrapper work on any class. The result is a very flexible container: you can swap out and modify algorithms and agents simply by wrapping them with your desired class.
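The delegation idea behind wrappers can be illustrated with a minimal sketch. This is not the code in hlrl.core.common.wrappers: the `Wrapper`, `CountingAgent`, `RewardScale`, and `ActionLog` classes below are hypothetical names invented for illustration, and the sketch assumes that prototypal inheritance is achieved by forwarding unknown attribute lookups to the wrapped object.

```python
class Wrapper:
    """Hypothetical base wrapper (illustration only, not HLRL's class)."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup on the wrapper fails,
        # so anything the wrapper does not override falls through to the
        # wrapped object, much like a prototype chain.
        return getattr(self.wrapped, name)


class CountingAgent:
    """Toy stand-in for an agent."""

    def __init__(self):
        self.steps = 0

    def step(self, action):
        self.steps += 1
        return {"action": action, "reward": 1.0}

    def describe(self):
        return "counting agent"


class RewardScale(Wrapper):
    """Changes existing behavior: scales the reward returned by step()."""

    def __init__(self, wrapped, scale):
        super().__init__(wrapped)
        self.scale = scale

    def step(self, action):
        experience = self.wrapped.step(action)
        experience["reward"] *= self.scale
        return experience


class ActionLog(Wrapper):
    """Adds new behavior: records every action passed to step()."""

    def __init__(self, wrapped):
        super().__init__(wrapped)
        self.actions = []

    def step(self, action):
        self.actions.append(action)
        return self.wrapped.step(action)


# Wrappers stack in any order and do not need to know the wrapped type.
agent = ActionLog(RewardScale(CountingAgent(), scale=0.5))
result = agent.step(3)
```

Here `agent.steps` and `agent.describe()` are not defined on either wrapper, yet they resolve through the chain to the `CountingAgent` instance. One design consequence of this style is that a method on the inner object that calls `self.step` sees the inner object's version, not the wrapper's override.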

Experiences

Experiences are passed between modules as dictionaries. This allows you to add values to experiences without affecting existing functionality. Combined with wrappers, this lets you build more functionality on top of base algorithms.
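A short sketch may help show why dictionaries make experiences extensible. The key names (`state`, `action`, `reward`, `next_state`, `terminal`, `intrinsic_reward`) and the helper functions are assumptions made for this example, not HLRL's actual experience schema or API.

```python
def make_experience(state, action, reward, next_state, terminal):
    # Hypothetical base experience; the key names are illustrative only.
    return {
        "state": state,
        "action": action,
        "reward": reward,
        "next_state": next_state,
        "terminal": terminal,
    }


def with_intrinsic_reward(experience, bonus):
    # A later component adds a key without modifying the original dict.
    return {**experience, "intrinsic_reward": bonus}


def td_target(experience, next_value, discount):
    # An older consumer reads only the keys it knows about, so the
    # extra "intrinsic_reward" key does not affect it.
    not_done = 0.0 if experience["terminal"] else 1.0
    return experience["reward"] + discount * next_value * not_done


base = make_experience(
    state=[0.0, 1.0], action=1, reward=1.0, next_state=[1.0, 0.0], terminal=False
)
extended = with_intrinsic_reward(base, bonus=0.25)

target_base = td_target(base, next_value=2.0, discount=0.5)
target_extended = td_target(extended, next_value=2.0, discount=0.5)
```

Both targets are identical because `td_target` ignores keys it does not use, while a newer component is free to read `intrinsic_reward` from the extended experience.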


Examples

Examples are in the examples directory. They take command line arguments to configure the algorithm and will log results using TensorBoard.

Implemented Examples

Flexible algorithms can be used with any base algorithm that supports them. Wrappers can be used with any algorithm and in combination with any number of other wrappers.

| Algorithm | Flexible | Wrapper | Recurrent | Description |
|---|---|---|---|---|
| SAC | | N/A | ✔️ | SAC auto temperature tuning and optional twin Q-networks, recurrent with R2D2 |
| DQN | | N/A | ✔️ | DQN with Rainbow features excluding noisy networks, dueling architecture and C51, recurrent with R2D2 |
| IQN | | N/A | ✔️ | IQN with Rainbow features excluding noisy networks, recurrent R2D2 |
| RND | ✔️ | ✔️ | N/A | RND excluding state normalization |
| MunchausenRL | ✔️ | ✔️ | N/A | MunchausenRL as seen in the literature |
| Ape-X | | ✔️ | N/A | Ape-X for multi-core machines with a single model shared across agents |
| R2D2 | | ✔️ | N/A | R2D2 with hidden state storing and burning in |