FlexRL

FlexRL is a deep online/offline reinforcement learning library inspired by and adapted from CleanRL and CORL. It provides single-file implementations of algorithms that these libraries do not necessarily cover. FlexRL offers the following features:

Consistent style across online and offline algorithms
Easy configuration with Pyrallis and a tqdm progress bar
A few custom environments exposed through the Gym API

Quick Start

Installing FlexRL

git clone https://github.com/alexchen-buaa/flexrl.git
cd flexrl
pip install -e .

Usage

Run each algorithm as an individual script. Like CORL, FlexRL uses Pyrallis for configuration management. Arguments can be specified on the command line, in a YAML file, or both:

python ppo.py --config_path=some_config.yaml
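The layering of configuration sources can be sketched with the standard library alone. This is not Pyrallis's implementation and not FlexRL's actual config class: `TrainConfig`, its fields, and `load_config` are hypothetical names chosen for illustration, and a plain dict stands in for the parsed YAML file. The sketch assumes the usual precedence, in which command-line values override file values, which in turn override the dataclass defaults.

```python
import argparse
from dataclasses import dataclass, fields, replace


@dataclass
class TrainConfig:
    # Hypothetical fields for illustration; not FlexRL's actual options.
    env_id: str = "CartPole-v1"
    learning_rate: float = 3e-4
    total_timesteps: int = 100_000


def load_config(file_values, argv):
    """Layer the sources: dataclass defaults < file values < command-line flags."""
    parser = argparse.ArgumentParser()
    for f in fields(TrainConfig):
        # default=None lets us tell "flag not given" apart from a real value.
        parser.add_argument(f"--{f.name}", type=f.type, default=None)
    parsed = vars(parser.parse_args(argv))
    cli_values = {k: v for k, v in parsed.items() if v is not None}
    # Later dict entries win, so command-line values override file values.
    return replace(TrainConfig(), **{**file_values, **cli_values})


# A dict standing in for the contents of a parsed YAML file.
file_values = {"learning_rate": 1e-3, "total_timesteps": 50_000}
cfg = load_config(file_values, ["--learning_rate=0.0005"])
print(cfg)
```

Here `env_id` keeps its default, `total_timesteps` comes from the file values, and `learning_rate` comes from the command line.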

Algorithms Implemented

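Type	Algorithm	Variants Implemented
Online	Proximal Policy Optimization (PPO)	ppo.py
Online	Proximal Policy Optimization (PPO)	ppo_atari.py
Online	Proximal Policy Optimization (PPO)	ppo_multidiscrete.py
Online	Deep Q-Networks (DQN)	dqn.py
Online	Deep Q-Networks (DQN)	dqn_atari.py
Online	Quantile-Regression DQN (QR-DQN)	qr_dqn.py
Online	Quantile-Regression DQN (QR-DQN)	qr_dqn_atari.py
Online	Soft Actor-Critic (SAC)	sac.py
Offline	Implicit Q-Learning (IQL)	iql.py
Offline	Implicit Q-Learning (IQL)	iql_jax.py
Offline	In-Sample Actor-Critic (InAC)	inac.py
Offline	In-Sample Actor-Critic (InAC)	inac_jax.py
Offline	Soft Actor-Critic Ensemble (SAC-N)	sac_n_jax.py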

Extra Requirements

Atari/ALE

As described in the Arcade Learning Environment documentation, you can import your ROMs with the command-line tool:

ale-import-roms roms/

MuJoCo

To use MuJoCo environments (for both online training and offline evaluation), you need to install MuJoCo first. See mujoco-py for instructions.

JAX with CUDA Support

To use JAX with CUDA support, you need to install the NVIDIA driver first. See JAX Installation for instructions.
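After installation, a quick check along the following lines can show which backend JAX picked up. This is a minimal sketch rather than part of FlexRL: `jax_backend` is a hypothetical helper, and it returns `None` when JAX is not installed so the snippet runs in any environment.

```python
import importlib.util


def jax_backend():
    """Return the name of JAX's default backend, or None if JAX is missing."""
    if importlib.util.find_spec("jax") is None:
        return None
    import jax

    # jax.default_backend() reports the platform JAX will use by default.
    return jax.default_backend()


backend = jax_backend()
print("JAX backend:", backend)
```

If the reported backend is the CPU on a machine with an NVIDIA GPU, the CUDA-enabled JAX build or the driver is probably not set up correctly.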

References

[1] S. Huang, R. F. J. Dossa, C. Ye, and J. Braga, “CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms.” arXiv, Nov. 16, 2021. Accessed: Nov. 21, 2022. [Online]. Available: http://arxiv.org/abs/2111.08819
[2] A. Raffin, A. Hill, A. Gleave, A. Kanervisto, M. Ernestus, and N. Dormann, “Stable-Baselines3: Reliable Reinforcement Learning Implementations,” Journal of Machine Learning Research, vol. 22, no. 268, pp. 1–8, 2021.
[3] W. Dabney, M. Rowland, M. G. Bellemare, and R. Munos, “Distributional Reinforcement Learning with Quantile Regression,” arXiv:1710.10044 [cs, stat], Oct. 2017, Accessed: Apr. 15, 2022. [Online]. Available: http://arxiv.org/abs/1710.10044
[4] I. Kostrikov, A. Nair, and S. Levine, “Offline Reinforcement Learning with Implicit Q-Learning.” arXiv, Oct. 12, 2021. Accessed: Mar. 29, 2023. [Online]. Available: http://arxiv.org/abs/2110.06169
[5] C. Xiao, H. Wang, Y. Pan, A. White, and M. White, “The In-Sample Softmax for Offline Reinforcement Learning.” arXiv, Feb. 28, 2023. Accessed: Apr. 02, 2023. [Online]. Available: http://arxiv.org/abs/2302.14372
