Implementing RL algorithms for personal development

dxyang/pytorch-RL

Reinforcement Learning Models in PyTorch

Description

This repo contains PyTorch implementations of reinforcement learning models for personal skill development.

Background

REINFORCE is a policy gradient method that computes the policy gradient at the end of every episode and updates the agent's parameters accordingly.
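As a rough illustration of the end-of-episode update, the sketch below computes discounted returns for one finished episode and the surrogate loss whose gradient gives the REINFORCE estimate. It is not taken from this repo: the function names `discounted_returns` and `reinforce_loss` and the discount factor `gamma=0.99` are assumptions made for the example, and plain Python floats stand in for the tensors a PyTorch agent would use.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * r_{t+1} + ... for each step of one episode.

    `gamma` is an assumed discount factor, not a value from this repo.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """Surrogate loss: minimizing it follows the REINFORCE policy gradient.

    `log_probs` holds log pi(a_t | s_t) for the actions actually taken.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Hypothetical three-step episode.
episode_rewards = [1.0, 1.0, 1.0]
episode_log_probs = [-0.5, -1.0, -0.25]
returns = discounted_returns(episode_rewards, gamma=0.5)
loss = reinforce_loss(episode_log_probs, returns)
```

If `log_probs` were PyTorch tensors produced by the policy network, the same expression would yield a scalar tensor that can be backpropagated once per episode.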

Deep Deterministic Policy Gradient (DDPG) is an off-policy, actor-critic policy gradient method. As in Deep Q-Learning, a target model and a current model are kept for both the actor (policy) and the critic (value) functions, and each target model is gradually updated toward its current model. DDPG also uses an experience replay buffer. The critic's loss is computed from the temporal difference (TD) error, and the actor is updated through the critic's value estimates.
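The three ingredients named above (replay buffer, TD target, gradual target update) can be sketched in plain Python as follows. This is an illustrative outline under stated assumptions, not this repo's implementation: `ReplayBuffer`, `td_target`, and `soft_update` are hypothetical names, the defaults `gamma=0.99` and `tau=0.001` are assumed hyperparameters, and dictionaries of floats stand in for network parameters.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions (hypothetical helper)."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target: r + gamma * Q_target(s', mu_target(s')).

    `next_q` is the target critic's estimate for the next state; the
    bootstrap term is dropped when the episode has ended.
    """
    return reward + gamma * (1.0 - float(done)) * next_q


def soft_update(target_params, current_params, tau=0.001):
    """Move each target parameter a fraction `tau` toward the current one."""
    for name, value in current_params.items():
        target_params[name] = tau * value + (1.0 - tau) * target_params[name]


# Hypothetical usage with scalar stand-ins for states and parameters.
buffer = ReplayBuffer(capacity=2)
for step in range(3):
    buffer.push(step, 0.0, 1.0, step + 1, False)
batch = buffer.sample(2)

target = {"w": 0.0}
soft_update(target, {"w": 1.0}, tau=0.5)
```

The TD error for the critic is then the difference between `td_target(...)` and the current critic's estimate for the sampled state and action; a small `tau` keeps the targets slowly moving, which is the "gradual update" described above.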

Dependencies

Results

DDPG

Reward per episode on HalfCheetah-v1

Visualization of learned policy on HalfCheetah-v1

Useful References
