- Use actor-critic as the RL framework
- Fix a reward that was always negative, which, combined with log probabilities, caused convergence to the wrong policy
Benchmarks will come soon; do not use the trained models from this release.