Code accompanying the paper "Score Regularized Policy Optimization through Diffusion Behavior" (ICLR 2024).
reinforcement-learning
offline
rl
generative
diffusion
score-based-models
d4rl
srpo
behavior-regularization
Updated Feb 10, 2024 - Python