Why is the soft update rate of target networks (α) set to 0.1? #59

Answered by Cryolite
Koyonomi asked this question in Q&A
You are absolutely correct. For instance, the experimental settings in the original CQL and IQL papers use a target network update rate (Polyak averaging coefficient) of 0.005, while the default value in kanachan is 0.1, which is significantly larger. This is simply an error with no particular intent behind it, and the default value should be changed. The change will be reflected in the develop branch promptly.
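To make the difference between the two values concrete, here is a minimal sketch of a soft (Polyak) target update in plain Python. It is not kanachan's actual code: the function names `soft_update` and `steps_to_close_gap` are hypothetical, and it assumes the common convention in which α weights the online network's parameters, i.e. target ← α · online + (1 − α) · target.

```python
import math


def soft_update(target, online, alpha):
    """Return the target parameters after one Polyak averaging step.

    Assumed convention: new_target = alpha * online + (1 - alpha) * target.
    Parameters are plain lists of floats to keep the sketch self-contained.
    """
    return [alpha * o + (1.0 - alpha) * t for o, t in zip(online, target)]


def steps_to_close_gap(alpha, fraction=0.5):
    """Number of soft updates needed for the target to close `fraction`
    of its distance to a *fixed* online network.

    After n updates the remaining gap is (1 - alpha) ** n, so we need
    the smallest n with (1 - alpha) ** n <= 1 - fraction.
    """
    return math.ceil(math.log(1.0 - fraction) / math.log(1.0 - alpha))


if __name__ == "__main__":
    # One update step moves the target 10% of the way with alpha = 0.1.
    print(soft_update([0.0], [1.0], 0.1))
    # Compare how quickly the target tracks the online network.
    for alpha in (0.1, 0.005):
        print(alpha, steps_to_close_gap(alpha))
```

Under these assumptions, a target updated with α = 0.1 closes half of the gap to the online network in 7 updates, whereas α = 0.005 takes 139 updates, so the larger value makes the target follow the online network roughly twenty times faster.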

Answer selected by Koyonomi