-
In other open-source projects, this value is usually set within the range [0.001, 0.01], so Kanachan's setting of 0.1 seems significantly larger. Is that because smaller values make training very difficult, or is it solely to accelerate convergence?
-
You are absolutely correct. For instance, the experiments in the original CQL and IQL papers used 0.005 for the target network update ratio (Polyak averaging coefficient), while the default value in kanachan is 0.1, which is significantly larger. This is simply an error without any particular intent, and the default value should be changed. The change will be reflected in the `develop` branch promptly.
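
To give a rough sense of how much the coefficient matters, here is a minimal sketch of a Polyak (soft) target update in plain Python. It is not kanachan's actual code: the names `soft_update` and `steps_to_halve_gap` are hypothetical, and it assumes the common convention in which the coefficient `tau` weights the online parameters.

```python
def soft_update(target, online, tau):
    """Polyak averaging: new_target = (1 - tau) * target + tau * online.

    Assumes tau weights the online parameters (hypothetical helper,
    not taken from kanachan).
    """
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]


def steps_to_halve_gap(tau):
    """Count soft updates until the target closes half of its distance
    to an online parameter that is held fixed at 1.0."""
    target, online = [0.0], [1.0]
    steps = 0
    while online[0] - target[0] > 0.5:
        target = soft_update(target, online, tau)
        steps += 1
    return steps


if __name__ == "__main__":
    for tau in (0.1, 0.005):
        print(f"tau={tau}: {steps_to_halve_gap(tau)} updates to halve the gap")
```

In this toy setting the target closes half the gap in 7 updates with 0.1 and in 139 updates with 0.005, so the larger coefficient makes the target network follow the online network roughly 20 times faster. That suggests, without proving it for any particular training run, that 0.1 gives a much less stable bootstrapping target than the values used in the papers.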