Running the last step gives this warning, how should I change the hyperparameters? #2
Comments
Everything before that step ran fine for you, right?
In the code for the last step, lora_config needs target_modules added; the setting given by the trl author is target_modules=["q_proj","k_proj"]. With that, training runs, but the KL divergence becomes negative:
I tried setting eos_token_id in generation_kwargs to -1, and also calling ppo_trainer.model.eval() before generating responses during training (switching back to train() for the PPO step), but neither solved the problem.
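For reference, a minimal sketch of the changes discussed in this comment. Only target_modules and eos_token_id come from the thread; the other values and names are illustrative assumptions, not the repository's actual settings.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                        # placeholder rank
    lora_alpha=16,              # placeholder scaling
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj"],  # setting quoted from the trl author
)

# Generation kwargs with eos_token_id=-1, the workaround tried above:
# -1 disables early stopping on EOS, so responses run to max_new_tokens.
generation_kwargs = {
    "do_sample": True,
    "top_k": 0,
    "top_p": 1.0,
    "eos_token_id": -1,
    "max_new_tokens": 128,      # placeholder length
}

# Inside the training loop (ppo_trainer assumed to be a trl PPOTrainer):
# ppo_trainer.model.eval()                  # before generating responses
# response = ppo_trainer.generate(query, **generation_kwargs)
# ppo_trainer.model.train()                 # back to train mode for the PPO step
# stats = ppo_trainer.step([query], [response.squeeze()], [reward])
```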
Hi, could you share the package versions in your Python environment?
/root/miniconda3/envs/Vicuna/lib/python3.8/site-packages/trl/trainer/ppo_trainer.py:1088: UserWarning: KL divergence is starting to become negative: -0.00 - this might be a precursor for failed training. sometimes this happens because the generation kwargs are not correctly set. Please make sure that the generation kwargs are set correctly, or review your training hyperparameters