requirement.txt problem #5

Open
Elon-Lau opened this issue Jul 25, 2024 · 3 comments

Comments

@Elon-Lau

Hello, Dr. Yang! I encountered the following error when using the configuration (requirement.txt) you provided. What could be the cause?
[rank0]: Traceback (most recent call last):
[rank0]:   File "/root/RiC-main/sft/sft.py", line 83, in <module>
[rank0]:     model = AutoModelForCausalLM.from_pretrained(
[rank0]:   File "/data/anaconda3/envs/ric/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
[rank0]:     return model_class.from_pretrained(
[rank0]:   File "/data/anaconda3/envs/ric/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3916, in from_pretrained
[rank0]:     ) = cls._load_pretrained_model(
[rank0]:   File "/data/anaconda3/envs/ric/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4390, in _load_pretrained_model
[rank0]:     new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
[rank0]:   File "/data/anaconda3/envs/ric/lib/python3.10/site-packages/transformers/modeling_utils.py", line 945, in _load_state_dict_into_meta_model
[rank0]:     value = type(value)(value.data.to("cpu"), **value.__dict__)
[rank0]:   File "/data/anaconda3/envs/ric/lib/python3.10/site-packages/bitsandbytes/nn/modules.py", line 491, in __new__
[rank0]:     obj = torch.Tensor._make_subclass(cls, data, requires_grad)
[rank0]: RuntimeError: Only Tensors of floating point and complex dtype can require gradients
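For context, the RuntimeError itself is a generic PyTorch constraint: integer tensors cannot require gradients, and the 8-bit parameters rebuilt by bitsandbytes during loading trip exactly that check. A minimal illustration of the constraint (not the RiC code, just the underlying rule):

import torch

# Integer tensors cannot carry gradients; the load above fails because the
# quantized int8 parameter is re-created with requires_grad still set to True.
t = torch.zeros(2, dtype=torch.int8)
try:
    t.requires_grad_(True)
except RuntimeError as e:
    print(e)  # Only Tensors of floating point and complex dtype can require gradients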

@YangRui2015
Owner

Hi, I have verified that I can run the configuration file successfully. Could you please provide more details on how you are executing the sft.py file and the package versions of accelerate, bitsandbytes, transformers, and peft?

Based on a related issue (bitsandbytes-foundation/bitsandbytes#1232), it seems that the problem might be due to an outdated version of bitsandbytes. You may need to update it.
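A quick way to report those versions from inside the environment (a small sketch using the standard-library importlib.metadata; it assumes the packages are installed under their usual PyPI names):

from importlib.metadata import version

# Print the installed versions of the packages asked about above.
for pkg in ("accelerate", "bitsandbytes", "transformers", "peft"):
    print(pkg, version(pkg))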

@Elon-Lau
Author

Elon-Lau commented Jul 26, 2024

Hi, thank you for your answer! I ran into the same problem when running both RiC and SFT; here are my commands.

CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch main.py --train_dataset_path './datasets/train_harmhelp.hf' --exp_type 'assistant' --reward_names 'harmless,helpful' --training_steps 20000 --num_online_iterations 0 --wandb_name 'ric_assistant_harmlesshelpful_offline20000' --batch_size 2 --load_in_8bit True

CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch sft.py --base_model_name 'meta-llama/Llama-2-7b-hf' --exp_type 'summary'

The versions of accelerate, bitsandbytes, transformers, peft, trl, torch, and CUDA are 0.32.1, 0.43.2, 4.40.0, 0.11.1, 0.9.4, 2.3.1, and 12.0, respectively. In addition, I'm confused about --wandb_name {name_of_the_experiment}. Should it be in the format helpful_assistant and reddit_summary?

@YangRui2015
Owner

I cannot reproduce your issue with the following configuration:

transformers             4.40.0
trl                      0.9.4
peft                     0.11.1
accelerate               0.32.1
bitsandbytes             0.43.1
deepspeed                0.14.4
torch                    2.3.1
CUDA                     cuda_12.1

I can run this code successfully. Please first check whether you can run the sft example from trl: https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py
[Screenshot (2024-07-27 02:30:16) of the successful run]
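Before the full trl example, a bare 8-bit load of the base model is a reasonable smoke test (a sketch only; the model id is taken from the sft command above, and BitsAndBytesConfig is the transformers-side way to request load_in_8bit):

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

name = "meta-llama/Llama-2-7b-hf"  # same base model as the sft command above
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
# If this load fails with the same RuntimeError, the problem is in the
# environment (most likely bitsandbytes), not in the RiC scripts.
out = model.generate(**tok("Hello", return_tensors="pt").to(model.device), max_new_tokens=5)
print(tok.decode(out[0]))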
