Merging the ChatGLM2-6b model fails with ValueError: We need an offload_dir to dispatch this model according to this device_map, the following submodules need to be offloaded #50


Hysy11 commented Apr 3, 2024

(venv) PS C:\MyFiles\AI\model\chatglm2> python merge_lora_and_quantize.py --lora_path saved_files/chatGLM_6B_QLoRA_t32 --output_path /tmp/merged_qlora_model_4bit --remote_scripts_dir remote_scripts/chatglm2-6b --qbits 4
Loading checkpoint shards: 100%|██████████| 7/7 [00:07<00:00,  1.10s/it]
WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu.
Traceback (most recent call last):
  File "C:\MyFiles\AI\model\chatglm2\merge_lora_and_quantize.py", line 80, in <module>
    main(lora_path=args.lora_path,
  File "C:\MyFiles\AI\model\chatglm2\merge_lora_and_quantize.py", line 54, in main
    merged_model, lora_config = merge_lora(lora_path, device_map)
  File "C:\MyFiles\AI\model\chatglm2\merge_lora_and_quantize.py", line 28, in merge_lora
    model = PeftModel.from_pretrained(base_model, lora_path, device_map=device_map)
  File "C:\MyFiles\AI\model\chatglm2\venv\lib\site-packages\peft\peft_model.py", line 181, in from_pretrained
    model.load_adapter(model_id, adapter_name, **kwargs)
  File "C:\MyFiles\AI\model\chatglm2\venv\lib\site-packages\peft\peft_model.py", line 406, in load_adapter
    dispatch_model(
  File "C:\MyFiles\AI\model\chatglm2\venv\lib\site-packages\accelerate\big_modeling.py", line 374, in dispatch_model
    raise ValueError(
ValueError: We need an `offload_dir` to dispatch this model according to this `device_map`, the following submodules need to be offloaded: base_model.model.transformer.encoder.layers.12, base_model.model.transformer.encoder.layers.13, base_model.model.transformer.encoder.layers.14, base_model.model.transformer.encoder.layers.15, base_model.model.transformer.encoder.layers.16, base_model.model.transformer.encoder.layers.17, base_model.model.transformer.encoder.layers.18, base_model.model.transformer.encoder.layers.19, base_model.model.transformer.encoder.layers.20, base_model.model.transformer.encoder.layers.21, base_model.model.transformer.encoder.layers.22, base_model.model.transformer.encoder.layers.23, base_model.model.transformer.encoder.layers.24, base_model.model.transformer.encoder.layers.25, base_model.model.transformer.encoder.layers.26, base_model.model.transformer.encoder.layers.27, base_model.model.transformer.encoder.final_layernorm, base_model.model.transformer.output_layer.
(venv) PS C:\MyFiles\AI\model\chatglm2> 
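For context, this ValueError comes from accelerate's dispatch_model: whenever the device_map assigns any submodule to "disk", dispatch_model refuses to proceed without a directory to write the offloaded weights to. A minimal sketch of the mechanism on a toy module (the toy model and the ./offload directory are illustrative, not from the repo):

```python
import torch.nn as nn
from accelerate import dispatch_model

toy = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8))
disk_map = {"0": "cpu", "1": "disk"}  # any "disk" entry triggers the check

try:
    dispatch_model(toy, device_map=disk_map)  # no offload_dir given
except ValueError as e:
    print(e)  # "We need an `offload_dir` to dispatch this model ..."

# Supplying a writable directory satisfies the check; the weights of
# submodule "1" are then saved under ./offload and loaded on demand.
dispatch_model(toy, device_map=disk_map, offload_dir="./offload")
```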

Base model

chatglm2-6b

GPU memory

16 GB

Relevant dependency versions:

accelerate==0.28.0
transformers==4.38.2
peft==0.3.0

What I tried:

  1. Swapping in different versions of the dependencies.
     The result did not change.

  2. Adding an offload_dir argument in merge_lora_and_quantize.py. The screenshot of that change is not preserved; a hedged reconstruction is sketched below, and the error it produced follows.
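A hypothetical reconstruction of the change the screenshot likely showed, assuming the argument was added to the PeftModel.from_pretrained call inside merge_lora (peft forwards it to accelerate's dispatch_model; depending on the peft version the kwarg is spelled offload_dir or offload_folder):

```python
# merge_lora_and_quantize.py, inside merge_lora() -- hypothetical reconstruction
model = PeftModel.from_pretrained(
    base_model,
    lora_path,
    device_map=device_map,
    offload_dir="offload",  # assumption: any writable directory; offload_folder in newer peft
)
```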

(venv) PS C:\MyFiles\AI\model\chatglm2> python merge_lora_and_quantize.py --lora_path saved_files/chatGLM_6B_QLoRA_t32 --output_path /tmp/merged_qlora_model_4bit --remote_scripts_dir remote_scripts/chatglm2-6b --qbits 4
Loading checkpoint shards: 100%|██████████| 7/7 [00:07<00:00,  1.07s/it]
WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu.
WARNING:root:Some parameters are on the meta device device because they were offloaded to the disk and cpu.
Traceback (most recent call last):
  File "C:\MyFiles\AI\model\chatglm2\merge_lora_and_quantize.py", line 80, in <module>
    main(lora_path=args.lora_path,
  File "C:\MyFiles\AI\model\chatglm2\merge_lora_and_quantize.py", line 56, in main
    quantized_model = quantize(merged_model, qbits)
  File "C:\MyFiles\AI\model\chatglm2\merge_lora_and_quantize.py", line 35, in quantize
    qmodel = model.quantize(qbits).half().cuda()
  File "C:\Users\71977\.cache\huggingface\modules\transformers_modules\chatglm2-6b\modeling_chatglm.py", line 1197, in quantize
    self.transformer.encoder = quantize(self.transformer.encoder, bits, empty_init=empty_init, device=device,
  File "C:\Users\71977\.cache\huggingface\modules\transformers_modules\chatglm2-6b\quantization.py", line 157, in quantize
    weight=layer.self_attention.query_key_value.weight.to(torch.cuda.current_device()),
NotImplementedError: Cannot copy out of meta tensor; no data!
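The second traceback points at the underlying problem: once layers are offloaded, their parameters on the model object are meta tensors (shape-only placeholders with no data), so ChatGLM's quantization.py fails when it calls .to(torch.cuda.current_device()) on them. One way around this, assuming the machine has roughly 13 GB of free RAM for the fp16 weights, is to avoid offloading entirely: load and merge on CPU so every tensor is materialized, then quantize. A minimal sketch (the model id, adapter path, and merge call are assumptions based on the command-line arguments above, not the repo's exact code):

```python
import torch
from transformers import AutoModel
from peft import PeftModel

# Pin everything to CPU so no parameter ever lands on the meta device.
base_model = AutoModel.from_pretrained(
    "THUDM/chatglm2-6b",                 # assumption: or a local path to the base model
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map={"": "cpu"},
)
model = PeftModel.from_pretrained(
    base_model,
    "saved_files/chatGLM_6B_QLoRA_t32",  # the adapter path from the command line
    device_map={"": "cpu"},
)
merged = model.merge_and_unload()        # fold the LoRA deltas into the base weights

# quantize() now sees real tensors; it moves them to the GPU itself,
# mirroring the qmodel = model.quantize(qbits).half().cuda() step in the script.
qmodel = merged.quantize(4).half().cuda()
```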