[2024-05-16 13:48:21,126] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 4/4 [01:31<00:00, 22.93s/it]
Some weights of the model checkpoint at work_dirs/llama-vid/llama-vid-7b-full-224-long-video-MovieLLM were not used when initializing LlavaLlamaAttForCausalLM: ['model.vision_tower.vision_tower.blocks.34.attn.v_bias', 'model.vlm_att_encoder.bert.encoder.layer.10.attention.output.LayerNorm.weight', 'model.vision_tower.vision_tower.blocks.1.norm1.weight', 'model.vision_tower.vision_tower.blocks.17.attn.q_bias', 'model.vlm_att_encoder.bert.encoder.layer.0.output.LayerNorm.bias', 'model.vision_tower.vision_tower.blocks.10.attn.proj.weight', 'model.vlm_att_encoder.bert.encoder.layer.5.output_query.LayerNorm.weight', 'model.vision_tower.vision_tower.blocks.13.attn.q_bias', 'too many data']
- This IS expected if you are initializing LlavaLlamaAttForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlavaLlamaAttForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
_IncompatibleKeys(missing_keys=[], unexpected_keys=['norm.weight', 'norm.bias', 'head.weight',......too many data']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Freezing all qformer weights...
Loading pretrained weights...
Loading vlm_att_query weights...
Loading vlm_att_ln weights...
Text with video
> Input token num: 32096
This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (4096). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.
Traceback (most recent call last):
File "/root/autodl-tmp/autodl-tmp/MovieLLM-code/LLaMA-VID/llamavid/serve/run_llamavid_movie.py", line 112, in <module>
run_inference(args)
File "/root/autodl-tmp/autodl-tmp/MovieLLM-code/LLaMA-VID/llamavid/serve/run_llamavid_movie.py", line 87, in run_inference
output_ids = model.generate(
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/transformers/generation/utils.py", line 1588, in generate
return self.sample(
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/transformers/generation/utils.py", line 2642, in sample
outputs = self(
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/autodl-tmp/autodl-tmp/MovieLLM-code/LLaMA-VID/llamavid/model/language_model/llava_llama_vid.py", line 85, in forward
outputs = self.model(
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 693, in forward
layer_outputs = decoder_layer(
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 408, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/MovieLLM/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = old_forward(*args, **kwargs)
File "/root/autodl-tmp/autodl-tmp/MovieLLM-code/LLaMA-VID/llamavid/train/llama_flash_attn_monkey_patch.py", line 157, in forward_inference
v = torch.cat([past_key_value[1].transpose(1, 2), v], dim=1)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 252.00 MiB (GPU 0; 47.50 GiB total capacity; 41.09 GiB already allocated; 132.56 MiB free; 47.02 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
(MovieLLM) root@autodl-container-307c46a8f1-f6e37430:~/autodl-tmp/autodl-tmp/MovieLLM-code/LLaMA-VID#
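For context, the crash happens inside the monkey-patched attention while the cached keys/values are concatenated during decoding, so memory grows with every generated token on top of a 32096-token prompt that already exceeds the 4096-token limit reported in the warning above. Below is a minimal sketch of the two mitigations the log itself points at; the allocator setting comes straight from the error message, while the `model` variable and the frame-subsampling idea are only illustrative and are not part of run_llamavid_movie.py:

```python
# A minimal sketch, not part of the original run_llamavid_movie.py.
import os

# The OOM message itself suggests capping the allocator's split size to reduce
# fragmentation; this must be set before torch initializes CUDA.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU 0: {props.total_memory / 2**30:.1f} GiB total")

# After the model is loaded (the `model` name is hypothetical here), the context
# limit from the warning above can be checked directly; with 32096 prompt tokens
# against 4096 positions, the video prompt has to be shortened (e.g. by sampling
# fewer frames) or the context window extended.
# print(model.config.max_position_embeddings)
```

Even with the allocator hint, decoding a 32096-token prompt will most likely still exhaust the KV cache budget on a single 48 GiB card, so reducing the number of sampled frames (and thus the prompt length) is probably the decisive fix.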