Abstract 🔥
When running run-all.sh on multiple GPUs, some tasks (dependency parsing) do not work correctly.
error-message:
RuntimeError: The size of tensor a (23) must match the size of tensor b (25) at non-singleton dimension 2
How to Reproduce 🤔
[python==3.7.11]

```shell
git clone --recursive https://github.com/KLUE-benchmark/KLUE-Baseline.git
pip install -r requirements.txt
# torch build chosen to match the local CUDA version
pip install torch==1.7.0+cu110 -f https://download.pytorch.org/whl/torch_stable.html
```
Modified run-all.sh (KLUE-DP section, `task="klue-dp"`) from:

```shell
python run_klue.py train --task ${task} --output_dir ${OUTPUT_DIR} --data_dir ${DATA_DIR}/${task}-${VERSION} --model_name_or_path klue/roberta-large --learning_rate 5e-5 --num_train_epochs 15 --gradient_accumulation_steps 1 --warmup_ratio 0.2 --train_batch_size 32 --patience 10000 --max_seq_length 256 --metric_key uas_macro_f1 --gpus 0 --num_workers 4
```

to:

```shell
python run_klue.py train --task ${task} --output_dir ${OUTPUT_DIR} --data_dir ${DATA_DIR}/${task}-${VERSION} --model_name_or_path klue/roberta-large --learning_rate 3e-5 --num_train_epochs 10 --train_batch_size 16 --eval_batch_size 16 --max_seq_length 510 --gradient_accumulation_steps 2 --warmup_ratio 0.2 --weight_decay 0.01 --max_grad_norm 1.0 --patience 100000 --metric_key slot_micro_f1 --gpus 1 2 3 --num_workers 8
```
bash run-all.sh
RuntimeError: The size of tensor a (23) must match the size of tensor b (25) at non-singleton dimension 2
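For reference, this RuntimeError is a plain tensor-broadcasting failure, which typically appears when two tensors padded to different sequence lengths are combined. A minimal sketch (unrelated to the KLUE code; sizes chosen only to match the message) reproduces the same error:

```python
import torch

# Two tensors whose last dimensions differ the way the error message
# describes (23 vs 25); the shapes here are illustrative, not from KLUE.
a = torch.zeros(2, 4, 23)
b = torch.zeros(2, 4, 25)

try:
    a + b  # broadcasting fails at non-singleton dimension 2
except RuntimeError as e:
    print(e)
    # The size of tensor a (23) must match the size of tensor b (25)
    # at non-singleton dimension 2
```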
How to solve 🙋♀
On a single GPU, training the RoBERTa-Large model is not possible because it runs out of memory, so I am reaching out in case you can help!
Thank you.
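If the goal is simply to fit RoBERTa-Large on a single GPU, one common memory workaround (a generic PyTorch sketch, not this repo's code) is gradient accumulation: run several small micro-batches and step the optimizer once, keeping the effective batch size while lowering peak memory.

```python
import torch

# Toy model standing in for a large transformer (hypothetical sizes).
model = torch.nn.Linear(8, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

full_batch = torch.randn(32, 8)   # desired effective batch size: 32
target = torch.randn(32, 2)
accum_steps = 4                   # 4 micro-batches of 8 fit in less memory

opt.zero_grad()
for i in range(accum_steps):
    mb = full_batch[i * 8:(i + 1) * 8]
    tb = target[i * 8:(i + 1) * 8]
    # Scale the loss so accumulated gradients match one large batch.
    loss = torch.nn.functional.mse_loss(model(mb), tb) / accum_steps
    loss.backward()               # gradients accumulate across micro-batches
opt.step()                        # one optimizer step per effective batch
```

In terms of the commands above, this corresponds to lowering `--train_batch_size` while raising `--gradient_accumulation_steps` proportionally; both flags already appear in the run_klue.py invocations.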