add aot custom op to accelerate computing bbox_iou on GPU #178
@@ -0,0 +1,183 @@
/**
Please do not add a copyright notice.
All copyright notices have been removed.
@@ -0,0 +1,272 @@
import os
Does this need to be compiled in advance? If so, should the install README explain how?
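- The check for the .so files and the compile command were added to __init__.py to make debugging from a local path easier.
- setup.py has been revised: running python setup.py install or package.sh now compiles the .so files automatically and then performs the installation.
- A standalone build script has been added, and the install README now explains how to compile the .so files manually.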
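For reference, the manual compile step could look roughly like the sketch below. The nvcc flags are copied from the command lines printed in the log further down this thread; the helper names `build_nvcc_cmd` and `compile_kernels` are hypothetical and are not taken from the PR's build script, which may be organized differently.

```python
import subprocess
from pathlib import Path


def build_nvcc_cmd(cu_file):
    """Return the nvcc command that compiles one .cu file into a .so next to it."""
    cu_path = Path(cu_file)
    so_path = cu_path.with_suffix(".so")
    # Same flags as the "nvcc compiler cmd" lines in the training log.
    return ["nvcc", "--shared", "-Xcompiler", "-fPIC", "-o", str(so_path), str(cu_path)]


def compile_kernels(op_dir, dry_run=False):
    """Compile every *.cu file in op_dir; with dry_run=True only return the commands."""
    cmds = [build_nvcc_cmd(path) for path in sorted(Path(op_dir).glob("*.cu"))]
    if not dry_run:
        for cmd in cmds:
            # check=True raises CalledProcessError if nvcc exits with an error.
            subprocess.run(cmd, check=True)
    return cmds
```

Calling `compile_kernels("mindyolo/models/losses/fused_op", dry_run=True)` would list the commands without requiring nvcc to be installed.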
@@ -81,6 +81,8 @@ def get_parser_train(parents=None):
                        help="ModelArts: local device path to dataset folder")
    parser.add_argument("--ckpt_dir", type=str, default="/cache/pretrain_ckpt/",
                        help="ModelArts: local device path to checkpoint folder")
    parser.add_argument("--use_fused_op", type=ast.literal_eval, default=False,
Should there be a validation step here that blocks every device other than GPU?
A check has been added to the set_default method: when the device is not GPU, it emits a warning and sets args.use_fused to False.
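A minimal sketch of such a check is shown below. It is not the code from set_default; the function name `disable_fused_op_off_gpu` is hypothetical, and the attribute names `device_target` and `use_fused_op` are assumed from the argument names that appear in the training log in this thread.

```python
import warnings
from types import SimpleNamespace


def disable_fused_op_off_gpu(args):
    """Turn off use_fused_op, with a warning, when device_target is not GPU."""
    wants_fused = getattr(args, "use_fused_op", False)
    on_gpu = getattr(args, "device_target", None) == "GPU"
    if wants_fused and not on_gpu:
        warnings.warn("use_fused_op is only supported on GPU; setting it to False.")
        args.use_fused_op = False
    return args


# Example: a non-GPU target silently falls back to the unfused path.
args = disable_fused_op_off_gpu(SimpleNamespace(device_target="Ascend", use_fused_op=True))
print(args.use_fused_op)  # False
```

Warning and falling back, rather than raising, keeps a shared config file usable across device types.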
3abaf62 to 778d277
mindyolo/models/losses/iou_loss.py
Outdated
from .fused_op import fused_get_ciou_op_path, fused_get_ciou_op_bprop_path, fused_get_center_dist_op_path, \
    fused_get_center_dist_op_bprop_path, fused_get_ciou_gpu_info, fused_get_ciou_bprop_gpu_info, \
    fused_get_center_dist_gpu_info, fused_get_center_dist_bprop_gpu_info, fused_get_convex_diagonal_squared_info, \
    fused_get_convex_diagonal_squared_path, fused_get_convex_diagonal_squared_grad_path, \
    fused_get_boundding_boxes_coord_path, fused_get_boundding_boxes_coord_grad_path, \
    fused_get_intersection_area_path, fused_get_intersection_area_grad_path, \
    fused_get_convex_diagonal_squared_grad_info, fused_get_ciou_diagonal_angle_info,\
    fused_get_ciou_diagonal_angle_grad_info, fused_get_ciou_diagonal_angle_grad_path, fused_get_ciou_diagonal_angle_path, \
    fused_get_iou_op_path, fused_get_iou_op_bprop_path, fused_get_iou_gpu_info, fused_get_iou_bprop_gpu_info, \
    fused_get_boundding_boxes_coord_gpu_info, fused_get_boundding_boxes_coord_bprop_gpu_info, \
    fused_get_intersection_area_gpu_info, fused_get_intersection_area_gpu_grad_info
If the user has not compiled fused_op, will this import raise an error? Should it be wrapped in try/except?
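If the user has not compiled the .so libraries that fused_op depends on with setup.py or build.sh, compilation starts automatically at import time. The __init__.py file of fused_op checks on import whether the .so libraries exist: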
root@mindspore:/data0/perf/git/mindyolo# rm -rf mindyolo/models/losses/fused_op/*.so
(perf) root@mindspore:/data0/perf/git/mindyolo# python train.py --config ./configs/yolov7/yolov7.yaml
/root/miniconda3/envs/perf/lib/python3.8/site-packages/scipy/__init__.py:143: UserWarning: A NumPy version >=1.19.5 and <1.27.0 is required for this version of SciPy (detected version 1.19.0)
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_ciou_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_ciou_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_center_dist_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_center_dist_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_convex_diagonal_squared_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_convex_diagonal_squared_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_iou_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_iou_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_ciou_diagonal_angle_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_ciou_diagonal_angle_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_boundding_boxes_coord_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_boundding_boxes_coord_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_intersection_area_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_intersection_area_kernel.cu
2023-08-26 23:40:41,416 [INFO] parse_args:
2023-08-26 23:40:41,416 [INFO] device_target GPU
2023-08-26 23:40:41,416 [INFO] save_dir ./runs/2023.08.26-23.40.41
2023-08-26 23:40:41,416 [INFO] device_per_servers 8
2023-08-26 23:40:41,416 [INFO] log_level INFO
2023-08-26 23:40:41,416 [INFO] is_parallel False
2023-08-26 23:40:41,416 [INFO] ms_mode 0
2023-08-26 23:40:41,416 [INFO] ms_amp_level O0
2023-08-26 23:40:41,416 [INFO] keep_loss_fp32 True
2023-08-26 23:40:41,416 [INFO] ms_loss_scaler static
2023-08-26 23:40:41,416 [INFO] ms_loss_scaler_value 1024.0
2023-08-26 23:40:41,416 [INFO] ms_grad_sens 1024.0
2023-08-26 23:40:41,416 [INFO] ms_jit True
2023-08-26 23:40:41,416 [INFO] ms_enable_graph_kernel False
2023-08-26 23:40:41,416 [INFO] ms_datasink False
2023-08-26 23:40:41,416 [INFO] overflow_still_update True
2023-08-26 23:40:41,416 [INFO] ema True
2023-08-26 23:40:41,416 [INFO] weight
2023-08-26 23:40:41,416 [INFO] ema_weight
2023-08-26 23:40:41,416 [INFO] freeze []
2023-08-26 23:40:41,416 [INFO] epochs 300
2023-08-26 23:40:41,416 [INFO] per_batch_size 16
2023-08-26 23:40:41,416 [INFO] img_size 640
2023-08-26 23:40:41,416 [INFO] nbs 64
2023-08-26 23:40:41,416 [INFO] accumulate 1
2023-08-26 23:40:41,416 [INFO] auto_accumulate False
2023-08-26 23:40:41,416 [INFO] log_interval 100
2023-08-26 23:40:41,416 [INFO] single_cls False
2023-08-26 23:40:41,416 [INFO] sync_bn False
2023-08-26 23:40:41,416 [INFO] keep_checkpoint_max 100
2023-08-26 23:40:41,416 [INFO] run_eval False
2023-08-26 23:40:41,416 [INFO] conf_thres 0.001
2023-08-26 23:40:41,416 [INFO] iou_thres 0.65
2023-08-26 23:40:41,416 [INFO] conf_free False
2023-08-26 23:40:41,416 [INFO] rect False
2023-08-26 23:40:41,416 [INFO] nms_time_limit 20.0
2023-08-26 23:40:41,416 [INFO] recompute True
2023-08-26 23:40:41,416 [INFO] recompute_layers 5
2023-08-26 23:40:41,416 [INFO] seed 2
2023-08-26 23:40:41,416 [INFO] summary True
2023-08-26 23:40:41,416 [INFO] profiler False
2023-08-26 23:40:41,416 [INFO] profiler_step_num 1
2023-08-26 23:40:41,416 [INFO] opencv_threads_num 2
2023-08-26 23:40:41,416 [INFO] enable_modelarts False
2023-08-26 23:40:41,416 [INFO] data_url
2023-08-26 23:40:41,416 [INFO] ckpt_url
2023-08-26 23:40:41,416 [INFO] multi_data_url
2023-08-26 23:40:41,416 [INFO] pretrain_url
2023-08-26 23:40:41,416 [INFO] train_url
2023-08-26 23:40:41,416 [INFO] data_dir /cache/data/
2023-08-26 23:40:41,416 [INFO] ckpt_dir /cache/pretrain_ckpt/
2023-08-26 23:40:41,416 [INFO] use_fused_op True
2023-08-26 23:40:41,416 [INFO] data.dataset_name coco
2023-08-26 23:40:41,416 [INFO] data.train_set /data0/perf/dataset/coco_yolo/train2017.txt
2023-08-26 23:40:41,416 [INFO] data.val_set /data0/perf/dataset/coco_yolo/val2017.txt
2023-08-26 23:40:41,416 [INFO] data.test_set /data0/perf/dataset/coco_yolo/test-dev2017.txt
2023-08-26 23:40:41,416 [INFO] data.nc 80
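As agreed in our discussion, the automatic compilation in __init__.py has been removed. The startup validation step now prompts the user to run the build script instead.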
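A startup check of this kind might be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the helper names are hypothetical, each kernel is assumed to be a `.cu` file with its `.so` built alongside it (as in the log above), and raising an error rather than only printing a hint is a choice made for the example.

```python
from pathlib import Path


def find_uncompiled_kernels(op_dir):
    """Return names of .cu files in op_dir that have no matching .so next to them."""
    op_dir = Path(op_dir)
    return sorted(
        cu.name for cu in op_dir.glob("*.cu") if not cu.with_suffix(".so").exists()
    )


def check_fused_op_compiled(op_dir):
    """Raise an error asking the user to run the build script if any kernel is missing."""
    missing = find_uncompiled_kernels(op_dir)
    if missing:
        raise RuntimeError(
            "fused_op kernels are not compiled: "
            + ", ".join(missing)
            + ". Please run the build script before enabling use_fused_op."
        )
```

Running the check only when use_fused_op is enabled keeps the default, unfused training path free of any dependency on nvcc.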
f72c5a8 to 99e7709
Thank you for your contribution to the MindYOLO repo.
Before submitting this PR, please make sure:
Motivation
Using MindSpore ops.Custom to accelerate computing bbox_iou on GPU platform.
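For readers unfamiliar with the quantity being accelerated, a scalar reference for plain box IoU is sketched below. It is not the PR's CUDA kernel or its ops.Custom wrapper; the `(x1, y1, x2, y2)` box layout, the function name `bbox_iou_xyxy`, and the `eps` term are assumptions made for illustration.

```python
def bbox_iou_xyxy(box1, box2, eps=1e-7):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    # Intersection rectangle; width and height clamp to zero when boxes do not overlap.
    inter_w = max(0.0, min(box1[2], box2[2]) - max(box1[0], box2[0]))
    inter_h = max(0.0, min(box1[3], box2[3]) - max(box1[1], box2[1]))
    inter = inter_w * inter_h

    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])

    # eps guards against division by zero for degenerate (zero-area) boxes.
    union = area1 + area2 - inter + eps
    return inter / union
```

A reference like this is mainly useful for spot-checking the output of a fused GPU kernel on a handful of boxes.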
(Write your motivation for proposed changes here.)
Test Plan
(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)
Related Issues and PRs
(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)