add aot custom op to accelerate computing bbox_iou on GPU #178

Merged
merged 1 commit into from
Aug 28, 2023

@panshaowu panshaowu commented Jul 30, 2023

Thank you for your contribution to the MindYOLO repo.
Before submitting this PR, please make sure:

Motivation

Use MindSpore ops.Custom to accelerate the bbox_iou computation on the GPU platform.

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)

@@ -0,0 +1,183 @@
/**
Collaborator

Please do not add a copyright notice.

Collaborator Author

All copyright notices have been removed.

@@ -0,0 +1,272 @@
import os
Collaborator

Does this need to be compiled in advance? Should that be explained in the install README?

Collaborator Author

@panshaowu panshaowu Aug 25, 2023

  1. The check for the .so files and the compile command were added to __init__.py to make debugging from a local path easier.
  2. setup.py has been revised: running python setup.py install or package.sh now compiles the .so files automatically and then performs the installation.
  3. A separate build script has been added, and the install README now explains how to compile the .so files manually.

@@ -81,6 +81,8 @@ def get_parser_train(parents=None):
help="ModelArts: local device path to dataset folder")
parser.add_argument("--ckpt_dir", type=str, default="/cache/pretrain_ckpt/",
help="ModelArts: local device path to checkpoint folder")
parser.add_argument("--use_fused_op", type=ast.literal_eval, default=False,
Collaborator

Should this be validated, so that every device other than GPU is rejected?

Collaborator Author

A check has been added to the set_default method: on a non-GPU device it raises a warning and sets args.use_fused to False.
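The check described in this reply could look roughly like the sketch below. It is not the code merged in this PR: the helper name `disable_fused_op_off_gpu` is hypothetical, the `--device_target` default is an assumption, and only the `--use_fused_op` flag with `type=ast.literal_eval` is taken from the diff above.

```python
import argparse
import ast
import warnings


def build_parser():
    parser = argparse.ArgumentParser()
    # The default device here is an assumption made for this sketch.
    parser.add_argument("--device_target", type=str, default="Ascend")
    # Same flag as in the diff: ast.literal_eval turns the strings
    # "True" / "False" into real booleans.
    parser.add_argument("--use_fused_op", type=ast.literal_eval, default=False)
    return parser


def disable_fused_op_off_gpu(args):
    """Hypothetical helper: fall back to the unfused path on non-GPU devices."""
    if args.use_fused_op and args.device_target != "GPU":
        warnings.warn(
            f"use_fused_op is only supported on GPU, got {args.device_target}; "
            "disabling it."
        )
        args.use_fused_op = False
    return args
```

Warning and falling back, rather than raising, lets one config file be reused across devices at the cost of silently slower loss computation on non-GPU targets.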

panshaowu force-pushed the master branch 4 times, most recently from 3abaf62 to 778d277, August 25, 2023 09:21
CaitinZhao previously approved these changes Aug 26, 2023
Comment on lines 9 to 11
from .fused_op import fused_get_ciou_op_path, fused_get_ciou_op_bprop_path, fused_get_center_dist_op_path, \
fused_get_center_dist_op_bprop_path, fused_get_ciou_gpu_info, fused_get_ciou_bprop_gpu_info, \
fused_get_center_dist_gpu_info, fused_get_center_dist_bprop_gpu_info, fused_get_convex_diagonal_squared_info, \
fused_get_convex_diagonal_squared_path, fused_get_convex_diagonal_squared_grad_path, \
fused_get_boundding_boxes_coord_path, fused_get_boundding_boxes_coord_grad_path, \
fused_get_intersection_area_path, fused_get_intersection_area_grad_path, \
fused_get_convex_diagonal_squared_grad_info, fused_get_ciou_diagonal_angle_info,\
fused_get_ciou_diagonal_angle_grad_info, fused_get_ciou_diagonal_angle_grad_path, fused_get_ciou_diagonal_angle_path, \
fused_get_iou_op_path, fused_get_iou_op_bprop_path, fused_get_iou_gpu_info, fused_get_iou_bprop_gpu_info, \
fused_get_boundding_boxes_coord_gpu_info, fused_get_boundding_boxes_coord_bprop_gpu_info, \
fused_get_intersection_area_gpu_info, fused_get_intersection_area_gpu_grad_info

Collaborator

If the user has not compiled fused_op, will this import raise an error? Should it be wrapped in try/except?

Collaborator Author

If the user has not compiled the .so libraries that fused_op depends on via setup.py or build.sh, compilation starts automatically at import time: the __init__.py file of fused_op checks on import whether the .so libraries exist:

root@mindspore:/data0/perf/git/mindyolo# rm -rf mindyolo/models/losses/fused_op/*.so
(perf) root@mindspore:/data0/perf/git/mindyolo# python train.py --config ./configs/yolov7/yolov7.yaml
/root/miniconda3/envs/perf/lib/python3.8/site-packages/scipy/__init__.py:143: UserWarning: A NumPy version >=1.19.5 and <1.27.0 is required for this version of SciPy (detected version 1.19.0)
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_ciou_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_ciou_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_center_dist_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_center_dist_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_convex_diagonal_squared_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_convex_diagonal_squared_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_iou_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_iou_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_ciou_diagonal_angle_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_ciou_diagonal_angle_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_boundding_boxes_coord_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_boundding_boxes_coord_kernel.cu
nvcc compiler cmd: nvcc --shared -Xcompiler -fPIC -o /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_intersection_area_kernel.so /data0/perf/git/mindyolo/mindyolo/models/losses/fused_op/fused_get_intersection_area_kernel.cu
2023-08-26 23:40:41,416 [INFO] parse_args:
2023-08-26 23:40:41,416 [INFO] device_target                           GPU
2023-08-26 23:40:41,416 [INFO] save_dir                                ./runs/2023.08.26-23.40.41
2023-08-26 23:40:41,416 [INFO] device_per_servers                      8
2023-08-26 23:40:41,416 [INFO] log_level                               INFO
2023-08-26 23:40:41,416 [INFO] is_parallel                             False
2023-08-26 23:40:41,416 [INFO] ms_mode                                 0
2023-08-26 23:40:41,416 [INFO] ms_amp_level                            O0
2023-08-26 23:40:41,416 [INFO] keep_loss_fp32                          True
2023-08-26 23:40:41,416 [INFO] ms_loss_scaler                          static
2023-08-26 23:40:41,416 [INFO] ms_loss_scaler_value                    1024.0
2023-08-26 23:40:41,416 [INFO] ms_grad_sens                            1024.0
2023-08-26 23:40:41,416 [INFO] ms_jit                                  True
2023-08-26 23:40:41,416 [INFO] ms_enable_graph_kernel                  False
2023-08-26 23:40:41,416 [INFO] ms_datasink                             False
2023-08-26 23:40:41,416 [INFO] overflow_still_update                   True
2023-08-26 23:40:41,416 [INFO] ema                                     True
2023-08-26 23:40:41,416 [INFO] weight
2023-08-26 23:40:41,416 [INFO] ema_weight
2023-08-26 23:40:41,416 [INFO] freeze                                  []
2023-08-26 23:40:41,416 [INFO] epochs                                  300
2023-08-26 23:40:41,416 [INFO] per_batch_size                          16
2023-08-26 23:40:41,416 [INFO] img_size                                640
2023-08-26 23:40:41,416 [INFO] nbs                                     64
2023-08-26 23:40:41,416 [INFO] accumulate                              1
2023-08-26 23:40:41,416 [INFO] auto_accumulate                         False
2023-08-26 23:40:41,416 [INFO] log_interval                            100
2023-08-26 23:40:41,416 [INFO] single_cls                              False
2023-08-26 23:40:41,416 [INFO] sync_bn                                 False
2023-08-26 23:40:41,416 [INFO] keep_checkpoint_max                     100
2023-08-26 23:40:41,416 [INFO] run_eval                                False
2023-08-26 23:40:41,416 [INFO] conf_thres                              0.001
2023-08-26 23:40:41,416 [INFO] iou_thres                               0.65
2023-08-26 23:40:41,416 [INFO] conf_free                               False
2023-08-26 23:40:41,416 [INFO] rect                                    False
2023-08-26 23:40:41,416 [INFO] nms_time_limit                          20.0
2023-08-26 23:40:41,416 [INFO] recompute                               True
2023-08-26 23:40:41,416 [INFO] recompute_layers                        5
2023-08-26 23:40:41,416 [INFO] seed                                    2
2023-08-26 23:40:41,416 [INFO] summary                                 True
2023-08-26 23:40:41,416 [INFO] profiler                                False
2023-08-26 23:40:41,416 [INFO] profiler_step_num                       1
2023-08-26 23:40:41,416 [INFO] opencv_threads_num                      2
2023-08-26 23:40:41,416 [INFO] enable_modelarts                        False
2023-08-26 23:40:41,416 [INFO] data_url
2023-08-26 23:40:41,416 [INFO] ckpt_url
2023-08-26 23:40:41,416 [INFO] multi_data_url
2023-08-26 23:40:41,416 [INFO] pretrain_url
2023-08-26 23:40:41,416 [INFO] train_url
2023-08-26 23:40:41,416 [INFO] data_dir                                /cache/data/
2023-08-26 23:40:41,416 [INFO] ckpt_dir                                /cache/pretrain_ckpt/
2023-08-26 23:40:41,416 [INFO] use_fused_op                            True
2023-08-26 23:40:41,416 [INFO] data.dataset_name                       coco
2023-08-26 23:40:41,416 [INFO] data.train_set                          /data0/perf/dataset/coco_yolo/train2017.txt
2023-08-26 23:40:41,416 [INFO] data.val_set                            /data0/perf/dataset/coco_yolo/val2017.txt
2023-08-26 23:40:41,416 [INFO] data.test_set                           /data0/perf/dataset/coco_yolo/test-dev2017.txt
2023-08-26 23:40:41,416 [INFO] data.nc                                 80
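The compile commands printed in this log all follow one pattern, so the import-time check can be sketched as below. This is an illustrative reconstruction, not the PR's `__init__.py`: the function name `nvcc_commands_for_missing` is hypothetical, and the sketch only builds the commands instead of running them so it can be tried without `nvcc` installed.

```python
from pathlib import Path


def nvcc_commands_for_missing(op_dir):
    """Return one nvcc command (as an argument list) per .cu file lacking a .so."""
    commands = []
    for cu_file in sorted(Path(op_dir).glob("*.cu")):
        so_file = cu_file.with_suffix(".so")
        if so_file.exists():
            continue  # already built, nothing to do
        # Same flags as the "nvcc compiler cmd" lines in the log.
        commands.append(
            ["nvcc", "--shared", "-Xcompiler", "-fPIC",
             "-o", str(so_file), str(cu_file)]
        )
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` to perform the actual build.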

Collaborator Author

As agreed in our discussion, the automatic compilation in __init__.py has been removed; instead, the startup validation step now prompts the user to run the build script.
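A startup check of that kind might be sketched as follows. The function name `check_fused_op_built`, the default script name passed as `build_hint`, and the choice to raise rather than merely warn are all assumptions made for illustration; the thread does not show the merged implementation.

```python
from pathlib import Path


def check_fused_op_built(op_dir, build_hint="build.sh"):
    """Raise if any kernel source in op_dir has no compiled library next to it."""
    missing = [
        cu.with_suffix(".so").name
        for cu in sorted(Path(op_dir).glob("*.cu"))
        if not cu.with_suffix(".so").exists()
    ]
    if missing:
        raise RuntimeError(
            f"Fused op libraries not found: {', '.join(missing)}. "
            f"Please run {build_hint} to compile them before enabling use_fused_op."
        )
```

Compared with compiling on import, an explicit check keeps `import` free of side effects and makes a missing `nvcc` toolchain show up as a clear message at startup.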

@CaitinZhao CaitinZhao merged commit 662c5dc into mindspore-lab:master Aug 28, 2023
4 checks passed