MindOCR supports inference with third-party models (PaddleOCR, MMOCR, etc.), and this document lists the adapted models. Performance is measured on Ascend 310P; some models have not yet been evaluated on a test dataset.
name | model | backbone | dataset | F-score(%) | FPS | source | config | download | reference |
---|---|---|---|---|---|---|---|---|---|
ch_pp_det_OCRv4 | DBNet | MobileNetV3 | / | / | / | PaddleOCR | yaml | infer model | ch_PP-OCRv4_det |
ch_pp_server_det_v2.0 | DBNet | ResNet18_vd | MLT17 | 46.22 | 21.65 | PaddleOCR | yaml | infer model | ch_ppocr_server_v2.0_det |
ch_pp_det_OCRv3 | DBNet | MobileNetV3 | MLT17 | 33.89 | 22.40 | PaddleOCR | yaml | infer model | ch_PP-OCRv3_det |
ch_pp_det_OCRv2 | DBNet | MobileNetV3 | MLT17 | 42.99 | 21.90 | PaddleOCR | yaml | infer model | ch_PP-OCRv2_det |
ch_pp_mobile_det_v2.0_slim | DBNet | MobileNetV3 | MLT17 | 31.66 | 19.88 | PaddleOCR | yaml | infer model | ch_ppocr_mobile_slim_v2.0_det |
ch_pp_mobile_det_v2.0 | DBNet | MobileNetV3 | MLT17 | 31.56 | 21.96 | PaddleOCR | yaml | infer model | ch_ppocr_mobile_v2.0_det |
en_pp_det_OCRv3 | DBNet | MobileNetV3 | IC15 | 42.14 | 55.55 | PaddleOCR | yaml | infer model | en_PP-OCRv3_det |
ml_pp_det_OCRv3 | DBNet | MobileNetV3 | MLT17 | 66.01 | 22.48 | PaddleOCR | yaml | infer model | ml_PP-OCRv3_det |
en_pp_det_dbnet_resnet50vd | DBNet | ResNet50_vd | IC15 | 79.89 | 21.17 | PaddleOCR | yaml | infer model | DBNet |
en_pp_det_psenet_resnet50vd | PSE | ResNet50_vd | IC15 | 80.44 | 7.75 | PaddleOCR | yaml | train model | PSE |
en_pp_det_east_resnet50vd | EAST | ResNet50_vd | IC15 | 85.58 | 20.70 | PaddleOCR | yaml | train model | EAST |
en_pp_det_sast_resnet50vd | SAST | ResNet50_vd | IC15 | 81.77 | 22.14 | PaddleOCR | yaml | train model | SAST |
en_mm_det_dbnetpp_resnet50 | DBNet++ | ResNet50 | IC15 | 81.36 | 10.66 | MMOCR | yaml | train model | DBNetpp |
en_mm_det_fcenet_resnet50 | FCENet | ResNet50 | IC15 | 83.67 | 3.34 | MMOCR | yaml | train model | FCENet |
Notice: when running inference with the `en_pp_det_psenet_resnet50vd` model, the ONNX file must first be modified with the following command:

```shell
python deploy/models_utils/onnx_optim/insert_pse_postprocess.py \
    --model_path=./pse_r50vd.onnx \
    --binary_thresh=0.0 \
    --scale=1.0
```
name | model | backbone | dataset | Acc(%) | FPS | source | dict file | config | download | reference |
---|---|---|---|---|---|---|---|---|---|---|
ch_pp_rec_OCRv4 | CRNN | MobileNetV1Enhance | / | / | / | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv4_rec |
ch_pp_server_rec_v2.0 | CRNN | ResNet34 | MLT17 (ch) | 49.91 | 154.16 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_ppocr_server_v2.0_rec |
ch_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | MLT17 (ch) | 49.91 | 408.38 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv3_rec |
ch_pp_rec_OCRv2 | CRNN | MobileNetV1Enhance | MLT17 (ch) | 44.59 | 203.34 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv2_rec |
ch_pp_mobile_rec_v2.0 | CRNN | MobileNetV3 | MLT17 (ch) | 24.59 | 167.67 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_ppocr_mobile_v2.0_rec |
en_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | MLT17 (en) | 79.79 | 917.01 | PaddleOCR | en_dict.txt | yaml | infer model | en_PP-OCRv3_rec |
en_pp_mobile_rec_number_v2.0_slim | CRNN | MobileNetV3 | / | / | / | PaddleOCR | en_dict.txt | yaml | infer model | en_number_mobile_slim_v2.0_rec |
en_pp_mobile_rec_number_v2.0 | CRNN | MobileNetV3 | / | / | / | PaddleOCR | en_dict.txt | yaml | infer model | en_number_mobile_v2.0_rec |
korean_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | korean_dict.txt | yaml | infer model | korean_PP-OCRv3_rec |
japan_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | japan_dict.txt | yaml | infer model | japan_PP-OCRv3_rec |
chinese_cht_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | chinese_cht_dict.txt | yaml | infer model | chinese_cht_PP-OCRv3_rec |
te_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | te_dict.txt | yaml | infer model | te_PP-OCRv3_rec |
ka_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | ka_dict.txt | yaml | infer model | ka_PP-OCRv3_rec |
ta_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | ta_dict.txt | yaml | infer model | ta_PP-OCRv3_rec |
latin_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | latin_dict.txt | yaml | infer model | latin_PP-OCRv3_rec |
arabic_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | arabic_dict.txt | yaml | infer model | arabic_PP-OCRv3_rec |
cyrillic_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | cyrillic_dict.txt | yaml | infer model | cyrillic_PP-OCRv3_rec |
devanagari_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | devanagari_dict.txt | yaml | infer model | devanagari_PP-OCRv3_rec |
en_pp_rec_crnn_resnet34vd | CRNN | ResNet34_vd | IC15 | 66.35 | 420.80 | PaddleOCR | ic15_dict.txt | yaml | infer model | CRNN |
en_pp_rec_rosetta_resnet34vd | Rosetta | ResNet34_vd | IC15 | 64.28 | 552.40 | PaddleOCR | ic15_dict.txt | yaml | infer model | Rosetta |
en_pp_rec_vitstr_vitstr | ViTSTR | ViTSTR | IC15 | 68.42 | 364.67 | PaddleOCR | EN_symbol_dict.txt | yaml | train model | ViTSTR |
en_mm_rec_nrtr_resnet31 | NRTR | ResNet31 | IC15 | 67.26 | 32.63 | MMOCR | english_digits_symbols.txt | yaml | train model | NRTR |
en_mm_rec_satrn_shallowcnn | SATRN | ShallowCNN | IC15 | 73.52 | 32.14 | MMOCR | english_digits_symbols.txt | yaml | train model | SATRN |
name | model | dataset | Acc(%) | FPS | source | config | download | reference |
---|---|---|---|---|---|---|---|---|
ch_pp_mobile_cls_v2.0 | MobileNetV3 | / | / | / | PaddleOCR | yaml | infer model | ch_ppocr_mobile_v2.0_cls |
The overall inference workflow for third-party models is as follows:

```mermaid
graph LR;
    A[ThirdParty models] -- xx2onnx --> B[ONNX] -- converter_lite --> C[MindIR];
    C -- input --> D[infer.py] -- outputs --> E[eval_rec.py / eval_det.py];
    H[images] -- input --> D[infer.py];
```
Let's take `ch_pp_det_OCRv4` from the Third-Party Model Support List as an example to introduce the inference method:
- In the Third-Party Model Support List, `infer model` refers to a model file ready for inference, while `train model` refers to a training checkpoint that must first be converted to an inference model.
- If the model file is an `infer model`, like `ch_pp_det_OCRv4`, download and extract the infer model to get the following folder:

  ```text
  ch_PP-OCRv4_det_infer/
  ├── inference.pdmodel
  ├── inference.pdiparams
  ├── inference.pdiparams.info
  ```
- If the model file is a `train model`, like `en_pp_det_psenet_resnet50vd`, download and extract the train model to get the following folder:

  ```text
  det_r50_vd_pse_v2.0_train/
  ├── train.log
  ├── best_accuracy.pdopt
  ├── best_accuracy.states
  ├── best_accuracy.pdparams
  ```

  It then needs to be converted to an inference model with the following commands:

  ```shell
  git clone https://github.com/PaddlePaddle/PaddleOCR.git
  cd PaddleOCR
  python tools/export_model.py \
      -c configs/det/det_r50_vd_pse.yml \
      -o Global.pretrained_model=./det_r50_vd_pse_v2.0_train/best_accuracy \
      Global.save_inference_dir=./det_db
  ```

  which produces the following folder:

  ```text
  det_db/
  ├── inference.pdmodel
  ├── inference.pdiparams
  ├── inference.pdiparams.info
  ```
Install the paddle2onnx tool (`pip install paddle2onnx`) and convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir det_db \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file det_db.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,-1,-1]}" \
    --enable_onnx_checker True
```
A brief explanation of parameters for paddle2onnx is as follows:
Parameter | Description |
---|---|
--model_dir | Configures the directory path containing the Paddle model. |
--model_filename | [Optional] Configures the file name of the network structure file located under --model_dir. |
--params_filename | [Optional] Configures the file name of the model parameter file located under --model_dir. |
--save_file | Specifies the file path for saving the converted model. |
--opset_version | [Optional] Configures the OpSet version for converting to ONNX. Multiple versions, such as 7~16, are currently supported, and the default is 9. |
--input_shape_dict | Specifies the shape of the input tensor for generating a dynamic ONNX model. The format is "{'x': [N, C, H, W]}", where -1 represents dynamic shape. |
--enable_onnx_checker | [Optional] Configures whether to check the correctness of the exported ONNX model. It is recommended to enable this switch, and the default is False. |
The value for --input_shape_dict can be found by opening the inference model with the Netron tool.
Learn more about paddle2onnx
The det_db.onnx file will be generated after the above command is executed.
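If you prefer a programmatic check over Netron, the exported file can also be validated and its input shape inspected with the `onnx` Python package. A minimal sketch, assuming `onnx` is installed (`pip install onnx`) and `det_db.onnx` is in the working directory:

```python
# Minimal sketch: validate det_db.onnx and print its input shapes.
import onnx

model = onnx.load("det_db.onnx")
onnx.checker.check_model(model)  # raises an exception if the graph is malformed

for inp in model.graph.input:
    dims = [d.dim_param if d.dim_param else d.dim_value
            for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # expect something like: x [-1, 3, -1, -1]
```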
Use the converter_lite tool on Ascend 310/310P to convert the ONNX file to MindIR. Create config.txt and specify the model input shape:
- If converting to a static shape model, e.g. a static shape of `[1,3,736,1280]`, the config is as follows:

  ```text
  [ascend_context]
  input_format=NCHW
  input_shape=x:[1,3,736,1280]
  ```
- If converting to a dynamic shape (scaling) model, the config is as follows:

  ```text
  [ascend_context]
  input_format=NCHW
  input_shape=x:[1,3,-1,-1]
  dynamic_dims=[736,1280],[768,1280],[896,1280],[1024,1280]
  ```
- If converting to a fully dynamic shape model, the config is as follows:

  ```text
  [acl_build_options]
  input_format=NCHW
  input_shape_range=x:[-1,3,-1,-1]
  ```
A brief explanation of the configuration file parameters is as follows:
Parameter | Attribute | Function Description | Data Type | Value Description |
---|---|---|---|---|
input_format | Optional | Specify the format of the model input | String | Optional values are "NCHW", "NHWC", "ND" |
input_shape | Optional | Specify the shape of the model input. The input_name must be the input name in the original model, arranged in order of input, separated by ";" | String | For example: "input1:[1,64,64,3];input2:[1,256,256,3]" |
dynamic_dims | Optional | Specify dynamic BatchSize and dynamic resolution parameters | String | For example: "dynamic_dims=[48,520],[48,320],[48,384]" |
Learn more about Configuration File Parameters
Run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=det_db.onnx \
    --outputFile=det_db_lite \
    --configFile=config.txt
```
After the above command is executed, the det_db_lite.mindir file will be generated.
A brief explanation of the converter_lite parameters is as follows:
Parameter | Required | Parameter Description | Value Range | Default | Remarks |
---|---|---|---|---|---|
fmk | Yes | Input model format | MINDIR, CAFFE, TFLITE, TF, ONNX | - | - |
saveType | No | Set the exported model to MINDIR or MS model format. | MINDIR, MINDIR_LITE | MINDIR | The cloud-side inference version can only infer models converted to MINDIR format |
modelFile | Yes | Input model path | - | - | - |
outputFile | Yes | Output model path. Do not add a suffix; the ".mindir" suffix will be appended automatically. | - | - | - |
configFile | No | 1) Path to the quantization configuration file after training; 2) Path to the configuration file for extended functions | - | - | - |
optimize | No | Set the model optimization type for the target device. Default is none. | none, general, gpu_oriented, ascend_oriented | - | - |
Learn more about converter_lite
Learn more about Model Conversion Tutorial
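Before wiring the converted model into the full pipeline, it can be sanity-checked with the MindSpore Lite Python API. Below is a minimal sketch, assuming the `mindspore_lite` package is installed and the model was converted with the static input shape `[1,3,736,1280]`:

```python
# Minimal sketch: run one random input through det_db_lite.mindir on Ascend.
import numpy as np
import mindspore_lite as mslite

context = mslite.Context()
context.target = ["ascend"]  # the device this example assumes

model = mslite.Model()
model.build_from_file("det_db_lite.mindir", mslite.ModelType.MINDIR, context)

inputs = model.get_inputs()
inputs[0].set_data_from_numpy(
    np.random.rand(1, 3, 736, 1280).astype(np.float32))

outputs = model.predict(inputs)
# DBNet-style detectors output a probability map, e.g. shape (1, 1, 736, 1280)
print(outputs[0].get_data_to_numpy().shape)
```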
Perform inference using deploy/py_infer/infer.py and the det_db_lite.mindir model file:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/det_db_lite.mindir \
    --det_model_name_or_config=ch_pp_det_OCRv4 \
    --res_save_dir=/path/to/ch_pp_det_OCRv4_results
```
After execution completes, the prediction file det_results.txt will be generated in the directory specified by --res_save_dir.
During inference, you can use the --vis_det_save_dir parameter to save visualizations of the results.
Learn more about infer.py inference parameters
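Beyond the built-in visualization, the predictions can be post-processed directly. A minimal sketch that draws the predicted polygons with OpenCV, assuming each line of `det_results.txt` follows the PaddleOCR-style format `image_name\t<JSON list of {"points": ...}>` (inspect a generated file to confirm the exact layout):

```python
# Minimal sketch: draw predicted boxes from det_results.txt onto the test images.
import json
import os

import cv2
import numpy as np

images_dir = "/path/to/ic15/ch4_test_images"
save_dir = "./det_vis"
os.makedirs(save_dir, exist_ok=True)

with open("/path/to/ch_pp_det_OCRv4_results/det_results.txt", encoding="utf-8") as f:
    for line in f:
        name, raw = line.rstrip("\n").split("\t", 1)
        img = cv2.imread(os.path.join(images_dir, name))
        for det in json.loads(raw):  # one entry per detected text region
            pts = np.array(det["points"], dtype=np.int32)
            cv2.polylines(img, [pts], isClosed=True, color=(0, 255, 0), thickness=2)
        cv2.imwrite(os.path.join(save_dir, name), img)
```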
Evaluate the results using the following command:
```shell
python deploy/eval_utils/eval_det.py \
    --gt_path=/path/to/ic15/test_det_gt.txt \
    --pred_path=/path/to/ch_pp_det_OCRv4_results/det_results.txt
```
Let's take `ch_pp_rec_OCRv4` from the Third-Party Model Support List as an example to introduce the inference method:
- In the Third-Party Model Support List, `infer model` refers to a model file ready for inference, while `train model` refers to a training checkpoint that must first be converted to an inference model.
- If the model file is an `infer model`, like `ch_pp_rec_OCRv4`, download and extract the infer model to get the following folder:

  ```text
  ch_PP-OCRv4_rec_infer/
  ├── inference.pdmodel
  ├── inference.pdiparams
  ├── inference.pdiparams.info
  ```
- If the model file is a `train model`, like `en_pp_rec_vitstr_vitstr`, download and extract the train model to get the following folder:

  ```text
  rec_vitstr_none_ce_train/
  ├── train.log
  ├── best_accuracy.pdopt
  ├── best_accuracy.states
  ├── best_accuracy.pdparams
  ```

  It then needs to be converted to an inference model with the following commands:

  ```shell
  git clone https://github.com/PaddlePaddle/PaddleOCR.git
  cd PaddleOCR
  python tools/export_model.py \
      -c configs/rec/rec_vitstr_none_ce.yml \
      -o Global.pretrained_model=./rec_vitstr_none_ce_train/best_accuracy \
      Global.save_inference_dir=./rec_vitstr
  ```

  which produces the following folder:

  ```text
  rec_vitstr/
  ├── inference.pdmodel
  ├── inference.pdiparams
  ├── inference.pdiparams.info
  ```
Install the paddle2onnx tool (`pip install paddle2onnx`) and convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir ch_PP-OCRv4_rec_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file rec_crnn.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,48,-1]}" \
    --enable_onnx_checker True
```
The rec_crnn.onnx file will be generated after the above command is executed. Please refer to 3.1.2 Convert the thirdparty model to onnx file for details about paddle2onnx.
Use the converter_lite tool on Ascend 310/310P to convert the ONNX file to MindIR. Create config.txt and specify the model input shape:
- If converting to a static shape model, e.g. a static shape of `[1,3,48,320]`, the config is as follows:

  ```text
  [ascend_context]
  input_format=NCHW
  input_shape=x:[1,3,48,320]
  ```
- If converting to a dynamic shape (scaling) model, the config is as follows:

  ```text
  [ascend_context]
  input_format=NCHW
  input_shape=x:[1,3,-1,-1]
  dynamic_dims=[48,520],[48,320],[48,384],[48,360],[48,394],[48,321],[48,336],[48,368],[48,328],[48,685],[48,347]
  ```
- If converting to a fully dynamic shape model, the config is as follows:

  ```text
  [acl_build_options]
  input_format=NCHW
  input_shape_range=x:[-1,3,-1,-1]
  ```
For a brief description of the configuration parameters, please refer to 3.1.3 Convert onnx file to Lite MindIR file.
Run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=rec_crnn.onnx \
    --outputFile=rec_crnn_lite \
    --configFile=config.txt
```
After the above command is executed, the rec_crnn_lite.mindir file will be generated.
For a brief description of the converter_lite parameters, see the text detection example above.
Learn more about converter_lite
Learn more about Model Conversion Tutorial
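Since the recognition model above was converted with `dynamic_dims`, its inputs must be resized to one of the declared shapes before running it through the MindSpore Lite Python API directly. A minimal sketch, under the same assumptions as the detection sanity check earlier:

```python
# Minimal sketch: run rec_crnn_lite.mindir with one of the shapes
# declared in config.txt (here [1, 3, 48, 320]).
import numpy as np
import mindspore_lite as mslite

context = mslite.Context()
context.target = ["ascend"]

model = mslite.Model()
model.build_from_file("rec_crnn_lite.mindir", mslite.ModelType.MINDIR, context)

inputs = model.get_inputs()
model.resize(inputs, [[1, 3, 48, 320]])  # pick a shape listed in dynamic_dims
inputs[0].set_data_from_numpy(
    np.random.rand(1, 3, 48, 320).astype(np.float32))

outputs = model.predict(inputs)
print(outputs[0].get_data_to_numpy().shape)  # per-timestep character logits
```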
According to the Third-Party Model Support List, download the ppocr_keys_v1.txt dictionary file that matches ch_pp_rec_OCRv4.
Perform inference using deploy/py_infer/infer.py and the rec_crnn_lite.mindir model file:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/mlt17_ch \
    --rec_model_path=/path/to/mindir/rec_crnn_lite.mindir \
    --rec_model_name_or_config=ch_pp_rec_OCRv4 \
    --character_dict_path=/path/to/ppocr_keys_v1.txt \
    --res_save_dir=/path/to/ch_rec_infer_results
```
After execution completes, the prediction file rec_results.txt will be generated in the directory specified by --res_save_dir.
Learn more about infer.py inference parameters
Evaluate the results using the following command:
```shell
python deploy/eval_utils/eval_rec.py \
    --gt_path=/path/to/mlt17_ch/chinese_gt.txt \
    --pred_path=/path/to/ch_rec_infer_results/rec_results.txt
```
Refer to Dataset converters for dataset preparation.
Let's take `ch_pp_mobile_cls_v2.0` from the Third-Party Model Support List as an example to introduce the inference method:
In the Third-Party Model Support List, `ch_pp_mobile_cls_v2.0` is an `infer model`, so no conversion from a training checkpoint is needed. Download and extract it to get the following folder:

```text
ch_ppocr_mobile_v2.0_cls_infer/
├── inference.pdmodel
├── inference.pdiparams
├── inference.pdiparams.info
```
Convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir ch_ppocr_mobile_v2.0_cls_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file cls_mv3.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,-1,-1]}" \
    --enable_onnx_checker True
```
The cls_mv3.onnx file will be generated after the above command is executed. Please refer to 3.1.2 Convert the thirdparty model to onnx file for details about paddle2onnx.
Refer to 3.1.3 Convert onnx file to Lite MindIR file and create config.txt; here we take the dynamic shape config as an example:
```text
[acl_build_options]
input_format=NCHW
input_shape_range=x:[-1,3,-1,-1]
```
Then run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=cls_mv3.onnx \
    --outputFile=cls_mv3_lite \
    --configFile=config.txt
```
After the above command is executed, the cls_mv3_lite.mindir file will be generated.
Prepare the MindIR files according to the Text Detection, Text Recognition, and Text Direction Classification sections above, then run the following command to perform end-to-end inference:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/det_db_lite.mindir \
    --det_model_name_or_config=ch_pp_det_OCRv4 \
    --cls_model_path=/path/to/mindir/cls_mv3_lite.mindir \
    --cls_model_name_or_config=ch_pp_mobile_cls_v2.0 \
    --rec_model_path=/path/to/mindir/rec_crnn_lite.mindir \
    --rec_model_name_or_config=ch_pp_rec_OCRv4 \
    --character_dict_path=/path/to/ppocr_keys_v1.txt \
    --res_save_dir=/path/to/infer_results
```
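After the command finishes, the recognized texts and their locations are written to a result file under `--res_save_dir`. A minimal sketch for consuming it, assuming the same `image_name\t<JSON list>` line format as above with `transcription` and `points` fields per entry (the file name used here, `pipeline_results.txt`, is hypothetical; check the output directory for the actual name):

```python
# Minimal sketch: print the recognized text per image from end-to-end results.
# The file name and line format below are assumptions; inspect the files
# generated under --res_save_dir to confirm them.
import json

with open("/path/to/infer_results/pipeline_results.txt", encoding="utf-8") as f:
    for line in f:
        name, raw = line.rstrip("\n").split("\t", 1)
        for item in json.loads(raw):
            print(name, item["transcription"], item["points"])
```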
For problems with model conversion and inference, please refer to the FAQ for solutions.