MindOCR supports inference with third-party models (PaddleOCR, MMOCR, etc.), and this document lists the adapted models. Performance is measured on Ascend 310P; some models have not yet been evaluated on a test dataset.
name | model | backbone | dataset | F-score(%) | FPS | source | config | download | reference |
---|---|---|---|---|---|---|---|---|---|
ch_pp_det_OCRv4 | DBNet | MobileNetV3 | / | / | / | PaddleOCR | yaml | infer model | ch_PP-OCRv4_det |
ch_pp_server_det_v2.0 | DBNet | ResNet18_vd | MLT17 | 46.22 | 21.65 | PaddleOCR | yaml | infer model | ch_ppocr_server_v2.0_det |
ch_pp_det_OCRv3 | DBNet | MobileNetV3 | MLT17 | 33.89 | 22.40 | PaddleOCR | yaml | infer model | ch_PP-OCRv3_det |
ch_pp_det_OCRv2 | DBNet | MobileNetV3 | MLT17 | 42.99 | 21.90 | PaddleOCR | yaml | infer model | ch_PP-OCRv2_det |
ch_pp_mobile_det_v2.0_slim | DBNet | MobileNetV3 | MLT17 | 31.66 | 19.88 | PaddleOCR | yaml | infer model | ch_ppocr_mobile_slim_v2.0_det |
ch_pp_mobile_det_v2.0 | DBNet | MobileNetV3 | MLT17 | 31.56 | 21.96 | PaddleOCR | yaml | infer model | ch_ppocr_mobile_v2.0_det |
en_pp_det_OCRv3 | DBNet | MobileNetV3 | IC15 | 42.14 | 55.55 | PaddleOCR | yaml | infer model | en_PP-OCRv3_det |
ml_pp_det_OCRv3 | DBNet | MobileNetV3 | MLT17 | 66.01 | 22.48 | PaddleOCR | yaml | infer model | ml_PP-OCRv3_det |
en_pp_det_dbnet_resnet50vd | DBNet | ResNet50_vd | IC15 | 79.89 | 21.17 | PaddleOCR | yaml | infer model | DBNet |
en_pp_det_psenet_resnet50vd | PSE | ResNet50_vd | IC15 | 80.44 | 7.75 | PaddleOCR | yaml | train model | PSE |
en_pp_det_east_resnet50vd | EAST | ResNet50_vd | IC15 | 85.58 | 20.70 | PaddleOCR | yaml | train model | EAST |
en_pp_det_sast_resnet50vd | SAST | ResNet50_vd | IC15 | 81.77 | 22.14 | PaddleOCR | yaml | train model | SAST |
en_mm_det_dbnetpp_resnet50 | DBNet++ | ResNet50 | IC15 | 81.36 | 10.66 | MMOCR | yaml | train model | DBNetpp |
en_mm_det_fcenet_resnet50 | FCENet | ResNet50 | IC15 | 83.67 | 3.34 | MMOCR | yaml | train model | FCENet |
Notice: when running inference with the `en_pp_det_psenet_resnet50vd` model, the ONNX file must first be modified with the following command:

```shell
python deploy/models_utils/onnx_optim/insert_pse_postprocess.py \
    --model_path=./pse_r50vd.onnx \
    --binary_thresh=0.0 \
    --scale=1.0
```
name | model | backbone | dataset | Acc(%) | FPS | source | dict file | config | download | reference |
---|---|---|---|---|---|---|---|---|---|---|
ch_pp_rec_OCRv4 | CRNN | MobileNetV1Enhance | / | / | / | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv4_rec |
ch_pp_server_rec_v2.0 | CRNN | ResNet34 | MLT17 (ch) | 49.91 | 154.16 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_ppocr_server_v2.0_rec |
ch_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | MLT17 (ch) | 49.91 | 408.38 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv3_rec |
ch_pp_rec_OCRv2 | CRNN | MobileNetV1Enhance | MLT17 (ch) | 44.59 | 203.34 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv2_rec |
ch_pp_mobile_rec_v2.0 | CRNN | MobileNetV3 | MLT17 (ch) | 24.59 | 167.67 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_ppocr_mobile_v2.0_rec |
en_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | MLT17 (en) | 79.79 | 917.01 | PaddleOCR | en_dict.txt | yaml | infer model | en_PP-OCRv3_rec |
en_pp_mobile_rec_number_v2.0_slim | CRNN | MobileNetV3 | / | / | / | PaddleOCR | en_dict.txt | yaml | infer model | en_number_mobile_slim_v2.0_rec |
en_pp_mobile_rec_number_v2.0 | CRNN | MobileNetV3 | / | / | / | PaddleOCR | en_dict.txt | yaml | infer model | en_number_mobile_v2.0_rec |
korean_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | korean_dict.txt | yaml | infer model | korean_PP-OCRv3_rec |
japan_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | japan_dict.txt | yaml | infer model | japan_PP-OCRv3_rec |
chinese_cht_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | chinese_cht_dict.txt | yaml | infer model | chinese_cht_PP-OCRv3_rec |
te_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | te_dict.txt | yaml | infer model | te_PP-OCRv3_rec |
ka_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | ka_dict.txt | yaml | infer model | ka_PP-OCRv3_rec |
ta_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | ta_dict.txt | yaml | infer model | ta_PP-OCRv3_rec |
latin_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | latin_dict.txt | yaml | infer model | latin_PP-OCRv3_rec |
arabic_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | arabic_dict.txt | yaml | infer model | arabic_PP-OCRv3_rec |
cyrillic_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | cyrillic_dict.txt | yaml | infer model | cyrillic_PP-OCRv3_rec |
devanagari_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | devanagari_dict.txt | yaml | infer model | devanagari_PP-OCRv3_rec |
en_pp_rec_crnn_resnet34vd | CRNN | ResNet34_vd | IC15 | 66.35 | 420.80 | PaddleOCR | ic15_dict.txt | yaml | infer model | CRNN |
en_pp_rec_rosetta_resnet34vd | Rosetta | ResNet34_vd | IC15 | 64.28 | 552.40 | PaddleOCR | ic15_dict.txt | yaml | infer model | Rosetta |
en_pp_rec_vitstr_vitstr | ViTSTR | ViTSTR | IC15 | 68.42 | 364.67 | PaddleOCR | EN_symbol_dict.txt | yaml | train model | ViTSTR |
en_mm_rec_nrtr_resnet31 | NRTR | ResNet31 | IC15 | 67.26 | 32.63 | MMOCR | english_digits_symbols.txt | yaml | train model | NRTR |
en_mm_rec_satrn_shallowcnn | SATRN | ShallowCNN | IC15 | 73.52 | 32.14 | MMOCR | english_digits_symbols.txt | yaml | train model | SATRN |
name | model | dataset | Acc(%) | FPS | source | config | download | reference |
---|---|---|---|---|---|---|---|---|
ch_pp_mobile_cls_v2.0 | MobileNetV3 | / | / | / | PaddleOCR | yaml | infer model | ch_ppocr_mobile_v2.0_cls |
The overall inference workflow for third-party models is as follows:

```mermaid
graph LR;
    A[ThirdParty models] -- xx2onnx --> B[ONNX] -- converter_lite --> C[MindIR];
    C -- input --> D[infer.py] -- outputs --> E[eval_rec.py / eval_det.py];
    H[images] -- input --> D[infer.py];
```
Let's take `ch_pp_det_OCRv4` from the Third-Party Model Support List as an example to introduce the inference method:
- In the Third-Party Model Support List, `infer model` refers to a model file ready for inference, while `train model` refers to a training checkpoint that must first be converted to an inference model.
- If the model file is an `infer model`, like `ch_pp_det_OCRv4`, download and extract the infer model to get the following folder:

  ```text
  ch_PP-OCRv4_det_infer/
  ├── inference.pdmodel
  ├── inference.pdiparams
  ├── inference.pdiparams.info
  ```
- If the model file is a `train model`, like `en_pp_det_psenet_resnet50vd`, download and extract the train model to get the following folder:

  ```text
  det_r50_vd_pse_v2.0_train/
  ├── train.log
  ├── best_accuracy.pdopt
  ├── best_accuracy.states
  ├── best_accuracy.pdparams
  ```

  It then needs to be converted to an inference model with the following commands:

  ```shell
  git clone https://github.com/PaddlePaddle/PaddleOCR.git
  cd PaddleOCR
  python tools/export_model.py \
      -c configs/det/det_r50_vd_pse.yml \
      -o Global.pretrained_model=./det_r50_vd_pse_v2.0_train/best_accuracy \
      Global.save_inference_dir=./det_db
  ```

  which produces the following folder:

  ```text
  det_db/
  ├── inference.pdmodel
  ├── inference.pdiparams
  ├── inference.pdiparams.info
  ```
Install the paddle2onnx tool (`pip install paddle2onnx`) and convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir det_db \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file det_db.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,-1,-1]}" \
    --enable_onnx_checker True
```
A brief explanation of parameters for paddle2onnx is as follows:
Parameter | Description |
---|---|
--model_dir | Configures the directory path containing the Paddle model. |
--model_filename | [Optional] Configures the file name of the network structure file located under --model_dir. |
--params_filename | [Optional] Configures the file name of the model parameter file located under --model_dir. |
--save_file | Specifies the file path for saving the converted model. |
--opset_version | [Optional] Configures the OpSet version for converting to ONNX. Multiple versions, such as 7~16, are currently supported, and the default is 9. |
--input_shape_dict | Specifies the shape of the input tensor for generating a dynamic ONNX model. The format is "{'x': [N, C, H, W]}", where -1 represents dynamic shape. |
--enable_onnx_checker | [Optional] Configures whether to check the correctness of the exported ONNX model. It is recommended to enable this switch, and the default is False. |
The value for --input_shape_dict can be found by opening the inference model with the Netron tool.
Learn more about paddle2onnx
The det_db.onnx file will be generated after the above command is executed.
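If you prefer a programmatic check over Netron, the exported file can also be validated and its input shape inspected with the `onnx` Python package. A minimal sketch, assuming `onnx` is installed (`pip install onnx`) and `det_db.onnx` is in the working directory:

```python
# Minimal sketch: validate det_db.onnx and print its input shapes.
import onnx

model = onnx.load("det_db.onnx")
onnx.checker.check_model(model)  # raises an exception if the graph is malformed

for inp in model.graph.input:
    dims = [d.dim_param if d.dim_param else d.dim_value
            for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # expect something like: x [-1, 3, -1, -1]
```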
Use the converter_lite tool on Ascend 310/310P to convert the ONNX file to MindIR. Create config.txt and specify the model input shape:
- If converting to a static shape model, e.g. a static shape of `[1,3,736,1280]`, the config is as follows:

  ```text
  [ascend_context]
  input_format=NCHW
  input_shape=x:[1,3,736,1280]
  ```
- If converting to a dynamic shape (scaling) model, the config is as follows:

  ```text
  [ascend_context]
  input_format=NCHW
  input_shape=x:[1,3,-1,-1]
  dynamic_dims=[736,1280],[768,1280],[896,1280],[1024,1280]
  ```
- If converting to a fully dynamic shape model, the config is as follows:

  ```text
  [acl_build_options]
  input_format=NCHW
  input_shape_range=x:[-1,3,-1,-1]
  ```
A brief explanation of the configuration file parameters is as follows:
Parameter | Attribute | Function Description | Data Type | Value Description |
---|---|---|---|---|
input_format | Optional | Specify the format of the model input | String | Optional values are "NCHW", "NHWC", "ND" |
input_shape | Optional | Specify the shape of the model input. The input_name must be the input name in the original model, arranged in order of input, separated by ";" | String | For example: "input1:[1,64,64,3];input2:[1,256,256,3]" |
dynamic_dims | Optional | Specify dynamic BatchSize and dynamic resolution parameters | String | For example: "dynamic_dims=[48,520],[48,320],[48,384]" |
Learn more about Configuration File Parameters
Run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=det_db.onnx \
    --outputFile=det_db_lite \
    --configFile=config.txt
```
After the above command is executed, the det_db_lite.mindir file will be generated.
A brief explanation of the converter_lite parameters is as follows:
Parameter | Required | Parameter Description | Value Range | Default | Remarks |
---|---|---|---|---|---|
fmk | Yes | Input model format | MINDIR, CAFFE, TFLITE, TF, ONNX | - | - |
saveType | No | Set the exported model to MINDIR or MS model format. | MINDIR, MINDIR_LITE | MINDIR | The cloud-side inference version can only infer models converted to MINDIR format |
modelFile | Yes | Input model path | - | - | - |
outputFile | Yes | Output model path. Do not add a suffix; the ".mindir" suffix will be appended automatically. | - | - | - |
configFile | No | 1) Path to the quantization configuration file after training; 2) Path to the configuration file for extended functions | - | - | - |
optimize | No | Set the model optimization type for the target device. Default is none. | none, general, gpu_oriented, ascend_oriented | - | - |
Learn more about converter_lite
Learn more about Model Conversion Tutorial
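Before wiring the converted model into the full pipeline, it can be sanity-checked with the MindSpore Lite Python API. Below is a minimal sketch, assuming the `mindspore_lite` package is installed and the model was converted with the static input shape `[1,3,736,1280]`:

```python
# Minimal sketch: run one random input through det_db_lite.mindir on Ascend.
import numpy as np
import mindspore_lite as mslite

context = mslite.Context()
context.target = ["ascend"]  # the device this example assumes

model = mslite.Model()
model.build_from_file("det_db_lite.mindir", mslite.ModelType.MINDIR, context)

inputs = model.get_inputs()
inputs[0].set_data_from_numpy(
    np.random.rand(1, 3, 736, 1280).astype(np.float32))

outputs = model.predict(inputs)
# DBNet-style detectors output a probability map, e.g. shape (1, 1, 736, 1280)
print(outputs[0].get_data_to_numpy().shape)
```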
Perform inference using deploy/py_infer/infer.py and the det_db_lite.mindir model file:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/det_db_lite.mindir \
    --det_model_name_or_config=ch_pp_det_OCRv4 \
    --res_save_dir=/path/to/ch_pp_det_OCRv4_results
```
After execution completes, the prediction file det_results.txt will be generated in the directory specified by --res_save_dir.
During inference, you can use the --vis_det_save_dir parameter to save visualizations of the results.
Learn more about infer.py inference parameters
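Beyond the built-in visualization, the predictions can be post-processed directly. A minimal sketch that draws the predicted polygons with OpenCV, assuming each line of `det_results.txt` follows the PaddleOCR-style format `image_name\t<JSON list of {"points": ...}>` (inspect a generated file to confirm the exact layout):

```python
# Minimal sketch: draw predicted boxes from det_results.txt onto the test images.
import json
import os

import cv2
import numpy as np

images_dir = "/path/to/ic15/ch4_test_images"
save_dir = "./det_vis"
os.makedirs(save_dir, exist_ok=True)

with open("/path/to/ch_pp_det_OCRv4_results/det_results.txt", encoding="utf-8") as f:
    for line in f:
        name, raw = line.rstrip("\n").split("\t", 1)
        img = cv2.imread(os.path.join(images_dir, name))
        for det in json.loads(raw):  # one entry per detected text region
            pts = np.array(det["points"], dtype=np.int32)
            cv2.polylines(img, [pts], isClosed=True, color=(0, 255, 0), thickness=2)
        cv2.imwrite(os.path.join(save_dir, name), img)
```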
Evaluate the results using the following command:
```shell
python deploy/eval_utils/eval_det.py \
    --gt_path=/path/to/ic15/test_det_gt.txt \
    --pred_path=/path/to/ch_pp_det_OCRv4_results/det_results.txt
```
Let's take `ch_pp_rec_OCRv4` from the Third-Party Model Support List as an example to introduce the inference method:
- In the Third-Party Model Support List, `infer model` refers to a model file ready for inference, while `train model` refers to a training checkpoint that must first be converted to an inference model.
- If the model file is an `infer model`, like `ch_pp_rec_OCRv4`, download and extract the infer model to get the following folder:

  ```text
  ch_PP-OCRv4_rec_infer/
  ├── inference.pdmodel
  ├── inference.pdiparams
  ├── inference.pdiparams.info
  ```
- If the model file is a `train model`, like `en_pp_rec_vitstr_vitstr`, download and extract the train model to get the following folder:

  ```text
  rec_vitstr_none_ce_train/
  ├── train.log
  ├── best_accuracy.pdopt
  ├── best_accuracy.states
  ├── best_accuracy.pdparams
  ```

  It then needs to be converted to an inference model with the following commands:

  ```shell
  git clone https://github.com/PaddlePaddle/PaddleOCR.git
  cd PaddleOCR
  python tools/export_model.py \
      -c configs/rec/rec_vitstr_none_ce.yml \
      -o Global.pretrained_model=./rec_vitstr_none_ce_train/best_accuracy \
      Global.save_inference_dir=./rec_vitstr
  ```

  which produces the following folder:

  ```text
  rec_vitstr/
  ├── inference.pdmodel
  ├── inference.pdiparams
  ├── inference.pdiparams.info
  ```
Install the paddle2onnx tool (`pip install paddle2onnx`) and convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir ch_PP-OCRv4_rec_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file rec_crnn.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,48,-1]}" \
    --enable_onnx_checker True
```
The rec_crnn.onnx file will be generated after the above command is executed. Please refer to 3.1.2 Convert the thirdparty model to onnx file for details about paddle2onnx.
Use the converter_lite tool on Ascend 310/310P to convert the ONNX file to MindIR. Create config.txt and specify the model input shape:
- If converting to a static shape model, e.g. a static shape of `[1,3,48,320]`, the config is as follows:

  ```text
  [ascend_context]
  input_format=NCHW
  input_shape=x:[1,3,48,320]
  ```
- If converting to a dynamic shape (scaling) model, the config is as follows:

  ```text
  [ascend_context]
  input_format=NCHW
  input_shape=x:[1,3,-1,-1]
  dynamic_dims=[48,520],[48,320],[48,384],[48,360],[48,394],[48,321],[48,336],[48,368],[48,328],[48,685],[48,347]
  ```
- If converting to a fully dynamic shape model, the config is as follows:

  ```text
  [acl_build_options]
  input_format=NCHW
  input_shape_range=x:[-1,3,-1,-1]
  ```
For a brief description of the configuration parameters, please refer to 3.1.3 Convert onnx file to Lite MindIR file.
Run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=rec_crnn.onnx \
    --outputFile=rec_crnn_lite \
    --configFile=config.txt
```
After the above command is executed, the rec_crnn_lite.mindir file will be generated.
For a brief description of the converter_lite parameters, see the text detection example above.
Learn more about converter_lite
Learn more about Model Conversion Tutorial
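Since the recognition model above was converted with `dynamic_dims`, its inputs must be resized to one of the declared shapes before running it through the MindSpore Lite Python API directly. A minimal sketch, under the same assumptions as the detection sanity check earlier:

```python
# Minimal sketch: run rec_crnn_lite.mindir with one of the shapes
# declared in config.txt (here [1, 3, 48, 320]).
import numpy as np
import mindspore_lite as mslite

context = mslite.Context()
context.target = ["ascend"]

model = mslite.Model()
model.build_from_file("rec_crnn_lite.mindir", mslite.ModelType.MINDIR, context)

inputs = model.get_inputs()
model.resize(inputs, [[1, 3, 48, 320]])  # pick a shape listed in dynamic_dims
inputs[0].set_data_from_numpy(
    np.random.rand(1, 3, 48, 320).astype(np.float32))

outputs = model.predict(inputs)
print(outputs[0].get_data_to_numpy().shape)  # per-timestep character logits
```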
According to the Third-Party Model Support List, download the ppocr_keys_v1.txt dictionary file that matches ch_pp_rec_OCRv4.
Perform inference using deploy/py_infer/infer.py and the rec_crnn_lite.mindir model file:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/mlt17_ch \
    --rec_model_path=/path/to/mindir/rec_crnn_lite.mindir \
    --rec_model_name_or_config=ch_pp_rec_OCRv4 \
    --character_dict_path=/path/to/ppocr_keys_v1.txt \
    --res_save_dir=/path/to/ch_rec_infer_results
```
After execution completes, the prediction file rec_results.txt will be generated in the directory specified by --res_save_dir.
Learn more about infer.py inference parameters
Evaluate the results using the following command:
```shell
python deploy/eval_utils/eval_rec.py \
    --gt_path=/path/to/mlt17_ch/chinese_gt.txt \
    --pred_path=/path/to/ch_rec_infer_results/rec_results.txt
```
Refer to Dataset converters for dataset preparation.
Let's take `ch_pp_mobile_cls_v2.0` from the Third-Party Model Support List as an example to introduce the inference method:
In the Third-Party Model Support List, `ch_pp_mobile_cls_v2.0` is an `infer model`, so no conversion from a training checkpoint is needed. Download and extract it to get the following folder:

```text
ch_ppocr_mobile_v2.0_cls_infer/
├── inference.pdmodel
├── inference.pdiparams
├── inference.pdiparams.info
```
Convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir ch_ppocr_mobile_v2.0_cls_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file cls_mv3.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,-1,-1]}" \
    --enable_onnx_checker True
```
The cls_mv3.onnx file will be generated after the above command is executed. Please refer to 3.1.2 Convert the thirdparty model to onnx file for details about paddle2onnx.
Refer to 3.1.3 Convert onnx file to Lite MindIR file and create config.txt; here we take the dynamic shape config as an example:
```text
[acl_build_options]
input_format=NCHW
input_shape_range=x:[-1,3,-1,-1]
```
Then run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=cls_mv3.onnx \
    --outputFile=cls_mv3_lite \
    --configFile=config.txt
```
After the above command is executed, the cls_mv3_lite.mindir file will be generated.
Prepare the MindIR files according to the Text Detection, Text Recognition, and Text Direction Classification sections above, then run the following command to perform end-to-end inference:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/det_db_lite.mindir \
    --det_model_name_or_config=ch_pp_det_OCRv4 \
    --cls_model_path=/path/to/mindir/cls_mv3_lite.mindir \
    --cls_model_name_or_config=ch_pp_mobile_cls_v2.0 \
    --rec_model_path=/path/to/mindir/rec_crnn_lite.mindir \
    --rec_model_name_or_config=ch_pp_rec_OCRv4 \
    --character_dict_path=/path/to/ppocr_keys_v1.txt \
    --res_save_dir=/path/to/infer_results
```
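After the command finishes, the recognized texts and their locations are written to a result file under `--res_save_dir`. A minimal sketch for consuming it, assuming the same `image_name\t<JSON list>` line format as above with `transcription` and `points` fields per entry (the file name used here, `pipeline_results.txt`, is hypothetical; check the output directory for the actual name):

```python
# Minimal sketch: print the recognized text per image from end-to-end results.
# The file name and line format below are assumptions; inspect the files
# generated under --res_save_dir to confirm them.
import json

with open("/path/to/infer_results/pipeline_results.txt", encoding="utf-8") as f:
    for line in f:
        name, raw = line.rstrip("\n").split("\t", 1)
        for item in json.loads(raw):
            print(name, item["transcription"], item["points"])
```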
For problems with model conversion and inference, please refer to the FAQ for solutions.