The Differentiable Binarization (DB) algorithm is a segmentation-based approach to scene text detection that handles curved text effectively.
- Improved Text Detection: The algorithm accurately localizes text within images, even when the text is curved or distorted.
- Better Input for Text Recognition: Tighter detected regions give a downstream recognizer cleaner crops, so the text is more likely to be extracted correctly.
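To make the name concrete: in the DB paper, the hard threshold step is replaced by a steep sigmoid over the difference between a probability value P and a learned threshold value T, scaled by an amplifying factor k (50 in the paper). The sketch below works on single pixel values in plain Python. It is only an illustration of the formula, not this repository's implementation, and the function names are made up for the example.

```python
import math


def hard_binarize(p: float, t: float) -> float:
    """Standard binarization: a step function, not differentiable at p == t."""
    return 1.0 if p >= t else 0.0


def differentiable_binarize(p: float, t: float, k: float = 50.0) -> float:
    """Approximate binarization: B = 1 / (1 + exp(-k * (p - t)))."""
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


if __name__ == "__main__":
    threshold = 0.3
    for prob in (0.1, 0.29, 0.3, 0.31, 0.9):
        soft = differentiable_binarize(prob, threshold)
        print(prob, hard_binarize(prob, threshold), round(soft, 4))
```

Because the sigmoid is smooth, gradients can flow through the binarization step during training, while values far from the threshold still saturate close to 0 or 1.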
The dependencies are listed in requirements.txt. Install them with the command below:
pip install -r requirements.txt
Please download the ICDAR2015 and TotalText datasets and set up the folder structure as follows (shown here for ICDAR2015):
dataset/icdar2015
|test
|____|gt
|____|______gt_img_1.txt
|____|______gt_img_2.txt
|____|images
|____|______img_1.jpg
|____|______img_2.jpg
|train
|____|gt
|____|______gt_img_1.txt
|____|______gt_img_2.txt
|____|images
|____|______img_1.jpg
|____|______img_2.jpg
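Before training, it can help to confirm that every image has a matching annotation file. The following is a minimal sketch that assumes the naming shown above (`images/img_N.jpg` paired with `gt/gt_img_N.txt`). The helper `pair_samples` is a hypothetical name and is not part of this repository.

```python
from pathlib import Path


def pair_samples(split_dir):
    """Pair each image in a split (e.g. dataset/icdar2015/train) with its gt file.

    Returns (pairs, missing): a list of (image_path, gt_path) tuples and a list
    of image file names that have no matching annotation file.
    """
    split_dir = Path(split_dir)
    pairs, missing = [], []
    for image_path in sorted((split_dir / "images").glob("*.jpg")):
        gt_path = split_dir / "gt" / f"gt_{image_path.stem}.txt"
        if gt_path.is_file():
            pairs.append((image_path, gt_path))
        else:
            missing.append(image_path.name)
    return pairs, missing


if __name__ == "__main__":
    for split in ("train", "test"):
        pairs, missing = pair_samples(Path("dataset/icdar2015") / split)
        print(f"{split}: {len(pairs)} paired, {len(missing)} images without gt")
```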
Before training, modify the configuration in src/configs/det_icdar2015.yml as needed. Then train the model:
python -m src.train
Evaluate the trained model:
python -m src.evaluate
Run prediction on a single image:
python -m src.predict --image_path <path_to_image>
Example:
python -m src.predict --image_path images/example.jpg
test1.png | test2.png |
---|---|
Export format - backbone | Image size | mAP | mAP_50 | mAP_75 | Inference time (RTX 3060) | Learning rate |
---|---|---|---|---|---|---|
PyTorch - ResNet18 | 736x736 | 0.36 | 0.65 | 0.36 | 0.003s | 0.0005 |
TorchScript - ResNet18 | 736x736 | 0.36 | 0.65 | 0.36 | 0.0018s | 0.0005 |
PyTorch - ResNet50 | 736x736 | 0.40 | 0.70 | 0.40 | 0.003s | 0.007 |
TorchScript - ResNet50 | 736x736 | 0.40 | 0.70 | 0.40 | 0.004s | 0.0007 |
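Latencies of a few milliseconds are sensitive to how they are measured. The sketch below shows one common way to average wall-clock time over repeated calls after a warm-up phase. It is not necessarily how the numbers in the table were obtained, and `average_latency` and the stand-in workload are hypothetical names for illustration. When timing a model on a GPU, also call `torch.cuda.synchronize()` before reading the clock, since CUDA kernels run asynchronously.

```python
import time


def average_latency(fn, warmup=5, runs=50):
    """Average wall-clock seconds per call of fn(), excluding warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # Stand-in workload; replace with a call that runs the model on one image.
    def dummy():
        return sum(i * i for i in range(10_000))

    print(f"{average_latency(dummy):.6f} s per call")
```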