Arabic Dates OCR

This repo contains code for Optical Character Recognition (OCR) of dates written in Arabic numbers. The code is written using TensorFlow.

Note: The ArabicDatesVocabulary class creates a hash table that maps Arabic characters to labels. On my Windows device the Arabic characters were not encoded correctly, although the same code worked fine on Google Colab, so this appears to be a device-specific problem.
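One possible cause of this kind of platform difference (an assumption, not something confirmed for this repo) is the default text encoding: Windows often defaults to a legacy code page such as cp1252, which cannot represent Arabic characters, while Colab defaults to UTF-8. The sketch below uses only the standard library and does not reproduce ArabicDatesVocabulary; the character set and the helper names (`build_vocab`, `save_chars`, `load_vocab`, `encode`, `can_encode`) are hypothetical. It shows a character-to-label mapping that is written and read with an explicit UTF-8 encoding.

```python
import os
import tempfile

# Hypothetical character set: the ten Arabic-Indic digits (U+0660..U+0669)
# plus "/" as a date separator. The real vocabulary may differ.
CHARS = "".join(chr(0x0660 + i) for i in range(10)) + "/"


def build_vocab(chars):
    """Map each character to an integer label (its index in `chars`)."""
    return {ch: idx for idx, ch in enumerate(chars)}


def save_chars(path, chars):
    # Explicit UTF-8 so the result does not depend on the platform default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(chars)


def load_vocab(path):
    # Read with the same explicit encoding used when writing.
    with open(path, "r", encoding="utf-8") as f:
        return build_vocab(f.read())


def encode(text, vocab):
    """Convert a date string into a list of integer labels."""
    return [vocab[ch] for ch in text]


def can_encode(text, codec):
    """Return True if `codec` can represent every character in `text`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Round-trip the character set through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocab.txt")
    save_chars(vocab_path, CHARS)
    vocab = load_vocab(vocab_path)

# Build a short sample from code points: digits 2, 0, then "/", then digit 1.
date = chr(0x0662) + chr(0x0660) + "/" + chr(0x0661)
labels = encode(date, vocab)
```

If `can_encode(CHARS, "cp1252")` is false on the affected machine while `can_encode(CHARS, "utf-8")` is true, passing `encoding="utf-8"` wherever the vocabulary or label files are opened is a reasonable first thing to try.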

Setup

Create virtual environment

$ python -m venv .venv
# Linux
$ source .venv/bin/activate
# Windows
$ .venv\Scripts\activate

Install dependencies

$ pip install -r requirements.txt

Training the Model

A training script is provided for training on your own data. Additionally, you can provide your own checkpoints.

$ python ocrnet/train.py --model=./models/model --train_dataset=./dataset/ --output_path=./ocr_model/
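For readers who want to wrap the command above in their own tooling, here is a minimal sketch of how its three flags could be parsed with `argparse`. It is not the actual argument handling in ocrnet/train.py, which is not shown here; the help texts, the `required` settings, and the default value are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three flags used in the training command."""
    parser = argparse.ArgumentParser(
        description="Illustrative parser for the training flags (not the repo's own)."
    )
    # Assumed meaning: path to a model or checkpoint to start from.
    parser.add_argument("--model", required=True, help="Path to the model or checkpoint.")
    # Assumed meaning: directory that holds the training data.
    parser.add_argument("--train_dataset", required=True, help="Path to the training data.")
    # Assumed default; the real script may require this flag or use another default.
    parser.add_argument("--output_path", default="./ocr_model/", help="Where to save the trained model.")
    return parser


# Parse the same values that appear in the command above.
args = build_parser().parse_args([
    "--model=./models/model",
    "--train_dataset=./dataset/",
    "--output_path=./ocr_model/",
])
print(args.model, args.train_dataset, args.output_path)
```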

Inference

An inference script is provided for running the model on your own data. Additionally, you can provide your own checkpoints.

$ python ocrnet/inference.py --model=./models/model --test_dataset=./dataset/

Model Conversion

You can convert the model to ONNX using the command below.

$ python -m tf2onnx.convert --saved-model ./models/model --output ./models/model.onnx

To convert to TensorRT, install tensorrt, then run the command below.

$ python ocrnet/converter.py --input_name=./models/model --output=./models/trt_model
