@axa-fr/ocr

About
How to consume
Contribute

About

axa-fr-ocr is a utility library built on top of Tesseract. It provides algorithms to deskew text, which improves OCR performance, and an algorithm to determine whether an image contains text.
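To make the idea of deskewing concrete, here is a minimal sketch of one common technique, the projection-profile method. It is not taken from this library and may differ from what `axa_fr_ocr.text.deskew` actually does. The function names (`row_profile_score`, `estimate_skew`) and the synthetic point set are hypothetical and used only for illustration. The idea: try several candidate angles, rotate the dark-pixel coordinates back by each one, and keep the angle at which the pixels concentrate into the fewest rows.

```python
import math


def row_profile_score(points, angle_deg):
    """Score how sharply points cluster into rows after rotating by -angle_deg.

    The score is the sum of squared row counts: it is highest when the
    points of each text line fall into a single row.
    """
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    counts = {}
    for x, y in points:
        # Row index of the point once the candidate rotation is undone.
        row = round(-x * sin_t + y * cos_t)
        counts[row] = counts.get(row, 0) + 1
    return sum(c * c for c in counts.values())


def estimate_skew(points, max_angle=10.0, step=0.5):
    """Return the candidate angle (in degrees) with the sharpest row profile."""
    n_steps = int(round(2 * max_angle / step))
    candidates = [i * step - max_angle for i in range(n_steps + 1)]
    return max(candidates, key=lambda a: row_profile_score(points, a))


# Synthetic "page": three horizontal text lines, rotated by 3 degrees.
skew = math.radians(3.0)
points = [
    (x * math.cos(skew) - y * math.sin(skew),
     x * math.sin(skew) + y * math.cos(skew))
    for y in (0, 20, 40)
    for x in range(200)
]

estimated_angle = estimate_skew(points)
print(estimated_angle)
```

On real images the points would come from thresholded pixels, and the search range and step are tuning choices rather than fixed values.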

How to consume

pip install axa-fr-ocr

import time
from io import BytesIO

import cv2
import numpy as np
import logging

from axa_fr_ocr.ocr import Ocr
from axa_fr_ocr.image import normalize_size
from axa_fr_ocr.text.deskew import deskew
from axa_fr_ocr.text.orientation import orientate
from axa_fr_ocr.text.text import is_image_contain_text


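# Open the file in binary read mode. cv2.imdecode returns None when the bytes
# are not in an image format that OpenCV can decode.
with open('.path.pdf', 'rb') as stream:
    nparr = np.frombuffer(stream.read(), np.uint8)
    image_origin = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    # Bound the working image size, then detect text on a smaller copy.
    image_cv, ratio_cv = normalize_size(image_origin, 2200)
    small_img_cv, ratio = normalize_size(image_cv, 800)
    is_img_with_text, distance, img_mser_cv = is_image_contain_text(small_img_cv)
    if not is_img_with_text:
        print("No text")
    else:
        # Assumption: deskew() returns the deskewed image as its first value,
        # as orientate() does. Adjust the unpacking to the actual signature.
        img_deskew_cv = deskew(image_cv)[0]
        img_straighted_cv, angle_orientation, duration_orientation = orientate(img_deskew_cv)
        # Re-encode the straightened image as PNG bytes for the OCR step.
        img_straighted = BytesIO(cv2.imencode(".png", img_straighted_cv)[1].tobytes())
        ocr = Ocr(logging)
        img_text, ocr_confidence = ocr.process_ocr(img_straighted)
        print(img_text)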
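The example above calls `normalize_size` twice, with 2200 and then 800, and each call returns an image together with a ratio. The sketch below shows the arithmetic such a step plausibly involves: scaling so that the longest side fits a limit while keeping the aspect ratio. This is an assumption for illustration, not the library's code. The helper name `fit_within` is hypothetical, and whether `normalize_size` leaves smaller images untouched is not stated in this README.

```python
def fit_within(width, height, max_side):
    """Return (new_width, new_height, ratio) with the longest side <= max_side.

    Images that already fit are left unchanged (ratio 1.0). That choice is an
    assumption made for this sketch.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height, 1.0
    ratio = max_side / longest
    return round(width * ratio), round(height * ratio), ratio


# A 4400x3300 scan is first bounded to 2200, then to 800 for text detection.
w1, h1, r1 = fit_within(4400, 3300, 2200)
w2, h2, r2 = fit_within(w1, h1, 800)
print((w1, h1, r1), (w2, h2, r2))
```

Keeping the ratio from each step makes it possible to map coordinates found on the small image back to the larger one by dividing by that ratio.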

