Support for non-Latin characters #8

robertknight · 2024-01-03T15:25:52Z

ocrs should support models that can recognize non-Latin text.

Essential sub-tasks:

Eliminate the hard-coded alphabet from the recognition process. IIRC this was from an EasyOCR model that I used at one point. (feat: customizable alphabet using OcrEngineParams #100)

Additional sub-tasks:

Create a proof-of-concept recognition model that recognizes a non-Latin alphabet
Create a standard file format for model metadata (including the alphabet) and support loading it. For example, a config.json file could be shipped alongside the model and contain the alphabet.
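
To illustrate what removing the hard-coded alphabet means in practice, here is a minimal sketch of greedy decoding where the alphabet is passed in by the caller. It is not the ocrs implementation: `decode_greedy` is a hypothetical helper, and the label layout (class 0 is the blank, class `i` maps to the `(i - 1)`-th character of the alphabet) is an assumption made for the example.

```rust
/// Greedy CTC-style decoding with a caller-supplied alphabet.
///
/// Hypothetical helper for illustration; not part of the ocrs API.
/// Assumed label layout: class 0 is the blank label, and class `i`
/// (for `i >= 1`) maps to the `(i - 1)`-th character of `alphabet`.
pub fn decode_greedy(class_ids: &[usize], alphabet: &str) -> String {
    // Collect into `char`s so that multi-byte (non-Latin) characters are
    // indexed per character rather than per byte.
    let chars: Vec<char> = alphabet.chars().collect();
    let mut out = String::new();
    let mut prev = 0usize;

    for &id in class_ids {
        // Skip blanks and consecutive repeats of the same label.
        if id != 0 && id != prev {
            // Labels outside the alphabet become U+FFFD instead of panicking.
            out.push(chars.get(id - 1).copied().unwrap_or('\u{FFFD}'));
        }
        prev = id;
    }
    out
}
```

With the alphabet `"你好世界"`, the label sequence `[1, 1, 0, 2, 0, 3, 3, 4]` decodes to `你好世界`. Because the alphabet is ordinary data, it could just as well be read from a metadata file shipped next to the model.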


DehaiWang · 2024-01-06T06:11:21Z

Will this support Chinese characters?

robertknight · 2024-01-06T06:43:32Z

The goal is to make this possible. There are a lot of details still to be figured out.

robertknight · 2024-09-01T05:30:43Z

> Eliminating the hard-coded alphabet from the recognition process. IIRC this was from an EasyOCR model that I used at one point.

This was completed in #100.

robertknight mentioned this issue Jan 7, 2024

Roadmap for 2024 #14

Open

6 tasks

xring mentioned this issue Jan 8, 2024

Add document for supported Characters/Languages in README #16

Closed
