I ran into an issue when adding the Thai language to Scribe OCR:
I just added tha.traineddata.gz to \tess\lang, but the console log shows "Error: Tesseract (legacy) engine requested, but components are not present in ./tha.traineddata!! Failed loading language 'tha'".
Sometimes I also encounter "Error opening data file tessdata/tha.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language 'tha' Tesseract couldn't load any languages! Could not initialize tesseract.".
Do you have instructions or a plan for adding the Thai language to Scribe OCR?
Thai should eventually be supported; however, a couple of issues make this more difficult than it is for other languages.
First, the error message you pasted indicates we are not getting the correct language data from Tesseract.js. This is a good catch and should be patched. I opened an issue in that repo (naptha/tesseract.js#931) and will fix it at some point.
Second, adding Thai involves adding a new script. Adding new languages that use Latin script is trivial, as English, Spanish, French, etc. all use the same font files. However, adding Thai will involve adding additional font resources, as well as code to load and switch between them.
All of the above is completely doable, but it is more involved than simply adding Thai to the list of languages in the UI.
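To give a rough idea of what "code to load and switch between fonts" could look like, here is a minimal sketch. It is not Scribe OCR's actual code: the `Script` type, the `FONT_BY_SCRIPT` table, the font family names, and both helper functions are hypothetical and only for illustration. The one fixed fact it relies on is that Thai characters live in the Unicode block U+0E00 to U+0E7F.

```typescript
// Hypothetical script identifiers; a real implementation would cover more scripts.
type Script = "latin" | "thai";

// Hypothetical mapping from script to font family. The names are placeholders,
// not the fonts Scribe OCR actually ships.
const FONT_BY_SCRIPT: Record<Script, string> = {
  latin: "NotoSans",
  thai: "NotoSansThai",
};

// Returns "thai" if any character falls in the Thai Unicode block
// (U+0E00 to U+0E7F); otherwise falls back to "latin".
function detectScript(text: string): Script {
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code >= 0x0e00 && code <= 0x0e7f) {
      return "thai";
    }
  }
  return "latin";
}

// Picks the font family to use when rendering a recognized word or line.
function fontForText(text: string): string {
  return FONT_BY_SCRIPT[detectScript(text)];
}
```

With a lookup like this, the Latin-script languages keep sharing one font entry, while Thai needs its own font file plus the switching logic, which is the extra work described above.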