BengaliAI/SylhetiNagriOCR

Sylheti-Nagri-OCR

Execution

  • Run scripts/resource_gen.ipynb with the variable paths set for your environment. The notebook generates:

    • A synthetic grapheme-based dictionary
    • A single-line vocab file (needed by tools such as synthtiger and EasyOCR)
    • A multi-line vocab file (needed by synthindic)
    • A dictionary of the unique words in the training data
    • Separate folds (train and test), plus a data.txt file in which the absolute image path and the label are tab-separated
  • Synthtiger: follow the advanced usage section of the synthtiger documentation

    • The required scripts are stored under scripts/synthtiger

    • Font Customization:

      python scripts/synthtiger/extract_font_charset.py -w 1 /home/apsisdev/OCR/SylhetiNagri/fonts/
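
The resource_gen outputs listed above can be illustrated with a minimal sketch. Only the data.txt layout (absolute image path, a tab, then the label) comes from this README. The function names, the file names, and the reading of "single line" as all entries on one line and "multi line" as one entry per line are assumptions made for illustration, not the notebook's actual code.

```python
import os
from pathlib import Path


def write_vocab_files(entries, single_path, multi_path):
    """Write one vocabulary in two layouts (assumed layouts, see note above).

    single_path: all unique entries joined on one line.
    multi_path:  one unique entry per line.
    """
    unique = sorted(set(entries))
    Path(single_path).write_text("".join(unique) + "\n", encoding="utf-8")
    Path(multi_path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return unique


def write_data_txt(samples, out_path):
    """Write (image_path, label) pairs as '<absolute path>\\t<label>' lines."""
    with open(out_path, "w", encoding="utf-8") as f:
        for image_path, label in samples:
            f.write(f"{os.path.abspath(image_path)}\t{label}\n")


def read_data_txt(path):
    """Read data.txt back into a list of (absolute_image_path, label) pairs."""
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the first tab only, so labels may contain spaces.
            image_path, label = line.split("\t", 1)
            rows.append((image_path, label))
    return rows
```

Splitting on the first tab only keeps labels that contain spaces intact, which matters for line-level labels.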

System Info:

OS          : Ubuntu 22.04.2 LTS x86_64 
Host        : Z490 GAMING X AX -CF 
Kernel      : 5.19.0-38-generic 
Shell       : bash 5.1.16 
DE          : GNOME 42.5 
CPU         : Intel i9-10900K (20) @ 5.300GHz 
GPU         : NVIDIA GeForce RTX 3090 
Memory      : 13503MiB / 32022MiB 

TODO

  • GitHub installation and configuration
  • resource_gen documentation
  • synthtiger generation
  • synthindic generation
  • tfrs2 records
  • training notebook
  • inference notebook
  • evaluation on real data
