As part of this project, we examine several approaches to handwritten text recognition based on convolutional neural networks and long short-term memory (LSTM) networks.
All approaches follow the same method of breaking the image down into progressively smaller parts:
- lines
- words
- characters
The two best approaches are explained in the written report (available only in German), which you can find alongside the source code folders of this repository. In addition, there is an explanation of the object detection approach YOLOv1 and of the End-to-End Trainable Neural Network for Image-based Sequence Recognition (CRNN), both of which are used in all approaches.
Instructors
- Prof. Dr. Visvanathan Ramesh, email: mehler@em.uni-frankfurt.de
- Martin Mundt, email: mmundt@em.uni-frankfurt.de
Institutions
Project team
- Martin Ludwig
- Pascal Fischer
Tools
- Python 3
- PyTorch
- Pillow
- OpenCV
Dataset
We use only data from the IAM Handwriting Database for training and testing.
The database consists of:
- handwriting samples contributed by 657 writers
- 1'539 pages of scanned text
- 5'685 isolated and labeled sentences
- 13'353 isolated and labeled text lines
- 115'320 isolated and labeled words
All form, line, and word images are provided as PNG files. The corresponding form label files, which include segmentation information and a variety of estimated parameters, are provided as meta-information in XML format. The format is described in the database's "XML file" and "XML file format (DTD)" documentation.
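To illustrate how the segmentation information in the XML label files could be read, here is a minimal sketch using only the Python standard library. The element and attribute names (`line`, `word`, `cmp`, `text`, `x`, `y`, `width`, `height`) and the inline `SAMPLE_XML` are assumptions made for illustration; check them against the DTD shipped with the database before relying on them. The helper `word_boxes` is hypothetical and not part of this repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample; real label files are larger and the
# element/attribute names below are assumptions to verify against the DTD.
SAMPLE_XML = """
<form id="a01-000u">
  <handwritten-part>
    <line id="a01-000u-00" text="A MOVE">
      <word id="a01-000u-00-00" text="A">
        <cmp x="408" y="768" width="27" height="51"/>
      </word>
      <word id="a01-000u-00-01" text="MOVE">
        <cmp x="507" y="766" width="75" height="50"/>
        <cmp x="590" y="770" width="138" height="46"/>
      </word>
    </line>
  </handwritten-part>
</form>
"""


def word_boxes(xml_text):
    """Return (text, (x0, y0, x1, y1)) for every word that has components.

    The word box is the union of the boxes of its <cmp> children.
    """
    root = ET.fromstring(xml_text.strip())
    result = []
    for word in root.iter("word"):
        comps = word.findall("cmp")
        if not comps:
            continue  # no segmentation information for this word
        x0 = min(int(c.get("x")) for c in comps)
        y0 = min(int(c.get("y")) for c in comps)
        x1 = max(int(c.get("x")) + int(c.get("width")) for c in comps)
        y1 = max(int(c.get("y")) + int(c.get("height")) for c in comps)
        result.append((word.get("text"), (x0, y0, x1, y1)))
    return result


if __name__ == "__main__":
    for text, box in word_boxes(SAMPLE_XML):
        print(text, box)
```

Boxes in this `(x0, y0, x1, y1)` form can be passed directly to Pillow's `Image.crop` to cut word images out of a scanned page.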
Results
We compare our best approach with the state-of-the-art CRNN approach using the character error rate (CER); lower is better.
Approach | CER % |
---|---|
CRNN | 5.7 |
Our best | 10.64 |
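For reference, the character error rate is commonly defined as the Levenshtein (edit) distance between the predicted and the reference text, divided by the number of reference characters. The sketch below follows that common definition and aggregates over the whole test set; whether the numbers in the table were aggregated this way or averaged per sample is not stated here, so treat it as an assumption. The function names `levenshtein` and `cer` are illustrative only.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn `hyp` into `ref` (dynamic programming, two rows)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Character error rate in percent over a list of text pairs.

    Assumption: errors and reference lengths are summed over all pairs
    before dividing (corpus-level CER).
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars


if __name__ == "__main__":
    print(cer(["hello"], ["hallo"]))  # one substitution in five characters
```

Note that the CER can exceed 100 % when the prediction is much longer than the reference, because insertions count as errors.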
Source Code
The source code of all approaches is available as .ipynb Python notebooks in Google Colab format.