Software
Clemens Neudecker edited this page Jun 16, 2016
- ABBYY FineReader Engine is an OCR SDK that gives developers, integrators and BPOs the tools they require to integrate optical text recognition technologies into their applications.
- CCS docWorks is a software program used by the most renowned libraries, publishing houses, and companies worldwide to digitize and convert their valuable library holdings and archives for easy access, searchability, and long-term preservation.
- OCLC CONTENTdm makes everything in your digital collections available to everyone, everywhere. No matter the format — local history archives, newspapers, books, maps, slide libraries or audio/video — CONTENTdm can handle the storage, management and delivery of your collections to users across the Web.
- Veridian is presentation software that makes it easy to search, view, and interact with digital collections on the Internet. Veridian supports almost any type of content such as books, magazines, journals, newspapers, photographs, maps, and audio/video files and makes them easily accessible to anyone online.
- BNLViewer is a METS/ALTO viewer written in Java and JavaScript by the National Library of Luxembourg.
- Aletheia (an advanced document analysis system) as well as other commercial and/or open source PRImA tools such as OCR text and layout performance evaluation, viewers, and converters support ALTO as input format.
- https://github.com/UB-Mannheim/ocr-transform
  Convert between Tesseract hOCR and ALTO XML 2.0/2.1 using XSL stylesheets.
- https://github.com/edsu/alto-words
  A simplistic demonstration of how to calculate the ratio of dictionary words to all words in a METS/ALTO OCR XML file.
- https://github.com/tokee/quack
  An enhanced ALTO viewer for quality-assurance-oriented display of collections of scans, typically from books or newspapers.
- https://github.com/tokee/alto-ocr-cleanup
  Experiments with cleanup of dirty ALTO OCR files using anagram hashing.
- https://github.com/ironymark/AbbyyToAlto
  A simple converter written in PHP5 to convert ABBYY FineReader XML into the ALTO XML document format.
- https://github.com/Mewel/abbyy-to-alto
  A simple Java-based tool to convert ABBYY FineReader XML to ALTO XML.
- https://github.com/KBNLresearch/alto-editor
  Browser-based post-correction tool for ALTO XML files, version 1.
- https://github.com/renevanderark/altoedit-2.0
  Browser-based post-correction tool for ALTO XML files, version 2.
- https://github.com/cneud/alto-tools
  Python script for various operations on ALTO files.
- https://github.com/EuropeanaNewspapers/europeananp-ner
  Named Entity Recognition based on the Stanford Named Entity Recognizer, with support for ALTO.
- http://dfg-viewer.de/en/regarding-the-project/
  Browser web service for displaying digital representations from decentralized library repositories.
- https://github.com/impactcentre/ocrevalUAtion
  Evaluation of OCR output against a reference text (multiple formats supported, incl. ALTO).
- https://github.com/INL/OpenConvert
  OCR/text format conversion tool; supports ALTO as input format to create TEI or FoLiA.
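
Several of the tools above read the recognized words out of ALTO files, where each word is a `String` element with a `CONTENT` attribute. As a rough illustration of the dictionary-word ratio mentioned for alto-words, here is a minimal Python sketch. It is not taken from any of the listed projects: the function names `alto_words` and `dictionary_ratio`, the inline sample document, and the tiny word set are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET


def alto_words(xml_text):
    """Yield the CONTENT of every ALTO String element, ignoring the namespace."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Strip any "{namespace}" prefix so the check works across ALTO versions.
        if elem.tag.rsplit("}", 1)[-1] == "String":
            content = elem.get("CONTENT")
            if content:
                yield content


def dictionary_ratio(xml_text, dictionary):
    """Return the share of words found in `dictionary` (0.0 if there are no words)."""
    # Trim leading/trailing punctuation and lowercase before the lookup.
    words = [re.sub(r"^\W+|\W+$", "", w).lower() for w in alto_words(xml_text)]
    words = [w for w in words if w]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in dictionary)
    return hits / len(words)


# Hypothetical sample: one text line with a single OCR error ("qvick").
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="The"/><SP/><String CONTENT="qvick"/><SP/>
    <String CONTENT="brown"/><SP/><String CONTENT="fox."/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""

ratio = dictionary_ratio(SAMPLE, {"the", "quick", "brown", "fox"})
print(ratio)  # 0.75: three of the four words are in the dictionary
```

A real word list would be far larger, and a low ratio only hints at poor OCR quality; names, abbreviations, and historical spellings also lower it.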