Software
Jean-Philippe Moreux edited this page Jun 30, 2016 · 36 revisions
- ABBYY FineReader Engine is an OCR SDK that gives developers, integrators and BPOs the tools they require to integrate optical text recognition technologies into their applications.
- docWorks is a software solution for digitizing and converting library holdings and archives for easy access, searchability, and long-term preservation. By default, docWorks generates METS/ALTO output; it can also transform that output into further formats such as ePUB, PDF, plain text, or RTF.
- OCLC CONTENTdm makes everything in your digital collections available to everyone, everywhere. No matter the format — local history archives, newspapers, books, maps, slide libraries or audio/video — CONTENTdm can handle the storage, management and delivery of your collections to users across the Web.
- Veridian is presentation software that makes it easy to search, view, and interact with digital collections on the Internet. Veridian supports almost any type of content such as books, magazines, journals, newspapers, photographs, maps, and audio/video files and makes them easily accessible to anyone online.
- BNLViewer
  METS/ALTO viewer written in Java and JavaScript, from the National Library of Luxembourg.
- https://github.com/tokee/quack
  An enhanced ALTO viewer for quality-assurance-oriented display of collections of scans, typically from books or newspapers.
- http://dfg-viewer.de/en/regarding-the-project/
  Browser web service for displaying digital representations from decentralized library repositories.
- Aletheia (an advanced document analysis system), as well as other commercial and/or open source PRImA tools such as OCR text and layout performance evaluation tools, viewers, and converters, support ALTO as an input format.
- https://github.com/KBNLresearch/alto-editor
  Browser-based post-correction tool for ALTO XML files, version 1.
- https://github.com/renevanderark/altoedit-2.0
  Browser-based post-correction tool for ALTO XML files, version 2.
- https://github.com/cneud/alto-tools
  Python script for various operations on ALTO files.
- https://github.com/EuropeanaNewspapers/europeananp-ner
  Named Entity Recognition based on the Stanford Named Entity Recognizer, with support for ALTO.
- https://github.com/impactcentre/ocrevalUAtion
  Evaluation of OCR output against a reference text (multiple formats supported, incl. ALTO).
- https://github.com/UB-Mannheim/ocr-transform
  Converts between Tesseract hOCR and ALTO XML 2.0/2.1 using XSL stylesheets.
- https://github.com/ironymark/AbbyyToAlto
  A simple converter written in PHP5 that converts Abbyy FineReader XML into the ALTO XML document format.
- https://github.com/Mewel/abbyy-to-alto
  A simple Java-based tool to convert Abbyy FineReader XML to ALTO XML.
- https://github.com/INL/OpenConvert
  OCR/text format conversion tool; supports ALTO as an input format to create TEI and FoLiA.
- https://github.com/altomator/ALTO-HTML
  ALTO to HTML batch converter that handles the ALTO tags feature (tags were introduced in ALTO v2). Based on XSLT and DOS scripts.
- https://github.com/edsu/alto-words
  A simplistic demonstration of how to calculate the ratio of dictionary words to all words in a METS/ALTO OCR XML file.
- https://github.com/tokee/alto-ocr-cleanup
  Experiments with cleanup of dirty ALTO OCR files using anagram hashing.
- https://github.com/altomator/EN-data_mining
  METS/ALTO data mining tool: extraction of quantitative metadata from METS/ALTO newspaper documents. Based on XSLT or Perl scripts. See also http://altomator.github.io/EN-data_mining
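To illustrate the dictionary-word ratio mentioned for alto-words above, here is a minimal Python sketch. It is not taken from that repository. It assumes that recognized words are stored in the `CONTENT` attribute of ALTO `String` elements; the function names `alto_words` and `dictionary_ratio`, the sample document, and the tiny word list are all hypothetical and chosen for illustration only.

```python
import re
import xml.etree.ElementTree as ET


def alto_words(alto_xml):
    """Return the CONTENT value of every String element in an ALTO document."""
    root = ET.fromstring(alto_xml)
    words = []
    for elem in root.iter():
        # Tags look like '{namespace}String'; drop the namespace part so the
        # same code works regardless of the ALTO schema version.
        if elem.tag.rsplit("}", 1)[-1] == "String":
            content = elem.get("CONTENT")
            if content:
                words.append(content)
    return words


def dictionary_ratio(words, dictionary):
    """Return the share of words (punctuation stripped, lowercased) found in dictionary."""
    tokens = [re.sub(r"\W", "", w).lower() for w in words]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in dictionary)
    return hits / len(tokens)


# Hypothetical sample: one text line with a single OCR error ("qvick").
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="The"/><SP/><String CONTENT="qvick"/><SP/>
    <String CONTENT="brown"/><SP/><String CONTENT="fox."/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""

# Hypothetical word list; a real run would load a full dictionary file.
DICTIONARY = {"the", "quick", "brown", "fox"}

print(dictionary_ratio(alto_words(SAMPLE), DICTIONARY))  # 0.75
```

A low ratio on a page can hint at poor OCR quality, although the result depends heavily on the dictionary used and on the language and period of the source material.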