cmroughan/subimg_augmentation

subimg_augmentation

This repository contains code for extracting polygonal segmentation data from ALTO XML files to use in subimage augmentation, as presented in "Evaluating Augmented Training Data for Complex Document Layouts: the Case of Arabic Scientific Manuscripts" (DH2024). The code is available both as a Python script (extract-regions.py) and a Jupyter notebook.
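As a rough illustration of what extracting polygonal regions from ALTO involves, here is a minimal sketch. It is not the code in extract-regions.py. It assumes that each region is a `TextBlock` element with a direct `Shape`/`Polygon` child, and that the `POINTS` attribute is a list of alternating x and y values separated by spaces or commas. The function names (`extract_region_polygons`, `parse_points`) and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace so the sketch works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def parse_points(points):
    """Turn a POINTS string into a list of (x, y) tuples.

    Assumption: coordinates alternate x, y and are separated by
    spaces and/or commas.
    """
    values = [float(v) for v in points.replace(",", " ").split()]
    return list(zip(values[0::2], values[1::2]))


def extract_region_polygons(alto_xml):
    """Return one dict per TextBlock that has its own polygon outline."""
    root = ET.fromstring(alto_xml)
    regions = []
    for block in root.iter():
        if local_name(block.tag) != "TextBlock":
            continue
        # Only look at the block's direct Shape child, so that the
        # polygons of nested TextLine elements are not picked up.
        for shape in block:
            if local_name(shape.tag) != "Shape":
                continue
            for poly in shape:
                if local_name(poly.tag) == "Polygon" and poly.get("POINTS"):
                    regions.append({
                        "id": block.get("ID"),
                        "tagrefs": block.get("TAGREFS"),
                        "polygon": parse_points(poly.get("POINTS")),
                    })
    return regions


# Hypothetical, heavily trimmed ALTO document used only for illustration.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace>
    <TextBlock ID="r1" TAGREFS="BT1">
      <Shape><Polygon POINTS="10 10 110 10 110 60 10 60"/></Shape>
      <TextLine ID="l1">
        <Shape><Polygon POINTS="12 12 100 12 100 30 12 30"/></Shape>
      </TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

regions = extract_region_polygons(SAMPLE)
for region in regions:
    print(region["id"], region["tagrefs"], region["polygon"])
```

The resulting polygons could then be used to mask and crop the corresponding page image with an imaging library of the user's choice. To read a file from disk instead of a string, `ET.parse(path).getroot()` can replace `ET.fromstring`.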

How the extracted regions are combined into artificial images is left to the user. A sample workflow that combines selected regions according to the SegmOnto ontology will be uploaded to this repository soon.
