St. Petersburg Corpus of Hagiographic Texts

Old Church Slavic corpus

http://project.phil.spbu.ru/scat/page.php?page=project

Parser

Run the parser on an XML file to get the file's entire text:

./tei_parser.py xml/Aleksandr_svirskij.xml
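The README does not show how `tei_parser.py` works internally. As a rough illustration only, the sketch below shows one way the full text of a TEI-like XML document could be pulled out with Python's standard library. The function names `local_name` and `extract_text`, the sample markup, and the assumption that the running text sits inside a `<text>` element are all hypothetical and may not match the actual script or corpus files.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def extract_text(xml_source):
    """Return the whitespace-normalized text of a TEI-like XML string."""
    root = ET.fromstring(xml_source)
    # Assumption: the running text lives in a <text> element, as in TEI;
    # if no such element exists, fall back to the whole document.
    body = next((el for el in root.iter() if local_name(el.tag) == "text"), root)
    # itertext() yields the text of the element and all its descendants in
    # document order; split/join collapses runs of whitespace and newlines.
    return " ".join("".join(body.itertext()).split())


# Hypothetical sample, not taken from the corpus.
sample = (
    "<TEI><teiHeader><title>Sample</title></teiHeader>"
    "<text><body><p>Житие <name>Александра</name> Свирского.</p></body></text></TEI>"
)
print(extract_text(sample))
```

Matching on the local tag name keeps the sketch working whether or not the files declare the TEI namespace.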

TODO:

  • return text sentence by sentence
  • return text clause by clause
  • keep info about named entities (<name> tag)
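None of these items is implemented yet, so the following is only a possible starting point, not the project's method. It sketches a naive sentence splitter and a collector for `<name>` elements. The helper names `split_sentences` and `collect_names` are hypothetical, and splitting on `.`, `!` and `?` is an assumption that may not suit the punctuation of the corpus texts.

```python
import re
import xml.etree.ElementTree as ET


def split_sentences(text):
    """Naively split text on whitespace that follows '.', '!' or '?'."""
    # Assumption: sentences end with modern punctuation marks.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def collect_names(xml_source):
    """Return the text of every <name> element, in document order."""
    root = ET.fromstring(xml_source)
    return [
        "".join(el.itertext()).strip()
        for el in root.iter()
        # Compare local tag names so namespaced documents also match.
        if el.tag.rsplit("}", 1)[-1] == "name"
    ]


# Hypothetical sample, not taken from the corpus.
fragment = "<p>Житие <name>Александра</name> Свирского. Конец.</p>"
print(collect_names(fragment))
print(split_sentences("".join(ET.fromstring(fragment).itertext())))
```

Clause-level splitting would need rules specific to the corpus markup and is left out of this sketch.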