STAPI (Section Title And Prose text Identifier) is a two-step system for labeling section titles and prose text in HTML documents. Our paper has been accepted for a poster presentation at LREC 2022.
See the Software directory for the source code of our training pipeline. We also created a web demo here.