Neural Natural Language Processing for Long Texts: A Survey of the State-of-the-Art

by Dimitrios Tsirmpas, Ioannis Gkionis, Ioannis Mademlis, Georgios Papadopoulos

This paper has been submitted to Engineering Applications of Artificial Intelligence.

Abstract

The adoption of Deep Neural Networks (DNNs) has greatly benefited Natural Language Processing (NLP) during the past decade. However, the demands of long document analysis are quite different from those of shorter texts, while the ever-increasing size of documents uploaded online renders automated understanding of lengthy texts a critical issue. Relevant applications include automated Web mining, legal document review, medical records analysis, financial reports analysis, contract management, environmental impact assessment, and news aggregation. Although efficient algorithms for analyzing long documents were developed only relatively recently, practical tools in this field are now flourishing. This article serves as an entry point into this dynamic domain and has two objectives. First, it provides an overview of the relevant neural building blocks, serving as a concise tutorial for the field. Second, it offers a brief examination of the current state-of-the-art in long document NLP, with a primary focus on two key tasks: document classification and document summarization. Sentiment analysis for long texts is also covered, since it is typically treated as a particular case of document classification. The article thus presents an introductory exploration of document-level analysis, addressing the primary challenges, concerns, and existing solutions. Finally, it lists publicly available annotated datasets that can facilitate further research in this area.


dimits-ts/Large-Text-NLP-Survey
