
TextWorlds

AUTHORS: Elena Mikhalkova, Timofei Protasov, Polina Gavin, Anastasiya Bashmakova, Anastasiya Drozdova, Alexander Zhmykhov. University of Tyumen, 2018-2023

This repository contains data and processing files of our project "Text Worlds".

About the project

We design annotation schemes and create a corpus of annotated literary narratives based on the theory of text worlds. Text World Theory was developed in cognitive linguistics. A text world is a stretch of text characterized by the union of characters, time and space. Whenever there is a significant change in time and space, a new text world opens to hold all of its participants.

The theory makes it possible to study and model narrative structures computationally.

Annotation schemes

This repository contains XML files annotated with two annotation schemes.

Scheme 1 was used to annotate "The Gift of the Magi" (see the files whose names start with "Magi"). This scheme included direct annotation of text world elements (time, place and characters) and of shifts in time and space. Because such shifts are not always signalled lexically, we added automatic morphological markup of verb tenses, performed with nltk (e.g. "There was VBD clearly nothing to do VB"). The guidelines for this scheme are in the file https://github.com/evrog/TextWorlds/blob/master/Annotation_guidelines_scheme_1.md in this repository.
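The verb-tense markup in the example above can be illustrated with a minimal sketch. It assumes the input is a list of (token, tag) pairs with Penn Treebank tags, which is the shape returned by `nltk.pos_tag`. The function name `inline_verb_tags` and the rule "append the tag after every token whose tag starts with VB" are assumptions inferred from the single example sentence, not the project's actual processing code. The pairs are written by hand so that the sketch runs without downloading any nltk models.

```python
def inline_verb_tags(tagged_tokens):
    """Join tokens into a string, appending the POS tag after each verb.

    Hypothetical helper: the output format is inferred from the example
    "There was VBD clearly nothing to do VB".
    """
    parts = []
    for token, tag in tagged_tokens:
        parts.append(token)
        # Penn Treebank verb tags: VB, VBD, VBG, VBN, VBP, VBZ
        if tag.startswith("VB"):
            parts.append(tag)
    return " ".join(parts)


# Hand-written pairs in the shape that nltk.pos_tag returns (an assumption;
# real tagger output for this sentence may differ).
tagged = [
    ("There", "EX"),
    ("was", "VBD"),
    ("clearly", "RB"),
    ("nothing", "NN"),
    ("to", "TO"),
    ("do", "VB"),
]

print(inline_verb_tags(tagged))
```

With a working nltk installation, the hand-written list could be replaced by the result of tokenizing and tagging a sentence from the story.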

Scheme 2 was used in the rest of the files (all in Russian). This scheme separates two levels of annotation: that of the text world elements (characters, space and time) and that of shifts (switches between text worlds). The two levels were annotated separately, not necessarily by the same annotator. The Russian guidelines for this scheme are given here:

  1. Annotation of elements: https://docs.google.com/document/d/e/2PACX-1vRUhO_ab4AFz1lSIuqf_ZifZWZQZyQJeNR2FHv720vUTgHkAhr1lmtrduMSK8KeJA/pub
  2. Annotation of shifts: https://docs.google.com/document/d/e/2PACX-1vS2HPn-MjWZhMp-wF8sMH4uyq4Jjo0DQ0-eecR92XlRWd3Ph495KpYDyV9t66Dk3g/pub

The following table shows, for each title, how many texts were annotated with elements and how many with shifts under Scheme 2. Titles are abbreviated. See the extended description of text sources below.

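| Title | Elements | Shifts |
| --- | --- | --- |
| Потец | 1 | 1 |
| Попугай | 1 | 5 |
| Мирная | 1 | 1 |
| Лягушка | 1 | 1 |
| Лисичка | 6 | 0 |
| Лампа | 2 | 0 |
| Иванушка | 2 | 2 |
| Дочь | 5 | 3 |
| Волк | 4 | 1 |
| Умный | 1 | 0 |
| Они | 1 | 0 |
| Визит | 1 | 0 |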
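As a quick way to read the table, the sketch below copies its counts into a dictionary and computes the column totals and the titles that have no shift annotation yet. The names `COUNTS`, `totals` and `titles_without_shifts` are illustrative and are not part of the repository.

```python
# Counts copied from the table above: title -> (elements, shifts)
COUNTS = {
    "Потец": (1, 1),
    "Попугай": (1, 5),
    "Мирная": (1, 1),
    "Лягушка": (1, 1),
    "Лисичка": (6, 0),
    "Лампа": (2, 0),
    "Иванушка": (2, 2),
    "Дочь": (5, 3),
    "Волк": (4, 1),
    "Умный": (1, 0),
    "Они": (1, 0),
    "Визит": (1, 0),
}


def totals(counts):
    """Return the column sums as (elements, shifts)."""
    elements = sum(e for e, _ in counts.values())
    shifts = sum(s for _, s in counts.values())
    return elements, shifts


def titles_without_shifts(counts):
    """Return the titles whose shift count is zero, in table order."""
    return [title for title, (_, s) in counts.items() if s == 0]


print(totals(COUNTS))                 # (26, 14)
print(titles_without_shifts(COUNTS))  # ['Лисичка', 'Лампа', 'Умный', 'Они', 'Визит']
```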

Comparative tables

The following folder contains CSV tables that show which tags were assigned to which tokens and which tokens have the same tags across a text: https://drive.google.com/drive/folders/1tCaNhIF5noiLyD-cBx7fWpem4PCx2SND?usp=sharing

Sources of texts in XML-files

We also have a collection of annotations of "We Can Remember It for You Wholesale" by Philip K. Dick, made by 6 annotators in Russian and 2 in English, but due to copyright issues we cannot share it publicly.

Annotators

Elena Mikhalkova, Timofei Protasov, Polina Gavin, Anastasiya Bashmakova, Anastasiya Drozdova, Daria Bormova, Alexander Dmitrienko, Oleg Evseyev, Madina Alimtaeva, Anzhelika Gornnova, Anastasiya Kuzminykh

Our publications

  1. Mikhalkova, E., Protasov, T., Drozdova, A., Bashmakova, A., & Gavin, P. (2019). Towards Annotation of Text Worlds in a Literary Work. In Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference “Dialogue” (Компьютерная лингвистика и интеллектуальные технологии) (pp. 101-110).
  2. Mikhalkova, E., Protasov, T., Sokolova, P., Bashmakova, A., & Drozdova, A. (2020, May). Modelling Narrative Elements in a Short Story: A Study on Annotation Schemes and Guidelines. In Proceedings of the 12th Language Resources and Evaluation Conference (pp. 126-132).

Contacts

e.v.mikhalkova@utmn.ru
