# Summary

The Telugu UD treebank consists of sentences from a grammar book, manually annotated following UD guidelines.

# Introduction

The Telugu UD treebank consists of 1328 sentences (6465 tokens) and its domain is grammar book examples from Modern Telugu Grammar (Krishnamurti and Gwynn 1985). The treebank is licensed under the terms of CC BY-NC-SA 3.0. The sentences are manually annotated following UD guidelines.

# Acknowledgments

Taraka Rama (University of Oslo, Norway) and Sowmya Vajjala (Iowa State University, USA) manually annotated the sentences. Çağrı Çöltekin (University of Tuebingen, Germany) helped with setting up and hosting the server for the annotation interface. Dan Zeman (Charles University, Czech Republic) did the Roman transliteration.

## References
* Bhadriraju Krishnamurti and J. P. L. Gwynn. 1985. A Grammar of Modern Telugu. Oxford: Oxford University Press. xxii+459pp.

# Domains and Data Split


# Basic Stats

* Tree count: 1328
* Word count: 6465
* Token count: 6465
* Dep. relations: 42, of which 11 are language-specific
* POS tags: 14
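
Counts like the ones above can be recomputed from CoNLL-U data. The following is a minimal sketch, not the script used to produce the official figures. The function name `conllu_stats`, the toy English sample, and the rule that a relation containing `:` counts as language-specific are all assumptions made for illustration.

```python
from collections import Counter


def conllu_stats(text):
    """Count trees, words, tokens, UPOS tags and relations in CoNLL-U text."""
    trees = words = tokens = 0
    covered_until = 0      # last word ID covered by the current multiword token
    in_sentence = False
    upos, deprels = Counter(), Counter()

    for raw in text.splitlines():
        line = raw.strip()
        if not line:                       # blank line ends a sentence
            in_sentence = False
            covered_until = 0
            continue
        if line.startswith("#"):           # sentence-level comment
            continue
        cols = line.split("\t")
        if len(cols) != 10:                # not a well-formed token line
            continue
        if not in_sentence:
            trees += 1
            in_sentence = True
        tok_id = cols[0]
        if "." in tok_id:                  # empty node (enhanced dependencies)
            continue
        if "-" in tok_id:                  # multiword token range, e.g. "1-2"
            tokens += 1
            covered_until = int(tok_id.split("-")[1])
            continue
        words += 1
        if int(tok_id) > covered_until:    # word is its own surface token
            tokens += 1
        upos[cols[3]] += 1
        deprels[cols[7]] += 1

    return {
        "trees": trees,
        "words": words,
        "tokens": tokens,
        "upos_tags": len(upos),
        "relations": len(deprels),
        # Assumption: subtyped relations (universal:extension) are the
        # language-specific ones.
        "language_specific": sum(1 for rel in deprels if ":" in rel),
    }


# Made-up English toy data, NOT taken from the treebank.
ROWS = [
    ["1", "My", "my", "PRON", "_", "_", "2", "nmod:poss", "_", "_"],
    ["2", "dog", "dog", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "barked", "bark", "VERB", "_", "_", "0", "root", "_", "_"],
    ["4", ".", ".", "PUNCT", "_", "_", "3", "punct", "_", "_"],
    [],
    ["1", "Go", "go", "VERB", "_", "_", "0", "root", "_", "_"],
    ["2", "!", "!", "PUNCT", "_", "_", "1", "punct", "_", "_"],
    [],
]
SAMPLE = "\n".join("\t".join(row) for row in ROWS)

stats = conllu_stats(SAMPLE)
print(stats)
```

To apply this to the treebank itself, read the contents of a `.conllu` file into a string and pass it to `conllu_stats`.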

# Changelog

* 2018-04-15 v2.2
  * Repository renamed from UD_Telugu to UD_Telugu-MTG.

=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.1
Includes text: yes
License: CC BY-SA 4.0
Genre: grammar-examples
Lemmas: manual native
UPOS: manual native
XPOS: not available
Features: not available
Relations: manual native
Contributors: Rama, Taraka; Vajjala, Sowmya
Contributing: here
Contact: tarakark@ifi.uio.no, sowmya@iastate.edu
===============================================================================