This repository contains the IndoWordnet data linked with English Wordnet, published at the Language Resources and Evaluation Conference (LREC) in 2018. The paper is available here and here.
IndoWordnet can be accessed online via this URL.
We acknowledge the lexicographers from the CFILT lab who created this data by manually linking English and Hindi Wordnet synsets, along with the engineers and researchers who enabled the data curation.
- Version 1.0: IWN-EN release with Assamese, Bodo, Kashmiri, Konkani, Manipuri, Marathi, Nepali, Oriya, and Sanskrit synsets. All IWN synset linkages are now present here (to English Wordnet).
- Version 0.5: IWN-EN release with Hindi, Bengali, Gujarati, Kannada, Malayalam, Punjabi, Tamil, Telugu and Urdu Wordnet synsets.
- Version 0.0.1: IWN-EN initial release with Hindi Wordnet and English Wordnet mapping.
The raw dataset files are also available in this Git repository under the data folder.
Diptesh Kanojia
Shivam Mhaskar
Diptesh Kanojia, Kevin Patel, and Pushpak Bhattacharyya. 2018. Indian Language Wordnets and their Linkages with Princeton WordNet. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
@inproceedings{kanojia-etal-2018-indian,
title = "{I}ndian {L}anguage {W}ordnets and their {L}inkages with {P}rinceton {W}ord{N}et",
author = "Kanojia, Diptesh and
Patel, Kevin and
Bhattacharyya, Pushpak",
booktitle = "Proceedings of the Eleventh International Conference on Language Resources and Evaluation ({LREC} 2018)",
month = may,
year = "2018",
address = "Miyazaki, Japan",
publisher = "European Language Resources Association (ELRA)",
url = "https://aclanthology.org/L18-1728",
}