If you use these data, please cite:
- the original source
Haspelmath, Martin & Tadmor, Uri (eds.) 2009. World Loanword Database. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://wold.clld.org)
- the derived dataset, using the DOI of the particular released version you used
This dataset is licensed under a CC-BY-4.0 license.
Available online at http://wold.clld.org
Conceptlists in Concepticon:
The World Loanword Database, edited by Martin Haspelmath and Uri Tadmor, is a scientific publication by the Max Planck Institute for Evolutionary Anthropology, Leipzig (2009).
It provides vocabularies (mini-dictionaries of about 1000-2000 entries) of 41 languages from around the world, with comprehensive information about the loanword status of each word. It allows users to find loanwords, source words and donor languages in each of the 41 languages, but also makes it easy to compare loanwords across languages.
Each vocabulary was contributed by an expert on the language and its history. An accompanying book has been published by De Gruyter Mouton (Loanwords in the World's Languages: A Comparative Handbook, edited by Martin Haspelmath & Uri Tadmor).
The World Loanword Database consists of vocabularies contributed by 41 different authors or author teams. When citing material from the database, please cite the corresponding vocabulary (or vocabularies).
The World Loanword Database is the result of a collaborative project coordinated by Uri Tadmor and Martin Haspelmath between 2004 and 2008, called the Loanword Typology Project (LWT). Most of the contributors took part in workshops at which the procedures for selecting and annotating words were discussed extensively. The list of 1460 meanings on which the vocabularies are based is called the Loanword Typology meaning list, and it is in turn based on the list of the Intercontinental Dictionary Series.
- Varieties: 41 (linked to 41 different Glottocodes)
- Concepts: 1,814 (linked to 1,458 different Concepticon concept sets)
- Lexemes: 64,289
- Sources: 41
- Synonymy: 1.20
- Invalid lexemes: 0
- Tokens: 365,462
- Segments: 631 (0 BIPA errors, 0 CLTS sound class errors, 626 CLTS modified)
- Inventory size (avg): 54.68
Name | GitHub user | Description | Role |
---|---|---|---|
Tiago Tresoldi | @tresoldi | patron, maintainer, orthographic profiles | Other |
Robert Forkel | @xrotwang | code | Editor |
Johann-Mattis List | @LinguList | code, profile | Editor |
Natalia Morozova | @natalia-morozova | orthographic profiles | Other |
Martin Haspelmath | | publication editor | Author |
Uri Tadmor | | publication editor | Author |
The following CLDF datasets are available in cldf:
- CLDF Wordlist at cldf/cldf-metadata.json
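
As a minimal sketch of how the metadata file listed above can be inspected, the following Python snippet reads a CLDF metadata file with the standard library only and lists the tables it declares. It assumes the usual CLDF layout, in which the JSON file has a top-level `tables` array whose entries carry a `url` and, for standard components, a `dc:conformsTo` term URI. The function name `list_cldf_tables` is hypothetical and not part of this dataset or of any library.

```python
import json
from pathlib import Path


def list_cldf_tables(metadata_path):
    """Return (component, url) pairs for each table declared in a CLDF metadata file.

    Hypothetical helper for illustration; it only reads the JSON description
    and does not open or validate the CSV files themselves.
    """
    metadata = json.loads(Path(metadata_path).read_text(encoding="utf-8"))
    tables = []
    for table in metadata.get("tables", []):
        # dc:conformsTo typically holds a term URI such as ".../terms.rdf#FormTable".
        # Keep the part after '#' when there is one; use None for untyped tables.
        conforms_to = table.get("dc:conformsTo") or ""
        component = conforms_to.rpartition("#")[2] or None
        tables.append((component, table.get("url")))
    return tables
```

Calling `list_cldf_tables("cldf/cldf-metadata.json")` from the repository root should then yield one pair per table. For actual analysis, a dedicated CLDF reader such as the `pycldf` package is likely the better choice, since it also interprets column types and foreign keys.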