mdingemanse/text-to-talk

From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology

Mark Dingemanse, Andreas Liesenfeld

Abstract: Informal social interaction is the primordial home of human language. Linguistically diverse conversational corpora are an important and largely untapped resource for computational linguistics and language technology. Through the efforts of a worldwide language documentation movement, such corpora are increasingly becoming available. We show how interactional data from 63 languages (26 families) harbours insights about turn-taking, timing, sequential structure and social action, with implications for language technology, natural language understanding, and the design of conversational interfaces. Harnessing linguistically diverse conversational corpora will provide the empirical foundations for flexible, localizable, humane language technologies of the future.

Status: Accepted by ACL 2022.

@inproceedings{dingemanse-liesenfeld-2022-TextToTalk,
    title = "From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology",
    author = "Dingemanse, Mark and
      Liesenfeld, Andreas",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics",
    month = may,
    year = "2022",
    address = "Online",
    publisher = "Association for Computational Linguistics",
}

About

Repository for 'From text to talk' position paper