Releases: KathyReid/cvaccents

0.3 - Updated with Kiswahili example for EAAMO

11 May 07:54
e084eff

This release applies the toolset to the Mozilla Common Voice v13 Kiswahili dataset, as an example of applying it to a language other than English.

0.2 - Updated with v13 Mozilla Common Voice data

28 Mar 10:42
4abd602

The key changes in this version are:

  • The number of categories identified in the data has increased from 16 in the first version to 20 in this one. The four additional categories are:

    • Linguistic heritage of speaker - indicating the speaker's language acquisition or immersion heritage, such as time spent in a location, or being born or raised in a location.
    • Socio-economic marker - indicating a speaker's association with a socio-economic group or class, such as Middle Class.
    • Hybrid dialect - indicating the speaker uses a dialect that arises where two languages have come into contact, such as Denglish (German, or Deutsch, and English) and Hinglish (Hindi and English, spoken in India).
    • Generational marker - indicating the speaker's association with a generation, and so implying their age range, such as Gen Z.
  • The number of individual accents identified has increased from 164 in the first version to 235 in this one.

  • The number of relationships between individual accents, each of which indicates a co-occurrence of speaker-described accents such as "German" and "England English", has increased from 297 in the first version to 515 in this one.
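
These notes do not describe how the repository derives its accent and relationship counts. As a rough illustration only, the sketch below assumes each speaker's self-described accents are available as a list of strings, and treats every unordered pair of accents given by the same speaker as one relationship. The function name `cooccurrence_counts` and the sample data are hypothetical and are not taken from the toolset or from Common Voice.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(speaker_accents):
    """Count how often each unordered pair of accents is given by the same speaker."""
    pair_counts = Counter()
    for accents in speaker_accents:
        # De-duplicate and sort so that ("German", "England English") and
        # ("England English", "German") are counted under a single key.
        unique = sorted(set(accents))
        for pair in combinations(unique, 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical, hand-written sample; not real Common Voice data.
sample = [
    ["German", "England English"],
    ["England English", "German"],
    ["United States English"],
    ["German", "England English", "United States English"],
]

counts = cooccurrence_counts(sample)

# Distinct accents correspond to the "individual accents" figure;
# distinct pairs correspond to the "relationships" figure.
num_accents = len({accent for accents in sample for accent in accents})
num_relationships = len(counts)
```

Under these assumptions, a speaker who gives only one accent contributes to the accent count but adds no relationship, which is one reason the two figures can grow at different rates between dataset versions.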

0.1 - Initial release - FAccT 2023

17 Mar 02:24
71412bd

This release tags the repository in the state in which it was submitted to the FAccT 2023 conference - please see this preprint.

  • Analyses only the English (en) language data
  • Uses v11 of the Common Voice dataset

Full Changelog: https://github.com/KathyReid/cvaccents/commits/0.1