
LSTM encoder-decoder sequence-to-sequence models for Icelandic


This directory contains LSTM encoder-decoder sequence-to-sequence models trained for Icelandic grapheme-to-phoneme (g2p) conversion. The models were trained using the baseline for the SIGMORPHON 2020 shared task on multilingual g2p, with manually transcribed training data of ~5,800 words per pronunciation variant.

See code for training and evaluation: https://github.com/sigmorphon/2020/tree/master/task1

Reference paper: Gorman, Kyle et al. (2020): The SIGMORPHON 2020 Shared Task on Multilingual Grapheme-to-Phoneme Conversion. In: Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology (https://www.aclweb.org/anthology/2020.sigmorphon-1.2/)
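To make the training-data description above concrete, the following is a minimal sketch of how SIGMORPHON-style g2p data is turned into fairseq input. It assumes the usual shared-task layout (a TSV file pairing each word with a space-separated phoneme transcription); the example word and its transcription are purely illustrative, not taken from the actual training data.

```python
def word_to_source(word: str) -> str:
    """Split a word into space-separated graphemes for the fairseq source side."""
    return " ".join(word)

def parse_tsv_line(line: str) -> tuple[str, str]:
    """Turn one 'word<TAB>phonemes' TSV entry into (source, target) fairseq lines.

    The target side is already space-separated phoneme symbols, so it is
    passed through unchanged; only the source word needs splitting.
    """
    word, phonemes = line.rstrip("\n").split("\t")
    return word_to_source(word), phonemes

# Illustrative entry (the transcription here is a placeholder, not real data):
src, tgt = parse_tsv_line("hestur\th E s t Y r")
print(src)  # h e s t u r
print(tgt)  # h E s t Y r
```

Character-level splitting like this is what lets a sequence-to-sequence model trained for translation operate on grapheme and phoneme symbols instead of words.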

Fairseq setup

  1. With Conda:

conda is recommended for a reproducible environment. Once you have conda installed, create a new environment by running:

conda env create -f environment.yml

The new environment is called "fairseq-lstm". Activate it by running:

conda activate fairseq-lstm

  2. Clone Fairseq and install it, see: https://github.com/pytorch/fairseq
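Once fairseq is installed, a typical g2p workflow follows the SIGMORPHON 2020 baseline: binarize the data, train an LSTM, and decode. The sketch below only echoes the commands (so it can be inspected and run without fairseq or data present); the directory and file names are assumptions for illustration, not paths from this repository.

```shell
DATA=data-bin/g2p-is     # hypothetical binarized-data directory

# 1. Binarize space-separated grapheme/phoneme files (train.g / train.p, etc.)
echo fairseq-preprocess --source-lang g --target-lang p \
  --trainpref train --validpref dev --testpref test --destdir "$DATA"

# 2. Train an LSTM encoder-decoder
echo fairseq-train "$DATA" --arch lstm --max-epoch 50 --save-dir checkpoints

# 3. Decode the test set with beam search
echo fairseq-generate "$DATA" --path checkpoints/checkpoint_best.pt --beam 5
```

Drop the leading `echo` from each line to actually run the pipeline; consult the fairseq documentation for the full set of training options.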

Troubleshooting & inquiries

This application is still in development. If you encounter any errors, feel free to open an issue in the issue tracker. You can also contact us via email.

Contributing

You can contribute to this project by forking it, creating a branch in your fork, and opening a pull request.
