
Tü-CL at SIGMORPHON 2023: Straight-Through Gradient Estimation for Hard Attention

Introduction

This repository contains code accompanying our system description paper Tü-CL at SIGMORPHON 2023: Straight-Through Gradient Estimation for Hard Attention for the Interlinear Glossing shared task. Note that the code for the inflection model can be found in a separate repository.

The implementation of the joint segmentation and glossing model is in morpheme_model.py. The segmentation model is in morpheme_segmenter.py.
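The core technique named in the paper's title, straight-through gradient estimation for hard attention, can be sketched generically. The following is a minimal PyTorch illustration, not the repository's actual implementation: the forward pass discretizes attention scores to a hard one-hot choice, while the backward pass lets gradients flow through the soft distribution.

import torch

def straight_through_argmax(scores: torch.Tensor) -> torch.Tensor:
    # Soft attention distribution over positions
    soft = torch.softmax(scores, dim=-1)
    # Hard one-hot attention: argmax of the soft distribution
    index = soft.argmax(dim=-1, keepdim=True)
    hard = torch.zeros_like(soft).scatter_(-1, index, 1.0)
    # Forward pass returns `hard`; backward pass sees the gradient of `soft`
    return hard + soft - soft.detach()

scores = torch.randn(2, 5, requires_grad=True)
attention = straight_through_argmax(scores)
attention.sum().backward()  # gradients reach `scores` despite the hard argmax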

Setup

Create a virtual environment, e.g. by using Anaconda:

conda create -n glossing python=3.9 pip

Activate the environment:

conda activate glossing

Then, install the dependencies in requirements.txt:

pip install -r requirements.txt

Finally, place the shared task data, which can be obtained from the shared task's main repository, in a folder called data at the root of this repository.
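For orientation, the repository then looks roughly like this (the subfolder names below are illustrative; the authoritative layout is defined by the shared task repository):

data/
    Gitksan/
    Lezgi/
    Tsez/
    ...
main.py
requirements.txt
...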

Train a model

To train a single model and get predictions for the corresponding test set, run

python main.py --language LANGUAGE --model MODEL --track TRACK

Only languages in the shared task dataset are supported. MODEL can be ctc for the BiLSTM baseline trained with CTC loss, or morph for the joint segmentation and glossing model. TRACK can be 1 for the closed track or 2 for the open track. For further hyperparameters, run

python main.py --help
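As a concrete example, a training call for the joint model on the closed track might look like this (the language identifier is illustrative and must match the dataset's naming):

python main.py --language Gitksan --model morph --track 1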

Hyperparameter tuning

To search for the best hyperparameters, use the script hyperparameter_tuning.py:

python hyperparameter_tuning.py \
    --language LANGUAGE \
    --model MODEL \
    --track TRACK \
    --trials TRIALS

TRIALS specifies the number of hyperparameter combinations to evaluate. We used 50 trials to obtain the hyperparameters provided in the file best_hyperparameters.json, which is included in this repository.
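For example (again with an illustrative language identifier):

python hyperparameter_tuning.py \
    --language Gitksan \
    --model morph \
    --track 1 \
    --trials 50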

To retrain all models with our best hyperparameters, run

python retrain_best_hyperparameters.py

To obtain predictions for the test data from the trained models, run

python predict_from_model.py

Citation

If you use this code, consider citing our paper:

@inproceedings{girrbach-2023-tu,
    title = {T{\"u}-{CL} at {SIGMORPHON} 2023: Straight-Through Gradient Estimation for Hard Attention},
    author = "Girrbach, Leander",
    booktitle = "Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.sigmorphon-1.17",
    pages = "151--165",
}
