jo-valer/machine-translation-ladin-fascian

Machine Translation for Fassa Ladin

Overview

This repo contains my final project for Applied Natural Language Processing (University of Trento, academic year 2023/24).

Introduction

I construct the first Fassa Ladin-Italian-English parallel corpus and train a machine translation model on it. More information can be found in the accompanying report.

🛠 English/Italian → Ladin demo

You can try translating text from English/Italian to Fassa Ladin using the model on Hugging Face Spaces 🦀

Data

The dataset draws on multiple resources from five domains: literature, news, games, laws, and brochures. It is available in the data directory, either as a single file or split into train, validation, in-domain test, and out-of-domain test sets.
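To make the difference between the in-domain and out-of-domain test sets concrete, here is a minimal sketch of how such a four-way split could be produced. It is an illustration only, not the procedure used to build the files in the data directory: the function name `split_corpus`, the record layout, the held-out domain, and the split fractions are all assumptions, and the sentences are placeholders rather than real corpus text.

```python
import random

# The five domains named above.
DOMAINS = ["literature", "news", "games", "laws", "brochures"]


def split_corpus(examples, held_out_domain, val_frac=0.1, test_frac=0.1, seed=0):
    """Split records into train / validation / in-domain test / out-of-domain test.

    Hypothetical helper: one whole domain is held out as the out-of-domain
    test set, and the remaining records are shuffled and divided.
    """
    in_domain = [ex for ex in examples if ex["domain"] != held_out_domain]
    out_of_domain = [ex for ex in examples if ex["domain"] == held_out_domain]

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(in_domain)

    n_val = int(len(in_domain) * val_frac)
    n_test = int(len(in_domain) * test_frac)
    return {
        "validation": in_domain[:n_val],
        "test_in_domain": in_domain[n_val:n_val + n_test],
        "train": in_domain[n_val + n_test:],
        "test_out_of_domain": out_of_domain,
    }


# Placeholder records: 20 per domain, with dummy strings instead of real text.
toy_examples = [
    {"domain": d, "lld": f"lld-{d}-{i}", "it": f"it-{d}-{i}", "en": f"en-{d}-{i}"}
    for d in DOMAINS
    for i in range(20)
]

# Assumption for the example only: "laws" is the held-out domain.
splits = split_corpus(toy_examples, held_out_domain="laws")
for name, rows in splits.items():
    print(name, len(rows))
```

Holding out a whole domain means the out-of-domain test set measures how well a model generalizes to text types it never saw during training, while the in-domain test set shares its domains with the training data.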

Experiments

Preliminary Experiments

Open In Colab

Evaluate the performance of the pre-trained models.

Finetuning

Open In Colab

Fine-tune the pre-trained models on the Fassa Ladin-Italian-English parallel corpus using two approaches: pivot-based transfer learning and multilingual translation.
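As a rough illustration of the multilingual approach, the sketch below turns trilingual records into source/target training pairs for both Italian → Ladin and English → Ladin, so a single model can be trained on both directions. Everything here is an assumption made for the example: the helper `make_multilingual_pairs`, the record keys, and the `>>lld<<` target-language tag (a convention used by some multilingual models) are not taken from the notebook, and the sentences are placeholders.

```python
def make_multilingual_pairs(triples, target="lld", sources=("it", "en")):
    """Build tagged source/target pairs from trilingual records.

    Hypothetical helper: each source sentence is prefixed with a
    target-language tag so one model can serve several source languages.
    """
    pairs = []
    for triple in triples:
        for src in sources:
            pairs.append({
                "source_lang": src,
                "source": f">>{target}<< {triple[src]}",
                "target": triple[target],
            })
    return pairs


# Placeholder records standing in for aligned Ladin / Italian / English sentences.
toy_triples = [
    {"lld": "<Ladin sentence 1>", "it": "<Italian sentence 1>", "en": "<English sentence 1>"},
    {"lld": "<Ladin sentence 2>", "it": "<Italian sentence 2>", "en": "<English sentence 2>"},
]

pairs = make_multilingual_pairs(toy_triples)
for pair in pairs:
    print(pair["source"], "=>", pair["target"])
```

Each trilingual record yields two training pairs that share the same Ladin target, which is one way a small corpus can be used in both source languages at once.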

Evaluation

Open In Colab

Evaluate the models' performance, and investigate transfer learning across domains and forgetting of previously acquired knowledge.

About

Final project for ANLP 2023/24, Giovanni Valer
