GlaucoLorenzut/low-resource-machine-translation

Low resource machine translation

Requirements

pytorch

sentencepiece

sacremoses

fairseq

To compute TER: pip install python-Levenshtein
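As a rough illustration of what a TER-style score measures, the sketch below computes a word-level edit distance between a hypothesis and a reference and divides it by the reference length. This is a simplified approximation written in plain Python: full TER also counts phrase shifts, and the function names `edit_distance` and `approx_ter` are hypothetical and are not taken from compute_ter.py, which may work differently.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j reference tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis token
                            curr[j - 1] + 1,      # insert a reference token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def approx_ter(hypothesis, reference):
    """Edits per reference token (assumption: no shift operations)."""
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(hyp, ref) / len(ref)
```

Lower is better: an identical hypothesis and reference score 0, and the score can exceed 1 when the hypothesis is much longer than the reference.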

Description

apply_bpe.sh: tokenise the English corpus and apply BPE to both the English and Tamil corpora.

preprocess.sh: binarise the datasets for fairseq.

train.sh: run training.

generate.sh: generate translations and compute the BLEU score.

compute_ter.py: generate translations and compute the TER score.
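To make the BPE step more concrete, here is a minimal, self-contained sketch of how byte-pair-encoding merges are learned from word frequencies. It is a toy illustration only, not the code used by apply_bpe.sh (which presumably relies on sentencepiece from the requirements list); the helper names `pair_counts`, `merge_pair`, and `learn_bpe` are hypothetical.

```python
from collections import Counter


def pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    counts = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Greedily learn up to `num_merges` merges from a {word: count} dict."""
    # Start from single characters as the symbol inventory.
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        counts = pair_counts(vocab)
        if not counts:
            break  # every word is already a single symbol
        best = max(counts, key=counts.get)
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab
```

For example, `learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 10)` returns the learned merge list together with the segmented vocabulary. Frequent character sequences become single subword units, which is what lets a small vocabulary cover rare words in a low-resource setting.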

Examples of translation can be found in generation_results/EP/EP and generation_results/EP/WMT
