To compute TER, first install python-Levenshtein: pip install python-Levenshtein
- apply_bpe.sh : tokenise the English corpus and apply BPE to both the English and Tamil corpora
- preprocess.sh : binarise the datasets for fairseq
- train.sh : run training
- generate.sh : generate translations and compute the BLEU score
- compute_ter.py : generate translations and compute the TER score
Example translations can be found in generation_results/EP/EP and generation_results/EP/WMT.
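
For readers unfamiliar with TER, the sketch below shows roughly what a TER-style score measures: the number of word-level edits needed to turn a hypothesis into its reference, divided by the number of reference words. It is not the code in compute_ter.py. The function names `word_edit_distance` and `approx_ter` are hypothetical, the edit distance is written in plain Python rather than with python-Levenshtein, and the block-shift operation of full TER is left out, so scores may differ from those the script reports.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j reference tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis word
                            curr[j - 1] + 1,      # add a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def approx_ter(hypotheses, references):
    """Corpus-level approximation of TER (assumption: no shift edits)."""
    edits = 0
    ref_words = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        edits += word_edit_distance(hyp_tokens, ref_tokens)
        ref_words += len(ref_tokens)
    return edits / ref_words if ref_words else 0.0


if __name__ == "__main__":
    hyps = ["the cat sat on mat"]
    refs = ["the cat sat on the mat"]
    print(f"approximate TER: {approx_ter(hyps, refs):.3f}")
```

Lower is better: a score of 0 means every hypothesis matches its reference exactly, and the score can exceed 1 when a hypothesis is much longer than its reference.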