NAACL 2024: Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-Lingual Self-Distillation
Welcome to the official implementation of the paper "Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-Lingual Self-Distillation" presented at NAACL 2024.
This repository demonstrates our proposed approach for addressing language-level performance disparity in multilingual Pretrained Language Models (mPLMs). We introduce ALSACE, which combines Teacher Language Selection and Cross-Lingual Self-Distillation to mitigate this issue.
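To illustrate the idea, below is a minimal PyTorch-style sketch of how teacher language selection and a cross-lingual distillation (consistency) loss might look. It assumes a Hugging Face-style classifier, parallel inputs keyed by language code, and per-language dev scores; all function names, inputs, and the selection criterion are illustrative and not the repository's actual API.

```python
# Minimal, illustrative sketch of the two components (not the repository's API).
# Assumes: `model` is a Hugging Face-style mPLM classifier, `parallel_batch` maps
# language codes to tokenized parallel inputs, and `dev_scores` holds per-language
# dev accuracy (all of these are hypothetical names).
import torch
import torch.nn.functional as F

def select_teacher_languages(dev_scores: dict, top_k: int = 3) -> list:
    """Pick the top-k best-performing languages as teachers (one possible criterion)."""
    ranked = sorted(dev_scores, key=dev_scores.get, reverse=True)
    return ranked[:top_k]

def cross_lingual_self_distillation_loss(model, parallel_batch: dict, teachers: list):
    """KL-based consistency loss pulling student-language predictions
    toward the averaged teacher-language predictions on parallel inputs."""
    logits = {lang: model(**inputs).logits for lang, inputs in parallel_batch.items()}
    # Teacher distribution: average of teacher-language predictions, no gradient.
    with torch.no_grad():
        teacher_probs = torch.stack(
            [F.softmax(logits[lang], dim=-1) for lang in teachers]
        ).mean(dim=0)
    loss = 0.0
    students = [lang for lang in parallel_batch if lang not in teachers]
    for lang in students:
        student_log_probs = F.log_softmax(logits[lang], dim=-1)
        loss = loss + F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    return loss / max(len(students), 1)
```

In this sketch, the distillation loss would be added to the regular task loss during fine-tuning so that weaker languages are pulled toward the teacher languages' predictions on parallel data.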
With this code, you can:
- Train the mPLM with the ALSACE method
- Evaluate the language-level performance disparity of the mPLM
The requirements to use this code include Python 3.6+, PyTorch 1.0+, and other common packages listed in requirements.txt.
To train the model with ALSACE, please follow the instructions below (will be available soon).
To evaluate the language-level performance disparity of any mPLM, please refer to the instructions and scripts below (will be available soon).
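As a rough illustration of what language-level performance disparity can mean in practice, the sketch below summarizes per-language accuracy with simple spread statistics; the function, the metric names, and the example numbers are hypothetical and not the repository's evaluation scripts.

```python
# Illustrative way to quantify language-level performance disparity from
# per-language accuracy scores (hypothetical helper, not the repo's script).
from statistics import mean, stdev

def performance_disparity(per_language_accuracy: dict) -> dict:
    """Summarize how unevenly an mPLM performs across languages."""
    scores = list(per_language_accuracy.values())
    return {
        "average": mean(scores),
        "gap": max(scores) - min(scores),          # best-vs-worst language gap
        "std_dev": stdev(scores) if len(scores) > 1 else 0.0,
    }

# Example with made-up numbers for a multilingual benchmark such as XNLI.
print(performance_disparity({"en": 0.85, "de": 0.81, "sw": 0.68, "ur": 0.66}))
```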
If you use our work, please cite our paper (the BibTeX citation will be available soon).
For any questions or suggestions, please submit an issue in this repository.
This is the official implementation, but it may not be free of bugs. Please use it responsibly.