Multilingual BERT for Textual Entailment in Vietnamese and English

Introduction

This model is designed to address the Natural Language Inference (NLI) task using a pre-trained multilingual model. The goal is to predict the semantic relationship between two sentences (synonymous, antonymous, or semantically unrelated).

Requirements

Python 3.6+
TensorFlow 2.0+
Transformers library
Sentencepiece library

Dataset

The datasets contain two columns for sentence pairs (sentence_1, sentence_2) and one column for the label (label). sentence_1 can be an English or Vietnamese sentence, sentence_2 is a Vietnamese sentence, and label has three values: "agree", "disagree", and "neutral".

Model

I use the infoXLM-Large model from Hugging Face. The model is fine-tuned for the NLI task using TensorFlow Keras with additional layers for classification.

Experiments

To find the most suitable model for the task, I experimented with various multilingual models: mBERT, XLM-R, and infoXLM with their variations. The table below shows the experimental results with these models.

Model	Train acc	Val acc	Test acc
mBERT	0.95	0.86	0.85
XLM-RoBERTa-base	0.96	0.9	0.9
XLM-RoBERTa-large	0.99	0.95	0.96
InfoXLM-base	0.85	0.88	0.87
InfoXLM-large	0.99	0.96	0.96

The two models, XLM-R-large and InfoXLM-large, yielded nearly identical results. Ultimately, I chose the InfoXLM-large model to address this task.

Acknowledgements

Thank you to the authors of the mBERT, XLM-R, and infoXLM models.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
data		data
README.md		README.md
infoXLM-roberta-large.ipynb		infoXLM-roberta-large.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multilingual BERT for Textual Entailment in Vietnamese and English

About

Releases

Packages

Languages

sy-nguyen/MBTE-ViEn

Folders and files

Latest commit

History

Repository files navigation

Multilingual BERT for Textual Entailment in Vietnamese and English

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages