Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
This is the official CERBERUS model code repository for our long paper in Findings of EMNLP 2022, "Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems".
[Paper] [Amazon Science] [Preprint]
@inproceedings{matsubara2022ensemble,
title={{Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems}},
author={Matsubara, Yoshitomo and Soldaini, Luca and Lind, Eric and Moschitti, Alessandro},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2022},
pages={7259--7272},
year={2022}
}
Our CERBERUS implementation is based on transformers.ElectraForSequenceClassification and was tested under the following conditions (a minimal environment check is sketched after the list):
- Python 3.6 - 3.7
- torch==1.6.0
- transformers==3.0.2
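As a quick sanity check that your environment matches the tested versions above, the following sketch uses only standard version attributes:

import sys
import torch
import transformers

# Confirm the environment matches the versions this code was tested with.
assert (3, 6) <= sys.version_info[:2] <= (3, 7), 'tested with Python 3.6 - 3.7'
print('torch:', torch.__version__)                # tested with 1.6.0
print('transformers:', transformers.__version__)  # tested with 3.0.2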
This first CERBERUS model consists of 11 shared encoder body layers and 3 single-layer ranking heads, each learned from one of 3 teacher AS2 models: ALBERT-XXLarge, ELECTRA-Large, and RoBERTa-Large, all fine-tuned on the ASNQ dataset.
Download and unzip cerberus11-3_albert_electra_roberta_asnq.zip (an unzip sketch in Python follows).
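A standard-library way to unpack the archive (the same approach works for the other checkpoints below):

import zipfile

# Extract the downloaded checkpoint archive into the working directory.
with zipfile.ZipFile('cerberus11-3_albert_electra_roberta_asnq.zip') as zf:
    zf.extractall('.')

With the checkpoint unpacked, load the model: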
from transformers import AutoTokenizer
from cerberus import CerberusModel

# The shared encoder body uses the ELECTRA-Base discriminator tokenizer.
tokenizer = AutoTokenizer.from_pretrained('google/electra-base-discriminator')

# Load the CERBERUS checkpoint (11 shared body layers, 3 ranking heads).
start_ckpt_file_path = './cerberus11-3_albert_electra_roberta_asnq/cerberus_model.pt'
model = CerberusModel(None, 11, start_ckpt_file_path=start_ckpt_file_path)
model.eval()

# Encode a (question, answer sentence) pair for AS2 scoring.
input_dict = tokenizer([('question', 'answer sentence')],
                       return_tensors='pt',
                       max_length=128,
                       truncation=True)
output = model(**input_dict)
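The structure of output depends on the CerberusModel implementation in this repository. As a hypothetical sketch, assuming it holds per-head binary classification logits of shape (num_heads, batch_size, 2), an ensemble AS2 score can be derived by averaging the heads' positive-class probabilities:

import torch

# Hypothetical aggregation: assumes `output` is a tensor of shape
# (num_heads, batch_size, 2) with per-head binary classification logits.
with torch.no_grad():
    probs = torch.softmax(output, dim=-1)  # per-head class probabilities
    scores = probs[..., 1].mean(dim=0)     # mean positive-class probability per pair
print(scores)  # higher score = the sentence more likely answers the question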
This second CERBERUS model consists of 11 shared encoder body layers and 3 single-layer ranking heads, each learned from one of 3 teacher AS2 models: ALBERT-XXLarge, ELECTRA-Large, and RoBERTa-Large, all fine-tuned first on the ASNQ dataset and then on the WikiQA dataset.
Download and unzip cerberus11-3_albert_electra_roberta_asnq_wikiqa.zip and asnq-electra-base-discriminator, then load the model:
from transformers import AutoTokenizer
from cerberus import CerberusModel

# The tokenizer and the three ranking heads are initialized from the
# ASNQ-fine-tuned ELECTRA-Base discriminator checkpoint downloaded above.
asnq_ckpt_dir_path = './asnq-electra-base-discriminator'
tokenizer = AutoTokenizer.from_pretrained(asnq_ckpt_dir_path)

# One config per ranking head.
head_configs = [
    {'model': {'pretrained_model_name_or_path': asnq_ckpt_dir_path},
     'base_model': 'electra', 'classifier': 'classifier'},
    {'model': {'pretrained_model_name_or_path': asnq_ckpt_dir_path},
     'base_model': 'electra', 'classifier': 'classifier'},
    {'model': {'pretrained_model_name_or_path': asnq_ckpt_dir_path},
     'base_model': 'electra', 'classifier': 'classifier'}
]

# Load the CERBERUS checkpoint (11 shared body layers, 3 ranking heads).
start_ckpt_file_path = './cerberus11-3_albert_electra_roberta_asnq_wikiqa/cerberus_model.pt'
model = CerberusModel(head_configs, 11, start_ckpt_file_path=start_ckpt_file_path)
model.eval()

# Encode a (question, answer sentence) pair for AS2 scoring.
input_dict = tokenizer([('question', 'answer sentence')],
                       return_tensors='pt',
                       max_length=128,
                       truncation=True)
output = model(**input_dict)
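For answer sentence selection, a question is typically scored against several candidate sentences and the candidates are ranked by score. A usage sketch, under the same hypothetical output shape assumed above:

import torch

question = 'who wrote the declaration of independence'
candidates = [
    'Thomas Jefferson was the principal author of the Declaration of Independence.',
    'The Declaration of Independence was adopted in 1776.',
    'Philadelphia is the largest city in Pennsylvania.',
]

# Encode each (question, candidate) pair as one batch.
input_dict = tokenizer([(question, c) for c in candidates],
                       return_tensors='pt',
                       max_length=128,
                       padding=True,
                       truncation=True)
with torch.no_grad():
    output = model(**input_dict)
    # Hypothetical scoring: per-head logits averaged into one score per pair.
    scores = torch.softmax(output, dim=-1)[..., 1].mean(dim=0)

# Rank candidates from most to least likely to answer the question.
for score, candidate in sorted(zip(scores.tolist(), candidates), reverse=True):
    print(f'{score:.4f}  {candidate}')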
See CONTRIBUTING for more information.
This library is licensed under the CC-BY-NC-4.0 License.