Multi-Turn Response Selection in Retrieval-Based Chatbots

Multi-turn response selection in retrieval-based chatbots is the task of selecting the best-matched response from a set of candidates, given the context of a conversation. The task is attracting growing attention in both academia and industry, yet no one has maintained a leaderboard or a collection of popular papers and datasets for it. This repository aims to give readers a quick overview of the benchmark datasets and state-of-the-art studies on the task, as a stepping stone for further research.
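To make the task interface concrete, here is a minimal sketch in Python: a multi-turn context and a list of candidate responses go in, and the highest-scoring candidate comes out. The word-overlap scorer and the names `overlap_score` and `select_response` are illustrative assumptions, not any of the models listed in this repository; a real system would replace the scorer with a learned matching model.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokenization (a deliberately crude stand-in)."""
    return set(text.lower().split())


def overlap_score(context: List[str], response: str) -> float:
    """Hypothetical scorer: Jaccard overlap between context and response words."""
    ctx_words: Set[str] = set().union(*(tokenize(u) for u in context))
    resp_words = tokenize(response)
    if not ctx_words or not resp_words:
        return 0.0
    return len(ctx_words & resp_words) / len(ctx_words | resp_words)


def select_response(context: List[str], candidates: List[str]) -> str:
    """Score every candidate against the whole context and return the best one."""
    scores = [overlap_score(context, c) for c in candidates]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]


context = [
    "my wifi stopped working after the upgrade",
    "which driver are you using",
]
candidates = [
    "i am using the broadcom driver",
    "try the pizza place downtown",
    "thanks",
]
print(select_response(context, candidates))
```

The models on the leaderboard differ mainly in how they compute the score for a (context, candidate) pair; the ranking step stays the same.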

Datasets

Ubuntu Dialogue Corpus V1

Lowe et al. (2015) released the original version. [Download]
Xu et al. (2016) released a version in which numbers, paths and URLs were replaced by placeholders. [Download]
Gu et al. (2019) released a version based on Xu et al. (2016) in which contexts and responses are assigned ids to track them. [Download]

Ubuntu Dialogue Corpus V2

Lowe et al. (2017) released the original version. [Download]
Gu et al. (2019) released a version in which contexts and responses are assigned ids to track them. [Download]

Douban Conversation Corpus

Wu et al. (2017) released the original version. [Download]
Gu et al. (2019) released a version in which contexts and responses are assigned ids to track them. [Download]

E-commerce Corpus

Zhang et al. (2018) released the original version. [Download]
Gu et al. (2019) released a version in which contexts and responses are assigned ids to track them. [Download]

Leaderboard

Ubuntu Dialogue Corpus V1

Model	R_2@1	R_10@1	R_10@2	R_10@5	Paper and Code
BERT-FP (Han et al., 2021)	-	0.911	0.962	0.994	Fine-grained Post-training for Improving Retrieval-based Dialogue Systems. NAACL 2021. [paper] [code]
BERT-SL (Xu et al., 2021)	0.975	0.884	0.946	0.990	Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues. AAAI 2021. [paper]
UMS_BERT+ (Whang et al., 2021)	-	0.875	0.942	0.988	Do Response Selection Models Really Know What’s Next? Utterance Manipulation Strategies for Multi-turn Response Selection. AAAI 2021. [paper] [code]
BERT-SPIDER (Zhang et al., 2021)	-	0.869	0.938	0.987	Structural Pre-training for Dialogue Comprehension. ACL 2021. [paper] [code]
SA-BERT+HCL (Su et al., 2021)	0.977	0.867	0.940	0.992	Dialogue Response Selection with Hierarchical Curriculum Learning. ACL 2021. [paper] [code]
DCM (Li et al., 2020)	-	0.868	0.936	0.987	Deep context modeling for multi-turn response selection in dialogue systems. Information Processing & Management 2020. [paper] [code]
SA-BERT (Gu et al., 2020)	0.965	0.855	0.928	0.983	Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based Chatbots. CIKM 2020. [paper] [code]
BERT-VFT (Whang et al., 2019)	-	0.855	0.928	0.985	An Effective Domain Adaptive Post-Training Method for BERT in Response Selection. INTERSPEECH 2020. [paper] [code]
RoBERTa-BASE-SS-DA (Lu et al., 2020)	0.955	0.826	0.909	0.978	Improving Contextual Language Models for Response Retrieval in Multi-Turn Conversation. SIGIR 2020. [paper] [code]
TADAM (Xu et al., 2020)	-	0.821	0.906	0.978	Topic-Aware Multi-turn Dialogue Modeling. AAAI 2021. [paper] [code]
G-MSN (Lin et al., 2020)	0.958	0.812	0.911	0.987	The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection. EMNLP 2020. [paper]
MSN (Yuan et al., 2019)	-	0.800	0.899	0.978	Multi-hop Selector Network for Multi-turn Response Selection in Retrieval-based Chatbots. EMNLP 2019. [paper] [code]
IOI (Tao et al., 2019)	0.947	0.796	0.894	0.974	One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues. ACL 2019. [paper] [code]
IMN (Gu et al., 2019)	0.946	0.794	0.889	0.974	Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. CIKM 2019. [paper] [code]
U2U-IMN (Gu et al., 2019)	0.945	0.790	0.886	0.973	Utterance-to-Utterance Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. TASLP 2019. [paper] [code]
MRFN (Tao et al., 2019)	0.945	0.786	0.886	0.976	Multi-Representation Fusion Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. WSDM 2019. [paper] [code]
IACMN (Wang et al., 2019)	0.944	0.782	0.886	0.973	Multi-Turn Response Selection in Retrieval-Based Chatbots with Iterated Attentive Convolution Matching Network. CIKM 2019. [paper] [code]
DAM (Zhou et al., 2018)	0.938	0.767	0.874	0.969	Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. ACL 2018. [paper] [code]
DUA (Zhang et al., 2018)	-	0.752	0.868	0.962	Modeling Multi-Turn Conversation with Deep Utterance Aggregation. COLING 2018. [code]
SMN (Wu et al., 2017)	0.926	0.726	0.847	0.961	Sequential Matching Network: A New Architecture for Multi-Turn Response Selection in Retrieval-Based Chatbots. ACL 2017. [paper] [code]
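
The R_n@k columns are commonly read as recall at position k among n candidates: the share of a context's positive responses that the model ranks in its top k, averaged over all test contexts. The sketch below assumes that usual definition; the function names are illustrative, and individual papers may handle tied scores differently.

```python
from typing import List, Sequence, Tuple


def recall_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """R_n@k for one context, where n = len(scores).

    scores: model score for each candidate response.
    labels: 1 for a positive (correct) response, 0 otherwise.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("each context needs at least one positive candidate")
    # Candidate indices sorted by descending score; ties keep their input order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:k])
    return hits / total_positives


def mean_recall_at_k(
    examples: List[Tuple[Sequence[float], Sequence[int]]], k: int
) -> float:
    """Average R_n@k over a list of (scores, labels) test contexts."""
    return sum(recall_at_k(s, l, k) for s, l in examples) / len(examples)


# One context with 10 candidates; the positive one (index 0) is ranked second.
scores = [0.7, 0.9, 0.1, 0.2, 0.3, 0.05, 0.4, 0.15, 0.25, 0.35]
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(recall_at_k(scores, labels, 1))  # 0.0
print(recall_at_k(scores, labels, 2))  # 1.0
```

With a single positive per context, R_n@k reduces to the fraction of contexts whose correct response appears in the top k.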

Ubuntu Dialogue Corpus V2

Model	R_2@1	R_10@1	R_10@2	R_10@5	Paper and Code
Cross-encoder (Humeau et al., 2020)	-	0.865	-	0.991	Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring. ICLR 2020. [paper] [code]
Thread-bi (Jia et al., 2020)	-	0.838	0.924	0.985	Multi-turn Response Selection using Dialogue Dependency Relations. EMNLP 2020. [paper] [code]
SA-BERT (Gu et al., 2020)	0.963	0.830	0.919	0.985	Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based Chatbots. CIKM 2020. [paper] [code]
IMN (Gu et al., 2019)	0.945	0.771	0.886	0.979	Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. CIKM 2019. [paper] [code]
U2U-IMN (Gu et al., 2019)	0.943	0.762	0.877	0.975	Utterance-to-Utterance Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. TASLP 2019. [paper] [code]
HRDE-LTC (Yoon et al., 2018)	0.915	0.652	0.815	0.966	Learning to Rank Question-Answer Pairs using Hierarchical Recurrent Encoder with Latent Topic Clustering. NAACL 2018. [paper] [code]

Douban Conversation Corpus

Model	MAP	MRR	P@1	R_10@1	R_10@2	R_10@5	Paper and Code
BERT-FP (Han et al., 2021)	0.644	0.680	0.512	0.324	0.542	0.870	Fine-grained Post-training for Improving Retrieval-based Dialogue Systems. NAACL 2021. [paper] [code]
SA-BERT+HCL (Su et al., 2021)	0.639	0.681	0.514	0.330	0.531	0.858	Dialogue Response Selection with Hierarchical Curriculum Learning. ACL 2021. [paper] [code]
UMS_BERT+ (Whang et al., 2021)	0.625	0.664	0.499	0.318	0.482	0.858	Do Response Selection Models Really Know What’s Next? Utterance Manipulation Strategies for Multi-turn Response Selection. AAAI 2021. [paper] [code]
SA-BERT (Gu et al., 2020)	0.619	0.659	0.496	0.313	0.481	0.847	Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based Chatbots. CIKM 2020. [paper] [code]
DCM (Li et al., 2020)	0.611	0.649	-	0.294	0.498	0.842	Deep context modeling for multi-turn response selection in dialogue systems. Information Processing & Management 2020. [paper] [code]
BERT-SPIDER (Zhang et al., 2021)	0.609	0.650	0.475	0.296	0.488	0.836	Structural Pre-training for Dialogue Comprehension. ACL 2021. [paper] [code]
RoBERTa-BASE-SS-DA (Lu et al., 2020)	0.602	0.646	0.460	0.280	0.495	0.847	Improving Contextual Language Models for Response Retrieval in Multi-Turn Conversation. SIGIR 2020. [paper] [code]
G-MSN (Lin et al., 2020)	0.599	0.645	0.476	0.308	0.468	0.826	The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection. EMNLP 2020. [paper]
TADAM (Xu et al., 2020)	0.594	0.633	0.453	0.282	0.472	0.828	Topic-Aware Multi-turn Dialogue Modeling. AAAI 2021. [paper] [code]
MSN (Yuan et al., 2019)	0.587	0.632	0.470	0.295	0.452	0.788	Multi-hop Selector Network for Multi-turn Response Selection in Retrieval-based Chatbots. EMNLP 2019. [paper] [code]
IOI (Tao et al., 2019)	0.573	0.621	0.444	0.269	0.451	0.786	One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues. ACL 2019. [paper] [code]
IACMN (Wang et al., 2019)	0.571	0.621	0.448	0.269	0.453	0.783	Multi-Turn Response Selection in Retrieval-Based Chatbots with Iterated Attentive Convolution Matching Network. CIKM 2019. [paper] [code]
MRFN (Tao et al., 2019)	0.571	0.617	0.448	0.276	0.435	0.783	Multi-Representation Fusion Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. WSDM 2019. [paper] [code]
IMN (Gu et al., 2019)	0.570	0.615	0.433	0.262	0.452	0.789	Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. CIKM 2019. [paper] [code]
U2U-IMN (Gu et al., 2019)	0.564	0.611	0.429	0.259	0.430	0.791	Utterance-to-Utterance Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. TASLP 2019. [paper] [code]
DAM (Zhou et al., 2018)	0.550	0.601	0.427	0.254	0.410	0.757	Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. ACL 2018. [paper] [code]
DUA (Zhang et al., 2018)	0.551	0.599	0.421	0.243	0.421	0.780	Modeling Multi-Turn Conversation with Deep Utterance Aggregation. COLING 2018. [code]
SMN (Wu et al., 2017)	0.529	0.569	0.397	0.233	0.396	0.724	Sequential Matching Network: A New Architecture for Multi-Turn Response Selection in Retrieval-Based Chatbots. ACL 2017. [paper] [code]
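
The Douban table adds MAP, MRR and P@1 because a context in that test set can have more than one positive response. The sketch below computes the per-context quantities under the usual information-retrieval definitions, which is an assumption here; the function names are illustrative, and papers may differ in how they treat ties or contexts without any positive candidate.

```python
from typing import List, Sequence


def ranked_labels(scores: Sequence[float], labels: Sequence[int]) -> List[int]:
    """Labels reordered by descending model score (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of precision@rank taken at the rank of each positive candidate."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels(scores, labels), start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(scores: Sequence[float], labels: Sequence[int]) -> float:
    """1 / rank of the highest-ranked positive candidate (0.0 if none)."""
    for rank, label in enumerate(ranked_labels(scores, labels), start=1):
        if label:
            return 1.0 / rank
    return 0.0


def precision_at_1(scores: Sequence[float], labels: Sequence[int]) -> float:
    """1.0 if the top-ranked candidate is positive, else 0.0."""
    return float(ranked_labels(scores, labels)[0])


# Four candidates, positives ranked 2nd and 4th.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [0, 1, 0, 1]
print(average_precision(scores, labels))  # (1/2 + 2/4) / 2 = 0.5
print(reciprocal_rank(scores, labels))    # 0.5
print(precision_at_1(scores, labels))     # 0.0
```

MAP, MRR and P@1 are then the means of these per-context values over the test set.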

E-commerce Corpus

Model	R_10@1	R_10@2	R_10@5	Paper and Code
BERT-FP (Han et al., 2021)	0.870	0.956	0.993	Fine-grained Post-training for Improving Retrieval-based Dialogue Systems. NAACL 2021. [paper] [code]
BERT-SL (Xu et al., 2021)	0.776	0.919	0.991	Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues. AAAI 2021. [paper]
UMS_BERT+ (Whang et al., 2021)	0.762	0.905	0.986	Do Response Selection Models Really Know What’s Next? Utterance Manipulation Strategies for Multi-turn Response Selection. AAAI 2021. [paper] [code]
SA-BERT+HCL (Su et al., 2021)	0.721	0.896	0.993	Dialogue Response Selection with Hierarchical Curriculum Learning. ACL 2021. [paper] [code]
BERT-SPIDER (Zhang et al., 2021)	0.708	0.853	0.986	Structural Pre-training for Dialogue Comprehension. ACL 2021. [paper] [code]
SA-BERT (Gu et al., 2020)	0.704	0.879	0.985	Speaker-Aware BERT for Multi-Turn Response Selection in Retrieval-Based Chatbots. CIKM 2020. [paper] [code]
DCM (Li et al., 2020)	0.685	0.864	0.982	Deep context modeling for multi-turn response selection in dialogue systems. Information Processing & Management 2020. [paper] [code]
TADAM (Xu et al., 2020)	0.660	0.834	0.975	Topic-Aware Multi-turn Dialogue Modeling. AAAI 2021. [paper] [code]
RoBERTa-BASE-SS-DA (Lu et al., 2020)	0.627	0.835	0.980	Improving Contextual Language Models for Response Retrieval in Multi-Turn Conversation. SIGIR 2020. [paper] [code]
IMN (Gu et al., 2019)	0.621	0.797	0.964	Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. CIKM 2019. [paper] [code]
U2U-IMN (Gu et al., 2019)	0.616	0.806	0.966	Utterance-to-Utterance Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. TASLP 2019. [paper] [code]
G-MSN (Lin et al., 2020)	0.613	0.786	0.964	The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection. EMNLP 2020. [paper]
MSN (Yuan et al., 2019)	0.606	0.770	0.937	Multi-hop Selector Network for Multi-turn Response Selection in Retrieval-based Chatbots. EMNLP 2019. [paper] [code]
IOI (Tao et al., 2019)	0.563	0.768	0.950	One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues. ACL 2019. [paper] [code]
DUA (Zhang et al., 2018)	0.501	0.700	0.921	Modeling Multi-Turn Conversation with Deep Utterance Aggregation. COLING 2018. [code]
SMN (Wu et al., 2017)	0.453	0.654	0.886	Sequential Matching Network: A New Architecture for Multi-Turn Response Selection in Retrieval-Based Chatbots. ACL 2017. [paper] [code]

Papers

In addition to the studies mentioned above, there are still many great studies on multi-turn response selection worth reading. We list some of them below.

Distilling Knowledge for Fast Retrieval-based Chat-bots. Amir Vakili Tahami, Kamyar Ghajar, Azadeh Shakery. SIGIR 2020.
Improving Matching Models with Hierarchical Contextualized Representations for Multi-turn Response Selection. Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan. SIGIR 2020.
Conversational Word Embedding for Retrieval-Based Dialog System. Wentao Ma, Yiming Cui, Ting Liu, Dong Wang, Shijin Wang, Guoping Hu. ACL 2020.
IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems. Liu Yang, Minghui Qiu, Chen Qu, Cen Chen, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Haiqing Chen. WWW 2020.
Dually Interactive Matching Network for Personalized Response Selection in Retrieval-Based Chatbots. Jia-Chen Gu, Zhen-Hua Ling, Xiaodan Zhu, Quan Liu. EMNLP 2019.
Sampling Matters! An Empirical Study of Negative Sampling Strategies for Learning of Matching Models in Retrieval-based Dialogue Systems. Jia Li, Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan. EMNLP 2019.
Learning a Matching Model with Co-teaching for Multi-turn Response Selection in Retrieval-based Dialogue Systems. Jiazhan Feng, Chongyang Tao, Wei Wu, Yansong Feng, Dongyan Zhao, Rui Yan. ACL 2019.
Training Neural Response Selection for Task-Oriented Dialogue Systems. Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su. ACL 2019.
DSTC7 Task 1: Noetic End-to-End Response Selection. Chulaka Gunasekara, Jonathan K. Kummerfeld, Lazaros Polymenakos, Walter Lasecki. ACL 2019 Workshop.
A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots. Xueliang Zhao, Chongyang Tao, Wei Wu, Can Xu, Dongyan Zhao, Rui Yan. IJCAI 2019.
Sequential Attention-based Network for Noetic End-to-End Response Selection. Qian Chen, Wen Wang. AAAI 2019 Workshop on DSTC 7.
Building Sequential Inference Models for End-to-End Response Selection. Jia-Chen Gu, Zhen-Hua Ling, Yuping Ruan, Quan Liu. AAAI 2019 Workshop on DSTC 7.
Training Millions of Personalized Dialogue Agents. Pierre-Emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes. EMNLP 2018.
Personalizing Dialogue Agents: I have a dog, do you have pets too?. Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, Jason Weston. ACL 2018.
Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots. Yu Wu, Wei Wu, Zhoujun Li, Ming Zhou. ACL 2018.
Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems. Liu Yang, Minghui Qiu, Chen Qu, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Jun Huang, Haiqing Chen. SIGIR 2018.
Improving Response Selection in Multi-turn Dialogue Systems by Incorporating Domain Knowledge. Debanjan Chaudhuri, Agustinus Kristiadi, Jens Lehmann, Asja Fischer. CoNLL 2018.
Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus. Jianxiong Dong, Jim Huang. ArXiv.
Multi-view Response Selection for Human-Computer Conversation. Xiangyang Zhou, Daxiang Dong, Hua Wu, Shiqi Zhao, Dianhai Yu, Hao Tian, Xuan Liu, Rui Yan. EMNLP 2016.
Learning to Respond with Deep Neural Networks for Retrieval-Based Human-Computer Conversation System. Rui Yan, Yiping Song, Hua Wu. SIGIR 2016.

Update

Although we work hard to list as much work as possible, the studies presented in this repository are by no means complete. We therefore welcome more people to help maintain this project. Please feel free to open issues or pull requests, or to contact us (gujc@mail.ustc.edu.cn).
