Future Data Helps Training: Modeling Future Contexts for Session-based Recommendation (BART4Rec) (WWW2020)
Note that GRec's network architecture is highly similar to BART's, but GRec was published on arXiv earlier than BART.
@inproceedings{yuan2020future,
title={Future Data Helps Training: Modeling Future Contexts for Session-based Recommendation},
author={Yuan, Fajie and He, Xiangnan and Jiang, Haochuan and Guo, Guibing and Xiong, Jian and Xu, Zhezhao and Xiong, Yilin},
booktitle={Proceedings of The Web Conference 2020},
pages={303--313},
year={2020}
}
@article{lewis2019bart,
title={{BART}: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension},
author={Lewis, Mike and Liu, Yinhan and Goyal, Naman and Ghazvininejad, Marjan and Mohamed, Abdelrahman and Levy, Omer and Stoyanov, Ves and Zettlemoyer, Luke},
journal={arXiv preprint arXiv:1910.13461},
year={2019}
}
NextItNet pytorch version: https://github.com/syiswell/NextItNet-Pytorch
GRec pytorch version: https://github.com/hangjunguo/GRec
Session-based recommender systems have attracted much attention recently. To capture the sequential dependencies, existing methods resort either to data augmentation techniques or to left-to-right style autoregressive training. Since these methods aim to model the sequential nature of user behaviors, they ignore the future data of a target interaction when constructing the prediction model for it. However, we argue that the future interactions after a target interaction, which are also available during training, provide a valuable signal on user preference and can be used to enhance the recommendation quality.
Properly integrating future data into model training, however, is non-trivial to achieve, since it disobeys machine learning principles and can easily cause data leakage. To this end, we propose a new encoder-decoder framework named Gap-filling based Recommender (GRec), which trains the encoder and decoder by a gap-filling mechanism. Specifically, the encoder takes a partially complete session sequence (where some items are masked on purpose) as input, and the decoder predicts these masked items conditioned on the encoded representation. We instantiate the general GRec framework using a convolutional neural network with sparse kernels, giving consideration to both accuracy and efficiency. We conduct experiments on two real-world datasets covering short-, medium-, and long-range user sessions, showing that GRec significantly outperforms the state-of-the-art sequential recommendation methods. More empirical studies verify the high utility of modeling future contexts under our GRec framework.
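The gap-filling idea described above can be illustrated with a small data-preparation sketch. It is not taken from this repository: the function name `mask_session`, the reserved `MASK_ID`, and the `mask_ratio` parameter are assumptions made for illustration, and the sketch covers only how a masked encoder input and its prediction targets might be built, not the encoder or decoder networks.

```python
import random

MASK_ID = 0  # hypothetical id reserved for the mask token (assumption, not from the repo)


def mask_session(session, mask_ratio=0.3, rng=None):
    """Build a gap-filling training example from one session.

    Returns (encoder_input, targets):
      encoder_input: copy of the session with some positions replaced by MASK_ID
      targets: dict mapping each masked position to the original item id
    """
    if rng is None:
        rng = random.Random()
    if not session:
        return [], {}

    # Mask at least one item so every session yields a prediction target.
    n_mask = max(1, int(len(session) * mask_ratio))
    positions = sorted(rng.sample(range(len(session)), n_mask))

    encoder_input = list(session)
    targets = {}
    for pos in positions:
        targets[pos] = session[pos]    # what the decoder should predict
        encoder_input[pos] = MASK_ID   # the gap the encoder sees

    return encoder_input, targets


if __name__ == "__main__":
    session = [12, 7, 33, 5, 18, 42]
    encoder_input, targets = mask_session(session, rng=random.Random(0))
    print("encoder input:", encoder_input)
    print("targets:", targets)
```

Because items both before and after each gap remain visible in the encoder input, the masked positions are the only ones whose labels are hidden, which is how this kind of setup can use future context without exposing the target itself.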
or python GRec_NCE.py (NCE sampling. Be careful with the number of negative examples.)
You can also replace our sampling method and CE loss with other well-known negative samplers or loss functions, such as LambdaFM.
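As a rough illustration of where the number of negative examples enters, here is a minimal sketch of a uniform negative sampler paired with a cross-entropy loss computed over the positive item and the sampled negatives only. This is a sampled-softmax style loss, not NCE itself and not the code in GRec_NCE.py; the names `sample_negatives` and `sampled_softmax_loss` are hypothetical.

```python
import math
import random


def sample_negatives(positive, num_items, num_neg, rng):
    """Draw num_neg distinct item ids uniformly from [0, num_items), excluding the positive."""
    if num_neg > num_items - 1:
        raise ValueError("num_neg exceeds the number of available negative items")
    negatives = set()
    while len(negatives) < num_neg:
        candidate = rng.randrange(num_items)
        if candidate != positive:
            negatives.add(candidate)
    return sorted(negatives)


def sampled_softmax_loss(scores, positive, negatives):
    """Cross-entropy restricted to the positive item and the sampled negatives."""
    logits = [scores[positive]] + [scores[j] for j in negatives]
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[0]


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [rng.uniform(-1.0, 1.0) for _ in range(100)]  # toy item scores
    negatives = sample_negatives(positive=3, num_items=100, num_neg=10, rng=rng)
    print("loss:", sampled_softmax_loss(scores, 3, negatives))
```

With all scores equal, this loss is log(1 + k) for k negatives, so its scale depends directly on the number of negatives drawn.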