Xiaopengli1/Scenario-Wise-Rec
1. Introduction

Scenario-Wise Rec is an open-source benchmark for multi-scenario/multi-domain recommendation.


Dataset introduction

| Dataset | Domain | # Interaction | # User | # Item |
|---|---|---|---|---|
| MovieLens | Domain 0 | 210,747 | 1,325 | 3,429 |
| | Domain 1 | 395,556 | 2,096 | 3,508 |
| | Domain 2 | 393,906 | 2,619 | 3,595 |
| KuaiRand | Domain 0 | 2,407,352 | 961 | 1,596,491 |
| | Domain 1 | 7,760,237 | 991 | 2,741,383 |
| | Domain 2 | 895,385 | 171 | 332,210 |
| | Domain 3 | 402,366 | 832 | 547,908 |
| | Domain 4 | 183,403 | 832 | 43,106 |
| Ali-CCP | Domain 0 | 32,236,951 | 89,283 | 465,870 |
| | Domain 1 | 639,897 | 2,561 | 188,610 |
| | Domain 2 | 52,439,671 | 150,471 | 467,122 |
| Amazon | Domain 0 | 198,502 | 22,363 | 12,101 |
| | Domain 1 | 278,677 | 39,387 | 23,033 |
| | Domain 2 | 346,355 | 38,609 | 18,534 |
| Douban | Domain 0 | 227,251 | 2,212 | 95,872 |
| | Domain 1 | 179,847 | 1,820 | 79,878 |
| | Domain 2 | 1,278,401 | 2,712 | 34,893 |
| Mind | Domain 0 | 26,057,579 | 737,687 | 8,086 |
| | Domain 1 | 11,206,494 | 678,268 | 1,797 |
| | Domain 2 | 10,237,589 | 696,918 | 8,284 |
| | Domain 3 | 9,226,382 | 656,970 | 1,804 |
Model introduction

| Model | model_name | Link |
|---|---|---|
| Shared Bottom | sharedbottom | Link |
| MMOE | mmoe | Link |
| PLE | ple | Link |
| SAR-Net | sarnet | Link |
| STAR | star | Link |
| M2M | m2m | Link |
| AdaSparse | adasparse | Link |
| AdaptDHM | adaptdhm | Link |
| EPNet | epnet | Link |
| PPNet | ppnet | Link |
| HAMUR | hamur | Link |
| M3oE | m3oe | Link |

2. Installation

WARNING: Our package is still under development; feel free to post issues if you encounter any usage problems.

Install via GitHub (Recommended)

First, clone the repo and enter the directory:

```shell
git clone https://github.com/Xiaopengli1/Scenario-Wise-Rec.git
cd Scenario-Wise-Rec
```

Then use pip to install our package:

```shell
pip install .
```

3. Usage

We provide running scripts for users in /scripts; dataset samples are provided in /scripts/data. You can test a model directly by running, for example (for Ali-CCP):

```shell
python run_ali_ccp_ctr_ranking_multi_domain.py --model [model_name]
```

For full-dataset download, refer to the following steps.

Step 1: Full Datasets Download

Six multi-scenario/multi-domain datasets are provided. See the following table.

| Dataset | Domain Number | Users | Items | Interaction | Download |
|---|---|---|---|---|---|
| Movie-Lens | 3 | 6k | 4k | 1M | ML_Download |
| KuaiRand | 5 | 1k | 4M | 11M | KR_Download |
| Ali-CCP | 3 | 238k | 467k | 85M | AC_Download |
| Amazon | 3 | 85k | 54k | 823k | AZ_Download |
| Douban | 3 | 2k | 210k | 1.7M | DB_Download |
| Mind | 4 | 748k | 20k | 56M | MD_Download |

After downloading, replace the sampled dataset with the full dataset.

Step 2: Run the Code

```shell
python run_movielens_rank_multi_domain.py --dataset_path [path] --model_name [model_name] --device ["cpu"/"cuda:0"] --epoch [maximum epoch] --learning_rate [1e-3/1e-5] --batch_size [2048/4096] --seed [random seed]
```
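The run scripts accept their options on the command line. As a rough sketch of how such a script might parse these flags (the actual scripts in /scripts may use different defaults, and `build_parser` is a hypothetical helper, not part of the package):

```python
import argparse

def build_parser():
    # Hypothetical sketch mirroring the flags shown above; defaults are
    # illustrative assumptions, not the package's actual values.
    p = argparse.ArgumentParser(description="Multi-domain CTR ranking")
    p.add_argument("--dataset_path", default="./data/")
    p.add_argument("--model_name", default="mmoe")
    p.add_argument("--device", default="cpu")          # "cpu" or "cuda:0"
    p.add_argument("--epoch", type=int, default=50)    # maximum epoch
    p.add_argument("--learning_rate", type=float, default=1e-3)
    p.add_argument("--batch_size", type=int, default=2048)
    p.add_argument("--seed", type=int, default=2022)   # random seed
    return p

# Example: override only the model and device, keep the other defaults.
args = build_parser().parse_args(["--model_name", "star", "--device", "cuda:0"])
print(args.model_name, args.epoch)
```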

4. Tutorial

To facilitate a seamless experience, we have developed a comprehensive Colab tutorial that guides you through every essential step required to utilize this benchmark effectively. This tutorial is designed with user-friendliness in mind and covers the following key aspects:

  1. Package Installation
  2. Data Download
  3. Model/Data Loading
  4. Model Training
  5. Result Evaluation

Each section of the tutorial is designed to be self-contained and easy to follow, making it a valuable resource whether you are a beginner or an experienced user.

5. Build Your Own Multi-scenario Dataset/Model

We offer two template files, run_example.py and base_example.py, which provide a pipeline to help you process your own multi-scenario datasets and build your own multi-scenario models.

Instructions on Processing Your Dataset

See run_example.py. The function get_example_dataset(input_path) is an example of how to process your dataset. Note that the feature "domain_indicator" is the feature that indicates domains. For other implementation details, refer to the raw file.
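The essential requirement is that the processed data carries a "domain_indicator" feature marking which scenario each interaction belongs to. A minimal standalone sketch of such a processing function (the column names and in-memory CSV here are illustrative assumptions, not the repo's actual schema):

```python
import csv
import io

# Illustrative raw interactions: a "domain" column tells us the scenario.
RAW = """user_id,item_id,domain,click
1,10,0,1
1,11,2,0
2,10,1,1
"""

def get_example_dataset(raw_text):
    """Hypothetical analogue of get_example_dataset(input_path): turn raw
    interactions into feature dict + labels, exposing "domain_indicator"."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    features = {
        "user_id": [int(r["user_id"]) for r in rows],
        "item_id": [int(r["item_id"]) for r in rows],
        # "domain_indicator" is the special feature the framework uses to
        # route each sample to its scenario (0-based domain ids).
        "domain_indicator": [int(r["domain"]) for r in rows],
    }
    labels = [int(r["click"]) for r in rows]
    return features, labels

x, y = get_example_dataset(RAW)
print(x["domain_indicator"], y)
```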

Instructions on Building Your Model

See base_example.py, where you can build your own multi-scenario model. We leave two places for users to implement scenario-shared and scenario-specific modules, and we also leave comments on how to process the final output. Please refer to the raw file for more details.
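Conceptually, a multi-scenario model combines a scenario-shared component with scenario-specific components selected by "domain_indicator". The following NumPy sketch illustrates that layout only; it is not the repo's base_example.py (which is built on PyTorch), and all names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

class ExampleMultiScenarioModel:
    """Toy sketch: one scenario-shared bottom plus one scenario-specific
    head per domain, routed by the domain_indicator feature."""

    def __init__(self, in_dim, hidden, n_domains):
        self.shared_w = rng.normal(size=(in_dim, hidden))   # scenario-shared
        self.domain_w = [rng.normal(size=(hidden, 1))       # scenario-specific
                         for _ in range(n_domains)]

    def forward(self, x, domain_indicator):
        h = np.maximum(x @ self.shared_w, 0.0)              # shared bottom (ReLU)
        logits = np.empty(len(x))
        for d, w in enumerate(self.domain_w):
            mask = domain_indicator == d                    # route samples by domain
            if mask.any():
                logits[mask] = (h[mask] @ w).ravel()
        return 1.0 / (1.0 + np.exp(-logits))                # sigmoid -> CTR output

model = ExampleMultiScenarioModel(in_dim=4, hidden=8, n_domains=3)
x = rng.normal(size=(5, 4))
dom = np.array([0, 1, 2, 0, 1])
probs = model.forward(x, dom)
print(probs.shape)
```

Each sample passes through the shared bottom, then only through the head matching its own domain; the real framework follows the same routing idea with learnable PyTorch modules.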

6. Contributing

We welcome any contributions that help improve the benchmark, and don't forget to star 🌟 our project!

7. Credits

The framework is adapted from Torch-RecHub. Thanks for their contribution.