diff --git a/data/README.md b/data/README.md new file mode 100644 index 0000000..3089342 --- /dev/null +++ b/data/README.md @@ -0,0 +1,15 @@ +## Data + +### MovieLens + +We use the famous [MovieLens 100K dataset](https://grouplens.org/datasets/movielens/100k/) as a concrete application. + +The training and testing data are stored in `data_train.csv` and `data_test.csv`, respectively. In the train/test data, the response is considered 1 if the user rated an item (movie) as 5 on a 1-5 point scale and 0 otherwise. + +Furthermore, we also have features for both users and items in `features_user.csv` and `features_item.csv` respectively. + +In the **extended** folder we include additional datasets relevant to more advanced usage. For more details on the datasets above, including advanced usage, see the [Data Overview](https://github.com/fidelity/mab2rec/blob/main/notebooks/1_data_overview.ipynb) notebook + +### Article Recommendation + +If you are interested to explore a larger dataset based on article recommendation as used in [[KDD 2023] Verma, Ghanshyam, et al. "Empowering recommender systems using automatically generated Knowledge Graphs and Reinforcement Learning."](https://arxiv.org/abs/2307.04996), you can [download](https://github.com/fidelity/mab2rec/releases/download/1.2.1/data.zip) it and try out different Mab2Rec algorithms.