A collaborative filtering recommender system that delivers customized recommendations for online-shopping goods. It learns from customers' recent events (browsing, searching, rating) and updates recommendations in real time.
- Download product metadata to springbackend/src/main/resources/data (see data format in the next section). Load the data into Elasticsearch using ProductLoader.
- Download user rating data to recommender/src/main/resources/data (see data format in the next section). Configure the MongoDB connection in application_properties. Load the data into MongoDB using DataLoader.
- Run ALSRecommender, ItemCFRecommender, TrendingRecommmender to train offline models.
- Install all service modules in Frontend and Backend sections.
- Build packages for Spring backend and online recommender.
    cd springbackend && mvn clean install
    cd recommender && mvn clean install
- Execute the script to start services (FE, BE, online recommender)
    sh run_local.sh
In this project, the model is trained on a subset of the Amazon Review Data dataset by Stanford.
- User: 126k
- Product: 45k
- 5 categories: Automotive, Beauty, Office_Products, Software, Video_Games
{"asin":"B0000223J1","categories":[["Automotive","Tools & Equipment","Body Repair Tools","Buffing & Polishing Pads"]],"description":"This hook and loop polishing bonnet is an accessory for Makita polisher model 9227C.","imUrl":"http://ecx.images-amazon.com/images/I/81E3GK0PEKL.SX300.gif","price":14.4,"title":"Makita 743403-A Polishing Bonnet","ratingAvg":3.8333333333,"ratingCount":6}
- Rating: 742k
{"userId":"A108J5O7DG2WIM","asin":"B002YEY7EU","rating":5,"timestamp":1382054400}
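As a rough illustration of how records in the rating format might be read, the sketch below parses JSON records into a small data class. The `Rating` class and `parse_ratings` function are hypothetical names introduced here, not part of the project, and the one-record-per-line layout and the interpretation of `timestamp` as Unix seconds are assumptions.

```python
import json
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Rating:
    user_id: str
    asin: str
    rating: float
    timestamp: int  # assumed to be Unix time in seconds


def parse_ratings(lines: Iterable[str]) -> List[Rating]:
    """Parse rating records (assumed one JSON object per line); skip blank lines."""
    ratings = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        ratings.append(
            Rating(
                user_id=record["userId"],
                asin=record["asin"],
                rating=float(record["rating"]),
                timestamp=int(record["timestamp"]),
            )
        )
    return ratings


# A single record in the rating format.
sample_lines = [
    '{"userId":"A108J5O7DG2WIM","asin":"B002YEY7EU","rating":5,"timestamp":1382054400}'
]
parsed = parse_ratings(sample_lines)
```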
- React
- Material-UI
- Spring Boot 2.3
- MongoDB 3.6
- Spark 2.2.0
- Kafka 2.4.1
- Redis 6.0.6
- Elasticsearch 7.8.1
Matrix factorization: factorize the user-item rating matrix into the product of two lower-dimensional matrices.
R = U * I, R: u×i, U: u×k, I: k×i (k: latent dimension)
Because the model needs a large dataset to train, it is trained offline daily with Spark ML. New users who do not appear in the dataset are recommended the top trending products from the Trending Recommender.
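To make the factorization R = U * I concrete, here is a minimal pure-Python sketch that learns the two factor matrices from a handful of ratings. It is not the project's training code: the project trains offline with Spark ML, whereas this toy version uses plain stochastic gradient descent, and the function names, hyperparameters, and sample ratings are all assumptions chosen for illustration. Note that the item factors are stored one row per item (i×k), which is the transpose of the k×i matrix I above.

```python
import random
from math import sqrt


def predict(U, I, u, i):
    """Predicted rating: dot product of user u's and item i's latent vectors."""
    return sum(a * b for a, b in zip(U[u], I[i]))


def factorize(ratings, n_users, n_items, k=2, steps=500, lr=0.01, reg=0.02, seed=0):
    """Learn U (n_users x k) and I (n_items x k) from (user, item, rating) triples.

    Uses SGD with L2 regularization; all defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    I = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - predict(U, I, u, i)
            for f in range(k):
                uf, vf = U[u][f], I[i][f]
                U[u][f] += lr * (err * vf - reg * uf)
                I[i][f] += lr * (err * uf - reg * vf)
    return U, I


def rmse(ratings, U, I):
    """Root mean squared error over the observed ratings."""
    return sqrt(sum((r - predict(U, I, u, i)) ** 2 for u, i, r in ratings) / len(ratings))


# Toy data: (user index, item index, rating). Unobserved pairs are simply absent.
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
U, I = factorize(toy_ratings, n_users=3, n_items=3)
```

After training, `predict(U, I, u, i)` can be evaluated for any user-item pair, including pairs with no observed rating, which is what makes the factorization usable for recommendation.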
This part of the recommendation is updated in response to real-time customer events (e.g. browsing, rating, searching) on the website. Recommendation candidates are chosen from products similar to the product involved in the event.
New users who do not appear in the dataset are recommended the top-rated products from the Trending Recommender.
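The event-driven flow can be sketched roughly as follows. This is only an illustration under several assumptions: the function name, the in-memory dictionaries standing in for stored results, the pre-sorted similarity lists, and the filtering of products the user has already rated are all hypothetical and not taken from the project.

```python
from typing import Dict, List, Sequence, Set


def recommend_on_event(
    user_id: str,
    event_asin: str,
    similar_products: Dict[str, List[str]],  # asin -> similar asins, assumed sorted by similarity
    rated_by_user: Dict[str, Set[str]],      # user id -> asins the user has already rated
    top_rated: Sequence[str],                # precomputed top-rated list
    n: int = 10,
) -> List[str]:
    """Return up to n recommendations triggered by one customer event."""
    seen = rated_by_user.get(user_id)
    if seen is None:
        # New user: fall back to the top-rated products.
        return list(top_rated[:n])
    # Known user: candidates are products similar to the event's product,
    # minus the product itself and anything already rated (an assumption).
    candidates = [
        asin
        for asin in similar_products.get(event_asin, [])
        if asin != event_asin and asin not in seen
    ]
    return candidates[:n]


# Hypothetical sample data.
similar = {"P1": ["P2", "P3", "P4"]}
rated = {"u1": {"P3"}}
top = ["P9", "P8"]
```

With this data, an event on `P1` by the known user `u1` yields candidates drawn from `P1`'s similar products, while the same event from an unknown user returns the top-rated list.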
Recommendations are based on item similarity computed from previous user behavior (item A and item B are similar because they are liked by the same group of users).
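One common way to turn "liked by the same group of users" into a number is a co-occurrence similarity: the count of users who liked both items, normalized by the geometric mean of each item's user count. The sketch below assumes that formula; the source does not state which similarity measure ItemCFRecommender uses, and the function names and sample data are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, Iterable, List, Set, Tuple


def build_item_users(likes: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each item to the set of users who liked it, from (user, item) pairs."""
    item_users: Dict[str, Set[str]] = defaultdict(set)
    for user, item in likes:
        item_users[item].add(user)
    return item_users


def similarity(item_users: Dict[str, Set[str]], a: str, b: str) -> float:
    """Co-occurrence similarity: |users(a) & users(b)| / sqrt(|users(a)| * |users(b)|)."""
    users_a = item_users.get(a, set())
    users_b = item_users.get(b, set())
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))


def most_similar(item_users: Dict[str, Set[str]], item: str, n: int = 5) -> List[Tuple[str, float]]:
    """Return up to n other items with non-zero similarity, most similar first."""
    scores = [(other, similarity(item_users, item, other)) for other in item_users if other != item]
    scores = [(other, s) for other, s in scores if s > 0]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:n]


# Hypothetical data: items A and B share two users, item C shares none with A.
likes = [("u1", "A"), ("u2", "A"), ("u3", "A"), ("u1", "B"), ("u2", "B"), ("u4", "C")]
item_users = build_item_users(likes)
```

Here `most_similar(item_users, "A")` ranks B as similar to A because two of A's three users also liked B, while C is excluded because it shares no users with A.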
- Trending products: the top 20 products with the most ratings in the past month.
- Top-rating products: the top 20 products with the highest average ratings of all time.
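The two lists above can be computed from raw rating records along these lines. This is a sketch, not the project's Spark job: the function names and tuple layout are hypothetical, "past month" is assumed to mean the last 30 days, timestamps are assumed to be Unix seconds, and ties are broken by product ID purely to keep the output deterministic.

```python
from collections import Counter, defaultdict
from typing import Iterable, List, Tuple

# Assumption: "past month" is interpreted as the last 30 days.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

# Each record: (user_id, asin, rating, timestamp in Unix seconds).
RatingRecord = Tuple[str, str, float, int]


def trending_products(ratings: Iterable[RatingRecord], now: int, n: int = 20) -> List[str]:
    """Top n products by number of ratings received in the past month."""
    cutoff = now - SECONDS_PER_MONTH
    counts = Counter(asin for _, asin, _, ts in ratings if ts >= cutoff)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [asin for asin, _ in ranked[:n]]


def top_rated_products(ratings: Iterable[RatingRecord], n: int = 20) -> List[str]:
    """Top n products by average rating over all time."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, asin, rating, _ in ratings:
        totals[asin] += rating
        counts[asin] += 1
    averages = {asin: totals[asin] / counts[asin] for asin in totals}
    ranked = sorted(averages.items(), key=lambda kv: (-kv[1], kv[0]))
    return [asin for asin, _ in ranked[:n]]


# Hypothetical sample: P3 has the most ratings overall, but all are older than 30 days.
DAY = 24 * 60 * 60
now = 1_700_000_000
sample_ratings = [
    ("u1", "P1", 5, now - 10),
    ("u2", "P1", 4, now - 20),
    ("u3", "P2", 5, now - 30),
    ("u1", "P3", 2, now - 40 * DAY),
    ("u2", "P3", 3, now - 50 * DAY),
    ("u3", "P3", 1, now - 60 * DAY),
]
```

In this sample, P3 does not appear in the trending list because none of its ratings fall within the 30-day window, but it still appears (last) in the all-time top-rated list.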