E-commerce recommender based on user events
- Start with eda-uxml.ipynb for exploratory data analysis
- Prepare the data for machine learning with data-preparation-uxml.ipynb (optionally test the resulting .npz with sparse-matrix-tester.ipynb)
- Run one of the training algorithms, such as train-uxml-basic-matrix-factorization.ipynb (more to come in the future)
- Test the performance of the training algorithm against a test set (different from the train and validation data) with test-uxml.ipynb or quick-test-uxml.ipynb
- Put the results into practice. Two use case examples are provided in use-case-examples.ipynb (an item recommender for users and a minimalistic stock need prediction to help with e-commerce logistics)
- The Optuna-based implementation of HPO (hyper-parameter optimization): train-uxml-adam-optuna.ipynb
- Applying the hyper-parameters found by Optuna without the need for the Optuna framework: train-uxml-adam-from-best-trial.ipynb
- The eda-fe-5months.ipynb notebook scales the original dataset up to 5 months of data (instead of the original 2)
- data-cleaning-rees46-purchase.ipynb introduces a new dataset with 123 event types (purchases from 123 different categories) collected between 2020-01-05 and 2020-11-21 by the REES46 Marketing Platform.
- eda-fe-rees46-purchase.ipynb represents the generalization effort: it handles an arbitrary number of event types from the cleaned-up dataset
- The data-preparation-test.ipynb can be used to test the efficiency of data preparation by comparing the prepared data to the ground truth. This is not needed for the user-behaviour prediction process.
- The quick-test-uxml.ipynb relies on sparse matrix operations and computes only MSE, RMSE, MAE, R-squared, and explained variance. For this reason it runs in 1.1s, compared to 175.89s for the full test, which uses multiple approaches and many more metrics.
- Machine Learning powered by PyTorchLightning [https://github.com/PyTorchLightning/pytorch-lightning]
- Exploratory Data Analysis powered by Pandas [https://github.com/pandas-dev/pandas]
- Data preparation powered by SciPy [https://github.com/scipy/scipy]
- Testing by Microsoft Best Practices on Recommendation Systems [https://github.com/microsoft/recommenders]
- Source of the data [https://www.kaggle.com/mkechinov/ecommerce-events-history-in-cosmetics-shop] and [https://www.kaggle.com/mkechinov/ecommerce-purchase-history-from-electronics-store] (Thanks to REES46 Marketing Platform for this dataset.)
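
The data preparation and quick-test steps listed above can be illustrated with a minimal sketch. It is not the code from the notebooks: the toy event log, the file name `interactions.npz`, and the helper `sparse_error_metrics` are hypothetical, and it assumes that errors are measured only over the observed (non-zero) entries of the user-item matrix. It only shows the general idea of storing interactions as a SciPy sparse matrix in a .npz file and computing a few error metrics from it.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Hypothetical toy event log: (user index, item index, event weight).
users = np.array([0, 0, 1, 2, 2, 2])
items = np.array([0, 2, 1, 0, 1, 3])
weights = np.array([1.0, 3.0, 2.0, 1.0, 4.0, 2.0])

# Build a 3 users x 4 items sparse matrix and convert it to CSR.
interactions = sparse.coo_matrix(
    (weights, (users, items)), shape=(3, 4)
).tocsr()

# Round-trip through a .npz file (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "interactions.npz")
    sparse.save_npz(path, interactions)
    loaded = sparse.load_npz(path)


def sparse_error_metrics(truth, predicted):
    """Return MSE, RMSE and MAE over the non-zero entries of `truth`.

    `truth` is a SciPy sparse matrix; `predicted` is a dense NumPy array
    of the same shape. Restricting to observed entries is an assumption
    made for this sketch.
    """
    rows, cols = truth.nonzero()
    y_true = np.asarray(truth[rows, cols]).ravel()
    y_pred = np.asarray(predicted[rows, cols]).ravel()
    diff = y_true - y_pred
    mse = float(np.mean(diff ** 2))
    return {
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "mae": float(np.mean(np.abs(diff))),
    }


# Stand-in for model output: the true values shifted by a constant 0.5.
predicted = loaded.toarray() + 0.5
metrics = sparse_error_metrics(loaded, predicted)
print(metrics)
```

Because only the stored entries are touched, the cost of the metric computation grows with the number of recorded interactions rather than with the full users x items grid, which is one plausible reason a sparse quick test can be much faster than a full evaluation.
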
© Copyright 2020 Peter Szabo