CS5664_TextPreProcessing_MetaData.ipynb contains:
- Parsing the metadata and producing the essential pandas DataFrames.
- Creating product-copurchase edge lists.
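As a rough illustration of the two steps above, the sketch below parses records and builds an undirected co-purchase edge list using only the standard library. It assumes the record layout commonly seen in SNAP's amazon-meta.txt (`Id:`, `ASIN:`, `title:`, `group:`, and a `similar:` line whose first token is a count). The function names `parse_products` and `copurchase_edges` are hypothetical and are not taken from the notebook, which may do this differently.

```python
# Tiny sample in the layout assumed for amazon-meta.txt.
SAMPLE = """\
Id:   1
ASIN: 0827229534
  title: Patterns of Preaching: A Sermon Sampler
  group: Book
  similar: 2  0804215715  156101074X

Id:   2
ASIN: 0738700797
  title: Candlemas: Feast of Flames
  group: Book
  similar: 1  0827229534
"""


def parse_products(text):
    """Return one dict per product record found in `text`."""
    products = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Id:"):
            # A new record starts at every "Id:" line.
            current = {"id": int(line.split(":", 1)[1]), "similar": []}
            products.append(current)
        elif current is None or not line:
            continue
        elif line.startswith("ASIN:"):
            current["asin"] = line.split(":", 1)[1].strip()
        elif line.startswith("title:"):
            # Split only once so colons inside the title survive.
            current["title"] = line.split(":", 1)[1].strip()
        elif line.startswith("group:"):
            current["group"] = line.split(":", 1)[1].strip()
        elif line.startswith("similar:"):
            # Tokens: "similar:", count, then the co-purchased ASINs.
            current["similar"] = line.split()[2:]
    return products


def copurchase_edges(products):
    """Return a sorted, de-duplicated list of undirected (asin, asin) edges."""
    edges = set()
    for product in products:
        for other in product["similar"]:
            edges.add(tuple(sorted((product["asin"], other))))
    return sorted(edges)


products = parse_products(SAMPLE)
for u, v in copurchase_edges(products):
    print(u, v)  # one "source target" pair per line, as in an edge list file
```

The list of dicts can be passed directly to `pandas.DataFrame(products)` if a DataFrame is wanted.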
CS5664_NetworkAnalysis.ipynb contains:
- Network information such as degree centrality analysis.
- Node degree distribution.
- Power-law and heavy-tailed distributions.
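To make the degree-based measures above concrete, here is a minimal standard-library sketch that computes node degrees, degree centrality (degree divided by n - 1, the usual normalization), and the degree distribution from an edge list. The toy edges and the function names are illustrative assumptions, not the notebook's code.

```python
from collections import Counter, defaultdict


def degrees(edges):
    """Map each node to its degree in an undirected simple graph."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def degree_centrality(deg):
    """Normalize each degree by the maximum possible degree, n - 1."""
    n = len(deg)
    if n < 2:
        return {node: 0.0 for node in deg}
    return {node: d / (n - 1) for node, d in deg.items()}


def degree_distribution(deg):
    """Map each degree value k to the number of nodes with degree k."""
    return dict(sorted(Counter(deg.values()).items()))


# Hypothetical toy edge list.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
deg = degrees(edges)
centrality = degree_centrality(deg)
distribution = degree_distribution(deg)
print(deg, centrality, distribution)
```

Plotting the distribution on log-log axes is a common first check for a heavy tail: an approximately straight line is consistent with, but does not prove, a power law.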
CS5664_MetaData_and_ProductReviews_Analysis_EGO_Graph_Recommendations.ipynb contains:
- Text preprocessing.
- Metadata and product review processing.
- Sentiment Analysis.
- Topic Modeling.
- Ego-graph-based product recommendations using product titles.
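One way the ego-graph recommendation idea could look is sketched below: collect the nodes within a given number of hops of a product (its ego graph), then return the titles of the other nodes in that neighborhood. Ranking candidates by their degree inside the ego graph is an assumption made for this example, as are the toy data and the function names; the notebook's actual scoring may differ.

```python
from collections import deque


def ego_nodes(adjacency, center, radius=1):
    """Return the nodes within `radius` hops of `center`, center included."""
    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue  # do not expand beyond the requested radius
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


def recommend(adjacency, titles, center, radius=1, top_n=3):
    """Return up to `top_n` titles from the ego graph around `center`."""
    ego = ego_nodes(adjacency, center, radius)

    def ego_degree(node):
        # Count only links that stay inside the ego graph.
        return sum(1 for nb in adjacency.get(node, ()) if nb in ego)

    # Highest ego-graph degree first; ties broken alphabetically by title.
    candidates = sorted(ego - {center},
                        key=lambda n: (-ego_degree(n), titles[n]))
    return [titles[n] for n in candidates[:top_n]]


# Hypothetical co-purchase graph and titles.
adjacency = {
    "p1": {"p2", "p3"},
    "p2": {"p1", "p3"},
    "p3": {"p1", "p2", "p4"},
    "p4": {"p3"},
}
titles = {"p1": "Book One", "p2": "Book Two",
          "p3": "Book Three", "p4": "Book Four"}

print(recommend(adjacency, titles, "p1"))
```

Raising `radius` widens the neighborhood, so products two or more hops away also become candidates.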
CS5664_Product_Recommendations_MachineLearning.ipynb contains:
- Review rating analysis.
- KNN-based SVD experimentation.
- Hyperparameter tuning and training SVD with the best parameters.
- Product recommendations based on reviews.
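For readers unfamiliar with SVD-style recommenders, the following is a minimal standard-library sketch of matrix factorization trained by stochastic gradient descent on (user, item, rating) triples. It is a stand-in for the general technique, not the notebook's implementation; the ratings, the hyperparameter values, and the names `train_mf`, `predict`, and `rmse` are all hypothetical.

```python
import math
import random


def train_mf(ratings, n_factors=2, n_epochs=300, lr=0.02, reg=0.02, seed=0):
    """Fit user and item factor vectors to (user, item, rating) triples."""
    rng = random.Random(seed)
    mean = sum(r for _, _, r in ratings) / len(ratings)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random initial factors.
    P = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for u, i, r in ratings:
            pu, qi = P[u], Q[i]
            err = r - (mean + sum(a * b for a, b in zip(pu, qi)))
            for f in range(n_factors):
                puf, qif = pu[f], qi[f]
                # Gradient step with L2 regularization on both factors.
                pu[f] += lr * (err * qif - reg * puf)
                qi[f] += lr * (err * puf - reg * qif)
    return mean, P, Q


def predict(model, user, item):
    """Predict a rating; fall back to the global mean for unseen ids."""
    mean, P, Q = model
    if user not in P or item not in Q:
        return mean
    return mean + sum(a * b for a, b in zip(P[user], Q[item]))


def rmse(model, ratings):
    """Root-mean-square error of the model on the given triples."""
    squared = [(r - predict(model, u, i)) ** 2 for u, i, r in ratings]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical review ratings: (user, product, stars).
ratings = [
    ("u1", "a", 5), ("u1", "b", 4), ("u2", "a", 5), ("u2", "c", 1),
    ("u3", "b", 4), ("u3", "c", 2), ("u4", "a", 4), ("u4", "c", 1),
]

untrained = train_mf(ratings, n_epochs=0)
trained = train_mf(ratings)
print(rmse(untrained, ratings), rmse(trained, ratings))
```

Hyperparameter tuning then amounts to repeating the training for several values of `n_factors`, `lr`, and `reg` and keeping the combination with the lowest error on held-out ratings.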
Datasets:
Appliances.json
- Review data for electronic appliances.
Magazine_Subscriptions.json
- Review data for magazines.
amazon-books.csv
- Information on products (books) generated from amazon-meta.txt.
amazon-books-copurchase.edgelist
- Purchase/co-purchase (similar product) edge list generated from amazon-meta.txt.
amazon-meta.txt
- Data downloaded from SNAP.
products_copurchases_links.csv
- Purchase/co-purchase list for all products, generated from amazon-meta.txt and used for network analysis.
products_data.csv
- Contains all product information.
Supplementary Materials & Visualizations folders include:
- Network Analysis plots.
- WordClouds - Reviews, Sentiments, Topics.
- Sentiment Analysis plots.
- LDA topics.