Reviews from three companies (IMDb, Yelp and Amazon) are used to train models and predict their labels (good or bad). Models are trained and tested with different combinations of data (as explained in Project_Report.pdf) to see whether the performances of models are company specific or not.

nalinda05kl/IMDb_Yelp_Amazon_review_analysis

Effects of Domain Specificity of Customer Reviews on Model Performance

Please see the following information about the notebooks included in the capstone project:

This project was done to find out how a selected model's performance depends on reviews from different service providers (Yelp, Amazon and IMDb). I wanted to find out whether a selected model's performance depends on the company when classifying good and bad reviews, because people share certain common habits when writing a good or a bad comment, regardless of the company they refer to. The main objective of this project is to find out how reliable a model is when it is 1) trained on mixed reviews and 2) trained on company-specific reviews. More details can be found in the Project_Report.pdf file.
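The comparison between the two training regimes can be sketched as follows. This is a minimal, dependency-free illustration and not the code from the notebooks: the toy reviews, the whitespace tokenizer, the hand-written multinomial naive Bayes and the names `train_nb`, `predict_nb` and `compare_regimes` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def train_nb(samples, alpha=1.0):
    """Fit multinomial naive Bayes on (text, label) pairs with add-alpha smoothing."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {tok for counts in word_counts.values() for tok in counts}
    return class_counts, word_counts, vocab, alpha


def predict_nb(model, text):
    """Return the label with the highest log posterior for one review."""
    class_counts, word_counts, vocab, alpha = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        score = math.log(n_label / n_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, samples):
    """Fraction of (text, label) pairs that the model labels correctly."""
    hits = sum(predict_nb(model, text) == label for text, label in samples)
    return hits / len(samples)


def compare_regimes(reviews, target):
    """Test on one company after company-specific and after mixed training."""
    specific = train_nb(reviews[target]["train"])
    combined = train_nb([s for split in reviews.values() for s in split["train"]])
    test = reviews[target]["test"]
    return {"specific": accuracy(specific, test),
            "combined": accuracy(combined, test)}


# Invented toy reviews; the real project reads the .txt data files instead.
REVIEWS = {
    "amazon": {"train": [("great phone works well", "good"),
                         ("terrible battery broke fast", "bad")],
               "test": [("works great", "good"), ("battery broke", "bad")]},
    "yelp": {"train": [("delicious food friendly staff", "good"),
                       ("cold food rude staff", "bad")],
             "test": [("friendly and delicious", "good"), ("rude and cold", "bad")]},
    "imdb": {"train": [("wonderful acting great story", "good"),
                       ("boring plot terrible acting", "bad")],
             "test": [("wonderful story", "good"), ("boring plot", "bad")]},
}

for company in REVIEWS:
    print(company, compare_regimes(REVIEWS, company))
```

If the two accuracies for a company are close, the mixed model generalizes to that company about as well as the company-specific one; a large gap would suggest the model's performance is company specific.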

Basic information about some of the code used in this analysis is given below. [For more information, please send me a message: nalinda05kl@gmail.com]

  1. Customer_Review_Classification_using_ML.ipynb This notebook shows all the main steps used in this analysis, including pre-processing and modeling.

  2. DataVisual.ipynb This includes pre-processing and visualization of raw data.

  3. NivGuess_MulNivBays_SGD.ipynb This notebook shows training and testing of MultinomialNB and SGD classifiers.

  4. CNN_embed.ipynb This notebook shows the training and testing of different CNN architectures with hidden layers.

  5. MNNB.ipynb This notebook shows how the parameters of MultinomialNB were optimized to get the highest accuracy.

  6. All the notebooks below show the calculation of accuracy and the confusion matrix, based on the MultinomialNB classifier, for different training and testing combinations: Train_Combine_Test_Amazon.ipynb, Train_Combine_Test_Yelp.ipynb, Train_Combine_Test_imdb.ipynb, Train_Test_Amazon.ipynb, Train_Test_Imdb.ipynb, Train_Test_Yelp.ipynb

  7. Please make sure that all the .txt data files are in the same directory as the above notebooks before running them.

  8. CNN_with__FreeForm_visualization.ipynb I added this notebook to provide the free-form visualization section that was needed after the first review of the project.
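The accuracy and confusion matrix calculations mentioned in item 6 can be illustrated with a short, dependency-free sketch. The label lists here are made up for the example, and the helper names are hypothetical; the notebooks themselves may compute these values differently (for instance with a library routine).

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels=("bad", "good")):
    """Rows are true labels, columns are predicted labels, both in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for five reviews, for illustration only.
y_true = ["good", "good", "bad", "bad", "good"]
y_pred = ["good", "bad", "bad", "good", "good"]

print(confusion_matrix(y_true, y_pred))  # [[1, 1], [1, 2]]
print(accuracy_score(y_true, y_pred))    # 0.6
```

The diagonal of the matrix counts correctly classified reviews, so accuracy equals the sum of the diagonal divided by the total number of reviews.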

Thank you. Nalinda Kulathunga.
