Predicting the popularity of Reddit posts, using NLP techniques

Siddhantmest/Reddit-Post-Popularity

Reddit Post Popularity

The aim of this group project is to predict the popularity of Reddit posts.

Steps followed

1. Import the required libraries and load the JSON file into a dataframe
2. Clean the data, preprocess the text, and engineer features
3. Translate non-English post titles with Google Translate, then perform sentiment analysis on them
4. Analyze correlations in the cleaned data to derive insights
5. Train a model on the training data and evaluate its performance on the test data
6. Verify the results against the insights from step 4
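
As a rough illustration of steps 1, 2, and 4, the sketch below loads JSON records, derives a few title features, and correlates each feature with the post score. It uses only the Python standard library so that it runs anywhere; the notebooks themselves work with a dataframe. The field names `title` and `score`, the sample records, and the helper names `add_title_features` and `pearson` are assumptions made for this example, not taken from the project's dataset or code.

```python
import json
import math

# Hypothetical sample records; the real dataset's schema is not described in this README.
RAW_JSON = """
[
  {"title": "Cat learns to open doors", "score": 120},
  {"title": "Why is the sky blue?", "score": 45},
  {"title": "My first marathon: what I learned after months of training", "score": 300},
  {"title": "Help", "score": 3}
]
"""


def add_title_features(post):
    """Return a copy of the post with simple engineered title features."""
    title = post["title"].strip()
    return {
        **post,
        "title_chars": len(title),
        "title_words": len(title.split()),
        "is_question": int(title.endswith("?")),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


# Step 1: load the JSON; step 2: engineer features.
posts = [add_title_features(p) for p in json.loads(RAW_JSON)]

# Step 4: correlate each engineered feature with the target.
scores = [p["score"] for p in posts]
correlations = {
    name: pearson([p[name] for p in posts], scores)
    for name in ("title_chars", "title_words", "is_question")
}
```

Features whose correlation with the score is close to zero are candidates to drop before modeling, while strongly correlated ones give the insights that step 6 checks the model against.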

Reddit Post Popularity.ipynb cleans the data, and Reddit Post Popularity - modeling.ipynb trains and evaluates a model on the cleaned data.
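
The modeling stage can be pictured with the minimal sketch below: split the cleaned rows into training and test sets, fit a model on the training set only, and score it on the held-out rows. This README does not say which model or metric the modeling notebook uses, so the single-feature least-squares line and the mean absolute error here are stand-ins chosen for brevity. The data is synthetic, and the helper names `fit_line` and `mean_absolute_error` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # constant feature: predict the mean
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return slope, mean_y - slope * mean_x


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic cleaned rows: (title word count, post score).
# The relationship is made exactly linear so the result is easy to check.
rows = [(words, 20 * words + 5) for words in range(1, 21)]

# Shuffle with a fixed seed, then hold out 20% of the rows for testing.
rng = random.Random(0)
rng.shuffle(rows)
split = int(0.8 * len(rows))
train, test = rows[:split], rows[split:]

# Fit on the training rows only, then evaluate on the unseen test rows.
slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
predictions = [slope * x + intercept for x, _ in test]
test_mae = mean_absolute_error([y for _, y in test], predictions)
```

Keeping the test rows out of the fitting step is what makes the reported error an estimate of performance on unseen posts rather than a measure of how well the model memorized its training data.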
