TweeToxicity is a program that analyzes user profiles or hashtags based on recent tweets. It uses machine learning to assign Twitter users a score based on their tweets or retweets. This program is meant for educational purposes and was created with no ill intentions.
Client:
Server:
Dataset used for training: Sentiment140
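As a rough illustration of what working with this dataset involves, the sketch below reads rows in the Sentiment140 CSV layout (six columns, no header row, polarity 0 for negative and 4 for positive) and turns them into text and label pairs. The function name `load_sentiment140` and the mapping of polarity 4 to label 1 are assumptions made for this example; the README does not say how the project itself loads or labels the data.

```python
import csv
import io

# Column order of the Sentiment140 CSV (the file has no header row).
COLUMNS = ["target", "id", "date", "flag", "user", "text"]


def load_sentiment140(handle):
    """Yield (text, label) pairs, with label 0 = negative and 1 = positive.

    Hypothetical helper for illustration; not taken from the project code.
    """
    for row in csv.DictReader(handle, fieldnames=COLUMNS):
        polarity = int(row["target"])
        if polarity == 2:  # neutral rows are skipped in this binary setup
            continue
        yield row["text"], 1 if polarity == 4 else 0


# Two made-up rows in the same layout, standing in for the real file.
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","feeling awful today"\n'
    '"4","2","Mon Apr 06 22:19:49 PDT 2009","NO_QUERY","user_b","what a great morning"\n'
)
pairs = list(load_sentiment140(sample))
print(pairs)
```

With the real download, the same function can be given a file opened with `encoding="latin-1"` and `newline=""`, since the published CSV is not UTF-8.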
Model | Settings | Training Accuracy | Training F1 Score | Testing Accuracy | Testing F1 Score |
---|---|---|---|---|---|
Logistic Regression | Count Vect. & Lemmatization Used | 79.63% | 0.8008 | 78.71% | 0.7929 |
Logistic Regression | TF-IDF & Lemmatization Used | 80.31% | 0.8061 | 78.89% | 0.7930 |
Multinomial Naive Bayes | Count Vect. & Lemmatization Used | 78.48% | 0.7838 | 78.71% | 0.7756 |
Multinomial Naive Bayes | TF-IDF & Lemmatization Used | 79.81% | 0.7961 | 76.64% | 0.7664 |
Bernoulli Naive Bayes | Count Vect. & Lemmatization Used | 78.53% | 0.7875 | 77.71% | 0.7803 |
Bernoulli Naive Bayes | TF-IDF & Lemmatization Used | 80.15% | 0.8052 | 77.68% | 0.7791 |
Decision Tree Classifier | Count Vect. & Lemmatization Used, Decision Tree Parameters: {max_depth=50} | 72.91% | 0.7630 | 69.45% | 0.7334 |
Decision Tree Classifier | TF-IDF & Lemmatization Used, Decision Tree Parameters: {max_depth=50} | 73.66% | 0.7667 | 69.13% | 0.7297 |
Linear Support Vector Machine | Count Vect. & Lemmatization Used | 82.92% | 0.8309 | 78.38% | 0.7867 |
Linear Support Vector Machine | TF-IDF & Lemmatization Used | 80.36% | 0.8051 | 78.28% | 0.7733 |
Random Forest Classifier | Count Vect. & Lemmatization Used, Random Forest Parameters: {max_depth=25} | 74.65% | 0.7615 | 74.04% | 0.7566 |
Random Forest Classifier | TF-IDF & Lemmatization Used, Random Forest Parameters: {max_depth=25} | 74.80% | 0.7619 | 74.00% | 0.7553 |
Gradient Boosting Classifier | TF-IDF {min_df=5} & Lemmatization Used, Gradient Boosting Parameters: {lr=1.25, n=100, depth=25} | 74.80% | 0.7619 | 74.00% | 0.7553 |
Gradient Boosting Classifier | TF-IDF {min_df=5} & Lemmatization Used, Gradient Boosting Parameters: {lr=1.25, n=100, depth=25} | 85.99% | 0.8626 | 77.49% | 0.7791 |
XGBoost Classifier | Count Vect. & Lemmatization Used | 75.29% | 0.7661 | 75.21% | 0.7662 |
XGBoost Classifier | TF-IDF & Lemmatization Used | 75.39% | 0.7683 | 75.14% | 0.7667 |
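To make the two metrics in the table concrete, here is a minimal sketch of how accuracy and a binary F1 score can be computed from true and predicted labels. It assumes the positive class is labeled 1 and uses made-up labels; the README does not state how the reported numbers were produced, so treat this as an illustration of the definitions rather than the project's evaluation code.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; F1 is for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Made-up labels for illustration only (not taken from the project).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2%} f1={f1:.4f}")
```

The table reports accuracy as a percentage and F1 as a fraction between 0 and 1, which is why the two columns are on different scales.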
Clone the project
git clone https://github.com/pri1311/TweeToxicity
Install dependencies in the server folder.
cd server
python -m venv env
source env/bin/activate
pip install -r requirements.txt
Generate environment variables and fill in the values.
cp .env.example .env
Your .env is ignored by git, as you can see in .gitignore, so it is safe to store your keys there.
Starting Development Server
python server.py
Install dependencies in the client folder.
cd ../client # If you are in ./server
npm i
Starting Client
npm start
At the end of this, you should have:
- server running at http://127.0.0.1:5002/
- client running at http://localhost:3000/
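One way to confirm both processes are up is a small script that tries to open each URL. The sketch below uses only the Python standard library; the helper names `is_reachable` and `check_all` are invented for this example, and it only checks that something answers HTTP at the address, not that the application behaves correctly.

```python
import urllib.error
import urllib.request

# URLs taken from the setup steps above.
SERVICES = {
    "server": "http://127.0.0.1:5002/",
    "client": "http://localhost:3000/",
}


def is_reachable(url, timeout=2.0):
    """Return True if anything answers HTTP at `url`, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The process answered, just not with a 2xx status (e.g. 404 on "/").
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def check_all(services=SERVICES):
    """Map each service name to True (reachable) or False (not reachable)."""
    return {name: is_reachable(url) for name, url in services.items()}
```

Calling `check_all()` after both start commands should return `True` for each entry; a `False` usually means the process is not running or is listening on a different port.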
To run this project, you will need to add the following environment variables to your .env file:
API_KEY : Twitter API/Consumer Key
API_KEY_SECRET : Twitter API/Consumer Secret
BEARER_TOKEN : Twitter Bearer Token
ACCESS_TOKEN : Twitter Access Token
ACCESS_TOKEN_SECRET : Twitter Access Token Secret
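Before starting the server, it can help to check that none of these values were left blank. The sketch below is a standard-library illustration that assumes the .env file uses plain `KEY=value` lines; the helpers `parse_env_file` and `missing_vars` are invented for this example, and the server itself may load the file differently (for instance through a dotenv-style library).

```python
# Variable names listed in this README.
REQUIRED_VARS = [
    "API_KEY",
    "API_KEY_SECRET",
    "BEARER_TOKEN",
    "ACCESS_TOKEN",
    "ACCESS_TOKEN_SECRET",
]


def parse_env_file(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values):
    """Return the required names that are absent or empty."""
    return [name for name in REQUIRED_VARS if not values.get(name)]


# Made-up file contents for illustration; real keys come from Twitter.
example = "API_KEY=abc123\nAPI_KEY_SECRET=\n# a comment\nBEARER_TOKEN='tok'\n"
missing = missing_vars(parse_env_file(example))
print(missing)
```

To check a real file, pass the result of reading `.env` (for example `open(".env").read()`) to `parse_env_file` and confirm that `missing_vars` returns an empty list.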