Merge pull request #385 from aindree-2005/tweets
Twitter Sentiment Analysis using NLP
abhisheks008 authored Dec 16, 2023
2 parents e7ed2f5 + 2f63236 commit b7461c6
Showing 13 changed files with 69 additions and 0 deletions.
2 changes: 2 additions & 0 deletions Twitter Sentiment Analysis NLP/Dataset/Readme.md
@@ -0,0 +1,2 @@
https://www.kaggle.com/datasets/kazanova/sentiment140
Dataset
1 change: 1 addition & 0 deletions Twitter Sentiment Analysis NLP/Images/Readme.md
@@ -0,0 +1 @@
EDA performed with line plots, word clouds, and a confusion matrix
6 binary image files added under Twitter Sentiment Analysis NLP/Images/ (not rendered).
57 changes: 57 additions & 0 deletions Twitter Sentiment Analysis NLP/Models/Readme.Md
@@ -0,0 +1,57 @@
# Twitter Sentiment Analysis NLP

## PROJECT TITLE

Twitter Sentiment Analysis NLP

## GOAL

The main goal of this project is to analyse people's tweets using an LSTM and a Keras Sequential model.

## DATASET

https://www.kaggle.com/datasets/kazanova/sentiment140

## DESCRIPTION

This project performs sentiment analysis on tweets posted by various people and classifies them as positive or negative.

## WHAT I HAD DONE

1. Used NLTK to preprocess and clean the text (stemming, lemmatization, symbol removal, etc.)
2. Created a Sequential model with Keras, adding weight initializers and regularizers
3. Used GloVe embeddings in a separate notebook
4. Created an LSTM model with Conv1D, SpatialDropout1D, Dense, and other layers
5. Evaluated the models with a confusion matrix
6. Used BERT for classification
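
The cleaning part of step 1 can be sketched with a minimal, standard-library function (the regexes and the example tweet are illustrative assumptions; the project additionally applies NLTK stemming and lemmatization on top of cleaning like this):

```python
import re

def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, @mentions, and non-letter symbols."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # drop links and mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # drop digits and punctuation
    return " ".join(text.split())                   # collapse whitespace

print(clean_tweet("@user LOVED the movie!!! http://t.co/xyz"))  # loved the movie
```

The cleaned string would then be tokenized and fed to a stemmer or lemmatizer before vectorization.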

## MODELS USED

1. GloVe embeddings with LSTM
2. Sequential Model
3. BERT
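
A hedged sketch of the Conv1D + LSTM architecture described above (the vocabulary size, sequence length, and layer widths are illustrative assumptions, not the repository's actual hyperparameters):

```python
import numpy as np
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import (Conv1D, Dense, Embedding, LSTM,
                                     MaxPooling1D, SpatialDropout1D)

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 10_000, 100, 50  # assumed values

model = Sequential([
    Input(shape=(MAX_LEN,)),
    Embedding(VOCAB_SIZE, EMBED_DIM),  # GloVe weights could be loaded here
    SpatialDropout1D(0.2),             # drops whole embedding channels
    Conv1D(64, 5, activation="relu"),  # local n-gram features
    MaxPooling1D(4),
    LSTM(64),                          # long-range sequential context
    Dense(1, activation="sigmoid"),    # positive vs. negative
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
print(model.predict(np.zeros((2, MAX_LEN)), verbose=0).shape)  # (2, 1)
```

The sigmoid output maps each tweet to a probability of being positive; a 0.5 threshold gives the binary label.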

## LIBRARIES NEEDED
- numpy
- pandas
- scikit-learn
- tensorflow
- keras
- scipy

## VISUALIZATION

![For Sequential Model](<../Images/Screenshot (277).png>) - Keras Sequential model
![For LSTM](<../Images/Screenshot (279).png>) - LSTM model

## EVALUATION METRICS

A confusion matrix was created, and precision, recall, and F1 score were used as evaluation metrics.
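
These metrics can be computed with scikit-learn; the labels below are toy stand-ins for the model's test-set predictions, not results from this project:

```python
from sklearn.metrics import (confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # hypothetical ground-truth sentiment
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # hypothetical model predictions

print(confusion_matrix(y_true, y_pred))  # [[3 1]
                                         #  [1 3]]
print(precision_score(y_true, y_pred))   # 0.75
print(recall_score(y_true, y_pred))      # 0.75
print(f1_score(y_true, y_pred))          # 0.75
```

With one false positive and one false negative out of eight samples, precision, recall, and F1 all come out to 0.75 here.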

## RESULTS

The LSTM achieves higher accuracy (about 78%) than the Keras Sequential model (72%). The highest accuracy, 87%, is achieved by BERT.

## CONCLUSION

Long Short-Term Memory (LSTM) networks are better suited to tweet sentiment analysis than plain Keras Sequential models because they capture contextual dependencies and handle sequential data effectively. Tweets often use short, informal language, which makes it hard for simpler models to discern sentiment accurately. With their memory cells, LSTMs can capture nuances in the temporal structure of a tweet, modelling dependencies between words and phrases, and so grasp sentiment context better than a simple stack of feed-forward layers. BERT goes further still: its Transformer encoders process the entire sentence at once, so the representation of each word is built with awareness of all the words around it. This gives BERT a better understanding of context than sequential recurrent models such as LSTMs or RNNs.

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions Twitter Sentiment Analysis NLP/Models/tweet_lstm.ipynb

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions Twitter Sentiment Analysis NLP/Models/tweetbert.ipynb

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions Twitter Sentiment Analysis NLP/requirements.txt
@@ -0,0 +1,6 @@
numpy
pandas
scikit-learn
tensorflow
keras
scipy
