Diachronic Twitter Sentiment Analysis

Natasha Kamtekar | nak142 | 4/24/2020

A link to the project guestbook can be found here

Project Description

This project looks at Twitter data from the Internet Archive from 2011 and 2019. It compares tweeting habits at these two points in time, including content, lexical complexity, tweet length, and tweet sentiment. It is guided by the following questions:

How is the popular content of each era perceived or talked about?
Have tweets become a medium for displaying more complex sentiment than they used to be?
How does this complexity relate to overall sentiment?
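
To make the tweet length and lexical complexity comparisons concrete, here is a minimal sketch of how such measures could be computed for two sets of tweets. The type-token ratio is only one common proxy for lexical complexity and is an assumption here, not necessarily the measure used in the notebooks. The function names and example tweets are hypothetical.

```python
import re
from statistics import mean

# Simple word pattern; an assumption, not the project's tokenizer.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return TOKEN_RE.findall(text.lower())


def type_token_ratio(text):
    """Unique tokens divided by total tokens; 0.0 for text with no tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def summarize(tweets):
    """Mean character length and mean type-token ratio for a non-empty list of tweets."""
    return {
        "mean_length": mean(len(t) for t in tweets),
        "mean_ttr": mean(type_token_ratio(t) for t in tweets),
    }


# Made-up tweets for illustration only, not taken from the datasets.
tweets_2011 = ["good good day", "so so tired today"]
tweets_2019 = ["what a strange wonderful week", "coffee first then everything else"]

print("2011:", summarize(tweets_2011))
print("2019:", summarize(tweets_2019))
```

Because the type-token ratio is sensitive to text length, comparing the two years fairly would likely require controlling for tweet length as well.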

Project Data

The main data used in this project is a sample of the 2011 and 2019 Internet Archive JSON files: the top 1% of tweets were scraped from October 2011 and September 2019. A classifier was also built for the sentiment analysis portion of the project, using the open-source data from Sentiment140, a pre-existing algorithm for sentiment analysis. The data used in the project can be found here; the classifier data can be found on the Sentiment140 website linked above.
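
As a rough illustration of how archive JSON could be reduced to the fields needed for analysis, here is a minimal sketch. It assumes one JSON object per line and the field names `id_str`, `created_at`, `lang`, and `text`; both the layout and the chosen fields are assumptions, and `parse_tweets` is a hypothetical helper rather than the code in the data_parsing notebook.

```python
import json

# Assumed field names; the notebook may keep a different set of columns.
KEEP_FIELDS = ("id_str", "created_at", "lang", "text")


def parse_tweets(lines, fields=KEEP_FIELDS):
    """Return one dict per tweet, keeping only the requested fields.

    Blank lines, lines that are not valid JSON, and records without a
    "text" field (for example deletion notices) are skipped.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict) or "text" not in record:
            continue
        rows.append({field: record.get(field) for field in fields})
    return rows


# Made-up sample lines for illustration only.
sample = [
    '{"id_str": "1", "created_at": "Sat Oct 01 00:00:00 +0000 2011", '
    '"lang": "en", "text": "hello", "source": "web"}',
    '{"delete": {"status": {"id_str": "2"}}}',
    "not json",
]

rows = parse_tweets(sample)
print(rows)
```

The same function can be passed an open file handle, since iterating over a text file yields one line at a time.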

Important documents
Folders
- Data sample
  - 2011 dataset
  - 2019 dataset
- Notebooks
  - build_classifier: builds the classifier for sentiment analysis
  - data_parsing: strips the tweet data down to the necessary columns
  - data_analysis and classifyanalysis: different stages of the analysis process
  - finalnb: the final Jupyter notebook
- Images
  - Contains images of all plots
