Text-Analytics

Exercises and assignments on text mining and analytics in Python, using nltk, gensim, and sklearn.


Exercise 1:

Donald Trump Tweets - Exploratory data analysis and sentiment analysis

I downloaded President Donald Trump's tweets from 2015 to 2019 from http://www.trumptwitterarchive.com/

The purpose of the analysis was to see how the top handles he tweeted at, the words he used most frequently, and the sentiment of his tweets changed between the 2 years before he became president and the 2 years after.
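The before/after comparison of handles and frequent words can be sketched roughly as below. This is a minimal illustration and not the code from the exercise: the `summarize` helper, the tiny stopword set, and the sample rows are all hypothetical placeholders, the split date assumes the inauguration on 20 January 2017, and sentiment scoring is left out.

```python
import re
from collections import Counter
from datetime import date

# Assumption: tweets are split at the inauguration date.
INAUGURATION = date(2017, 1, 20)
HANDLE_RE = re.compile(r"@(\w+)")
WORD_RE = re.compile(r"[a-z']+")
# Hypothetical, deliberately tiny stopword set; a real run would use a fuller list.
STOPWORDS = {"the", "a", "to", "is", "and", "of", "in", "on", "with", "for"}


def summarize(tweets, top_n=3):
    """Count handles and words separately for tweets before and after the split date."""
    handles = {"before": Counter(), "after": Counter()}
    words = {"before": Counter(), "after": Counter()}
    for posted, text in tweets:
        period = "before" if posted < INAUGURATION else "after"
        lowered = text.lower()
        handles[period].update(HANDLE_RE.findall(lowered))
        # Drop handles and links so they are not counted as ordinary words.
        cleaned = re.sub(r"@\w+|https?://\S+", " ", lowered)
        words[period].update(
            w for w in WORD_RE.findall(cleaned) if w not in STOPWORDS
        )
    return {
        period: {
            "handles": handles[period].most_common(top_n),
            "words": words[period].most_common(top_n),
        }
        for period in ("before", "after")
    }


# Invented placeholder rows (date, text); these are not real tweets.
sample = [
    (date(2016, 3, 1), "Great rally tonight with @handle_a"),
    (date(2016, 5, 9), "Thank you @handle_a and @handle_b"),
    (date(2018, 6, 2), "Great jobs report, thank you @handle_c"),
]
summary = summarize(sample)
```

With the real archive, each row would carry the tweet's date and text, and `summary["before"]` and `summary["after"]` could then be compared side by side.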


Exercise 2:

US presidential speeches - Exploratory data analysis and sentiment analysis

Using the 'inaugural' corpus available in Python through nltk, I collected and pre-processed all the presidential speeches, grouped them, and drew insights based on chronology and the political parties these presidents belong to.
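The grouping step might look something like the sketch below. It assumes the speeches come from the `inaugural` corpus shipped with nltk, whose file IDs follow a `YEAR-Name.txt` pattern. The corpus carries no party labels, so `PARTY_BY_FILEID` is a hypothetical hand-made lookup covering only a few speeches, and `parse_fileid` and `group_by_party` are illustrative helper names.

```python
from collections import defaultdict

# Hypothetical hand-made lookup: party labels are not part of the corpus.
PARTY_BY_FILEID = {
    "1789-Washington.txt": "Unaffiliated",
    "1861-Lincoln.txt": "Republican",
    "2009-Obama.txt": "Democratic",
    "2017-Trump.txt": "Republican",
}


def parse_fileid(fileid):
    """Split an ID such as '1861-Lincoln.txt' into (1861, 'Lincoln')."""
    year, name = fileid.rsplit(".", 1)[0].split("-", 1)
    return int(year), name


def group_by_party(fileids, party_lookup):
    """Return {party: [(year, president), ...]} in chronological order."""
    groups = defaultdict(list)
    for fileid in sorted(fileids):
        year, name = parse_fileid(fileid)
        groups[party_lookup.get(fileid, "Unknown")].append((year, name))
    return dict(groups)


# With the corpus downloaded (nltk.download("inaugural")), the full list of IDs
# could instead come from nltk.corpus.inaugural.fileids().
fileids = list(PARTY_BY_FILEID)
grouped = group_by_party(fileids, PARTY_BY_FILEID)
```

Sorting the file IDs before grouping keeps each party's list in chronological order, which suits comparisons over time.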


Exercise 3:

BBC News articles - Search/retrieval keyword extraction applying LSA and LDA using nltk

There are 2225 BBC news articles, pre-classified and labelled into 5 categories: Tech, Sports, Politics, Business and Entertainment.

I did not classify these articles, as many people do. Instead, I applied the LDA and LSI (also known as LSA) algorithms, using a combination of CountVectorizer and TfidfVectorizer, to get the top 5 search keywords for each article separately, choosing the optimum number of topics per article.