Exercises and assignments on text mining and analytics using Python with nltk, gensim, and sklearn
Donald Trump Tweets - Exploratory data analysis and sentiment analysis
I downloaded President Donald Trump's tweets from 2015 to 2019 from http://www.trumptwitterarchive.com/
The purpose of the analysis was to find out how the handles he mentioned most, the words he used most frequently, and the sentiment of his tweets changed between the 2 years before he became president and the 2 years after.
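
The before/after comparison of top handles described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the project: the `(date, text)` row layout, the function names, and the sample tweets are all invented for the example, and the cutoff is the inauguration date of January 20, 2017.

```python
import re
from collections import Counter
from datetime import date

# Assumption: a handle is "@" followed by word characters.
HANDLE_RE = re.compile(r"@(\w+)")
INAUGURATION = date(2017, 1, 20)


def top_handles(tweets, n=5):
    """Count @handles across tweet strings, ignoring case."""
    counts = Counter()
    for text in tweets:
        counts.update(handle.lower() for handle in HANDLE_RE.findall(text))
    return counts.most_common(n)


def handles_before_after(rows, cutoff=INAUGURATION, n=5):
    """Split (date, text) rows at the cutoff and rank handles on each side."""
    before = [text for day, text in rows if day < cutoff]
    after = [text for day, text in rows if day >= cutoff]
    return top_handles(before, n), top_handles(after, n)


# Invented sample rows, only to show the expected input shape.
sample_rows = [
    (date(2016, 3, 1), "Thanks @foxandfriends and @CNN for the interview"),
    (date(2016, 5, 9), "Watching @FoxAndFriends now"),
    (date(2018, 2, 2), "Great meeting, thank you @WhiteHouse"),
]
before, after = handles_before_after(sample_rows)
print(before)
print(after)
```

The same split can feed a word-frequency count or a sentiment scorer in place of `top_handles`.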
US presidential speeches - Exploratory data analysis and sentiment analysis
Using NLTK's 'inaugural' corpus, I collected and pre-processed all the presidential speeches, grouped them, and drew insights based on chronology and the political parties these presidents belong to
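
As a rough sketch of the grouping step, the snippet below parses file ids of the form `YYYY-Surname.txt` (the naming used by NLTK's inaugural corpus) and buckets them by party. The `PARTY` lookup is a hypothetical, deliberately partial table written for this example; it is not taken from the project, and a surname alone is ambiguous for some presidents.

```python
from collections import defaultdict

# Hypothetical, partial surname-to-party lookup for illustration only.
PARTY = {
    "Clinton": "Democratic",
    "Bush": "Republican",
    "Obama": "Democratic",
    "Trump": "Republican",
}


def parse_fileid(fileid):
    """Split an id such as '2009-Obama.txt' into (2009, 'Obama')."""
    stem = fileid.rsplit(".", 1)[0]
    year, name = stem.split("-", 1)
    return int(year), name


def group_by_party(fileids, party=PARTY):
    """Group (year, surname) pairs by party, sorted chronologically."""
    groups = defaultdict(list)
    for fileid in fileids:
        year, name = parse_fileid(fileid)
        groups[party.get(name, "Unknown")].append((year, name))
    return {label: sorted(items) for label, items in groups.items()}


sample_ids = ["1993-Clinton.txt", "2001-Bush.txt", "2009-Obama.txt", "2017-Trump.txt"]
grouped = group_by_party(sample_ids)
print(grouped)
```

With the corpus downloaded, `nltk.corpus.inaugural.fileids()` could be passed in place of `sample_ids`.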
BBC News articles - Search/retrieval keyword extraction applying LSA and LDA using nltk
The dataset contains 2225 BBC news articles, pre-classified and labelled into 5 categories: Tech, Sports, Politics, Business and Entertainment.
I did not classify these articles, as many people do. Instead, I applied the LDA and LSI (LSA) algorithms, using a combination of CountVectorizer and TfidfVectorizer, to get the top 5 search keywords for each article separately, choosing the optimum number of topics per article.