carlosarcila/autocop_en_distributed

AutoCop in English to run in Spark

This tool runs supervised sentiment analysis in Spark on the Twitter stream. Tweets are filtered by a word or hashtag and classified in real time as positive or negative by models trained with algorithms from MLlib. Kafka and ZooKeeper are used to connect to the Twitter stream. Tweets and their sentiments are stored in MongoDB, a NoSQL database, and can be visualized in real time. All scripts can run on Amazon Web Services for Big Data challenges. Before using any of the scripts, train the models with the training script in the notebook twitter-spark-model-training.ipynb.
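
The filter-then-classify step described above can be illustrated with a minimal, standard-library-only sketch. It is not the repository's code: the function names `matches_keyword`, `classify_stream`, and `toy_predict` are hypothetical, the record layout is an assumption, and `toy_predict` is only a placeholder for a model trained with MLlib. Kafka, Spark, and MongoDB are left out so the logic can be read on its own.

```python
from typing import Callable, Dict, Iterable, Iterator


def matches_keyword(text: str, keyword: str) -> bool:
    """Case-insensitive check that a tweet mentions the tracked word or hashtag."""
    return keyword.lower() in text.lower()


def classify_stream(
    tweets: Iterable[str],
    keyword: str,
    predict: Callable[[str], float],
) -> Iterator[Dict[str, str]]:
    """Yield one storable record per tweet that mentions the keyword.

    `predict` stands in for a trained model. It is assumed to return
    1.0 for positive and 0.0 for negative sentiment.
    """
    for text in tweets:
        if not matches_keyword(text, keyword):
            continue  # drop tweets that do not mention the keyword
        label = "positive" if predict(text) >= 0.5 else "negative"
        # Assumed record shape; the real collection schema may differ.
        yield {"text": text, "sentiment": label}


def toy_predict(text: str) -> float:
    # Placeholder only: NOT the trained MLlib model.
    return 1.0 if "love" in text.lower() else 0.0


if __name__ == "__main__":
    sample = ["I love #spark", "Traffic is awful today", "#Spark keeps crashing"]
    for record in classify_stream(sample, "#spark", toy_predict):
        print(record)
```

In the actual pipeline the input would arrive through Kafka, the prediction would come from the model trained in the notebook, and each record would be written to MongoDB rather than printed.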

To cite this tool:

Arcila, C., Vicente, M., Ortega, F. & Álvarez, M. (2017). Distributed Supervised Sentiment Analysis of Tweets: Integrating Machine Learning and Streaming Analytics for Big Data Challenges in Communication Research. [Technical Report]. Proof of Concept funded by the University of Salamanca Foundation.
