#RTwUP

##Realtime Twitter Users' Profiles

Given a suitably filtered stream of documents returned by a Twitter query, RTwUP calculates and shows real-time statistics: the number of distinct users (each user counted once) per hour, day, and month.
Besides counting, the system stores user profiles in a repository, adding a so-called "snapshot" only if some criterion is met (e.g. a changed profile image URL or description, or an increase or decrease of more than 1% in the number of friends or followers).
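The example criteria above can be sketched as a small predicate. This is only an illustration, not the project's actual code: `ProfileSnapshot`, `SnapshotPolicy`, and their fields are hypothetical names, and treating a never-seen user as always worth a snapshot is an assumption.

```java
import java.util.Objects;

/** Hypothetical, trimmed-down view of a stored user profile. */
final class ProfileSnapshot {
    final String profileImageUrl;
    final String description;
    final int friendsCount;
    final int followersCount;

    ProfileSnapshot(String profileImageUrl, String description,
                    int friendsCount, int followersCount) {
        this.profileImageUrl = profileImageUrl;
        this.description = description;
        this.friendsCount = friendsCount;
        this.followersCount = followersCount;
    }
}

/** Hypothetical policy deciding whether a new snapshot should be stored. */
final class SnapshotPolicy {

    /** Returns true when the current profile differs enough from the last stored one. */
    static boolean shouldSnapshot(ProfileSnapshot last, ProfileSnapshot current) {
        if (last == null) {
            return true; // assumption: the first sighting of a user is always stored
        }
        return !Objects.equals(last.profileImageUrl, current.profileImageUrl)
            || !Objects.equals(last.description, current.description)
            || changedByMoreThanOnePercent(last.friendsCount, current.friendsCount)
            || changedByMoreThanOnePercent(last.followersCount, current.followersCount);
    }

    /** Integer arithmetic avoids rounding issues: |after - before| / before > 1/100. */
    static boolean changedByMoreThanOnePercent(int before, int after) {
        if (before == 0) {
            return after != 0; // assumption: any change from zero counts
        }
        return Math.abs((long) after - before) * 100 > before;
    }
}
```

With these definitions, a move from 1000 to 1010 friends (exactly 1%) does not trigger a snapshot, while 1000 to 1011 does.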

##Data Stream Description and Requirements

The system uses Twitter APIs (for instance Twitter4j or Hosebird) to perform queries and retrieve Tweets, then filters them (e.g. by specified coordinates or keywords).
If the URL contained in the user profile is in a shortened form of some kind, it is expanded to its original form (repeating the expansion as many times as required).
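The repeated expansion could look roughly like the following sketch. It is not taken from the RTwUP sources: `UrlExpander`, `expand`, `httpNextHop`, and the hop limit are hypothetical names and choices. The lookup of a single redirect is passed in as a function so the loop can be exercised without network access; `httpNextHop` shows one possible lookup using a `HEAD` request with automatic redirects disabled, and it assumes the `Location` header holds an absolute URL.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper that follows a chain of shortened URLs. */
final class UrlExpander {

    /**
     * Follows redirects until no further target is reported, a URL repeats
     * (redirect loop), or maxHops is reached. Returns the last URL seen.
     */
    static String expand(String url,
                         Function<String, Optional<String>> nextHop,
                         int maxHops) {
        Set<String> seen = new HashSet<>();
        String current = url;
        for (int hop = 0; hop < maxHops && seen.add(current); hop++) {
            Optional<String> target = nextHop.apply(current);
            if (!target.isPresent()) {
                break; // not a redirect: current is the expanded form
            }
            current = target.get();
        }
        return current;
    }

    /** One possible lookup: ask the server for a single redirect without following it. */
    static Optional<String> httpNextHop(String url) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("HEAD");
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            boolean redirect = status >= 300 && status < 400 && location != null;
            return redirect ? Optional.of(location) : Optional.empty();
        } catch (IOException | IllegalArgumentException e) {
            return Optional.empty(); // assumption: treat unreachable or malformed URLs as final
        }
    }
}
```

A call such as `UrlExpander.expand(profileUrl, UrlExpander::httpNextHop, 5)` would then resolve a shortener that points to another shortener, stopping after at most five hops.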

This must be done in real time, using Storm.

##Adopted Technologies

RTwUP is developed in Java.
Twitter4j, and its Twitter Streaming API support in particular, was chosen to listen to Twitter's stream.
Storm was chosen to process the Tweets in real time.
The user interface is written in JavaScript as a Node.js application, which uses socket.io and Redis to display statistics in real time.
Retrieved Twitter user profiles are persisted in a repository based on Elasticsearch. For more information, refer to the wiki pages.

##Wiki