This project was created during Spice Academy courses.
Below is the architecture of the pipeline.
It is composed of 5 containers:
- Tweet collector: collects tweets and stores them in MongoDB. Uses the Twitter API, tweepy and pymongo.
- ETL: reads tweets from MongoDB, cleans the data, calculates a compound sentiment score (using VADER) and stores the result in PSQL using SQLAlchemy.
- Slack bot: posts the most positive tweets to a defined Slack channel.
- MongoDB container
- PSQL container
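
As a rough illustration of the ETL step described above, here is a minimal sketch. It assumes that PSQL means PostgreSQL, that each MongoDB document has a `text` field, and that "cleaning" means removing URLs and @mentions. The database, collection, table, host names and credentials (`twitter`, `tweets`, `mongodb`, `postgresdb`) are hypothetical placeholders, not the names used in this project.

```python
import re


def clean_tweet(text):
    """Strip URLs and @mentions, then collapse whitespace (assumed cleaning rules)."""
    text = re.sub(r"https?://\S+", "", text)  # drop URLs
    text = re.sub(r"@\w+", "", text)          # drop @mentions
    return re.sub(r"\s+", " ", text).strip()


def transform(tweet, score_fn):
    """Build the row to store from a raw tweet document."""
    cleaned = clean_tweet(tweet["text"])
    return {"text": cleaned, "sentiment": score_fn(cleaned)}


def run_etl(mongo_host="mongodb",
            pg_url="postgresql://postgres:postgres@postgresdb:5432/postgres"):
    """Copy all tweets from MongoDB to PostgreSQL with a VADER compound score."""
    # Third-party imports stay local so the helpers above work without them.
    import pymongo
    from sqlalchemy import create_engine, text
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()

    def score(sentence):
        # "compound" is VADER's normalized score between -1 and 1.
        return analyzer.polarity_scores(sentence)["compound"]

    client = pymongo.MongoClient(host=mongo_host, port=27017)
    engine = create_engine(pg_url)
    with engine.begin() as conn:  # commits on success, rolls back on error
        conn.execute(text(
            "CREATE TABLE IF NOT EXISTS tweets (tweet_text TEXT, sentiment NUMERIC)"
        ))
        for tweet in client.twitter.tweets.find():
            conn.execute(
                text("INSERT INTO tweets (tweet_text, sentiment) "
                     "VALUES (:text, :sentiment)"),
                transform(tweet, score),
            )
```

Calling `run_etl()` inside the ETL container would process every stored tweet once. The scoring function is passed into `transform` so the cleaning logic can be tried without VADER or a database.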

To run the project:
- Create Twitter API credentials and add them to a file called config.py
- Create a webhook for the Slack channel and add it to config3.py (this part is optional; the program can also run without the slack_bot_1 container)
- Run 'docker-compose build' from the main directory (where the docker-compose file is) to build the images for the first time.
- Run 'docker-compose up' to start all containers defined in the yml file.
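
For the optional Slack part, the following sketch shows one way a webhook URL could be used to post the most positive tweets. It relies only on the standard library and on the fact that Slack incoming webhooks accept a JSON body with a `text` field. The variable name `WEBHOOK_URL`, the row layout and the helper names are assumptions for illustration; the actual contents of the Slack config file and the slack_bot_1 code may differ.

```python
import json
import urllib.request

# Hypothetical content of the Slack config file; the real variable name may differ.
WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"


def top_positive(rows, n=1):
    """Return the n rows with the highest sentiment score."""
    return sorted(rows, key=lambda row: row["sentiment"], reverse=True)[:n]


def build_payload(row):
    """Format one scored tweet as a Slack incoming-webhook message."""
    return {"text": f"{row['text']} (sentiment: {row['sentiment']:.2f})"}


def post_to_slack(webhook_url, payload):
    """POST the payload as JSON to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A bot loop could then call `post_to_slack(WEBHOOK_URL, build_payload(row))` for each row returned by `top_positive`, with the rows read from the PSQL table.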