InsTAP

Purpose of the project

The project runs sentiment analysis on comments posted by Instagram users to gauge which famous people are more or less loved by the Internet.

Data Pipeline

The components of the pipeline are listed below:

Instap Producer: retrieves data from Instagram using the Instaloader package and sends it to Logstash.
Logstash: receives the data from the producer and writes it to Kafka's Instap topic.
Kafka: message broker that connects Logstash to the Spark processing component.
Spark: receives data from Kafka and performs machine learning prediction.
Elasticsearch: indexes the incoming data.
Kibana: UI dedicated to data visualization.
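
As an illustration of the first hop in the list above (producer to Logstash), here is a minimal sketch of how a comment record could be shipped as newline-delimited JSON over TCP. It assumes Logstash is configured with a TCP input using a JSON-lines codec; the host, port, and field names are hypothetical and are not taken from this repository.

```python
import json
import socket
from typing import Iterable


def encode_event(comment: dict) -> bytes:
    """Serialize one comment as a single newline-terminated JSON line."""
    return (json.dumps(comment, ensure_ascii=False) + "\n").encode("utf-8")


def send_events(host: str, port: int, comments: Iterable[dict]) -> int:
    """Send comments to a TCP listener and return how many were sent."""
    sent = 0
    with socket.create_connection((host, port), timeout=10) as conn:
        for comment in comments:
            conn.sendall(encode_event(comment))
            sent += 1
    return sent


if __name__ == "__main__":
    # Hypothetical record; the real producer's field names may differ.
    sample = {"user": "some_celebrity", "comment": "Great photo!"}
    print(encode_event(sample))
    # With a listener running, one might call (host and port are assumptions):
    # send_events("logstash", 5000, [sample])
```

One JSON object per line keeps each comment self-delimiting, so the receiver can split the stream on newlines without any extra framing.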

More technical details are in each component's folder; details on how the components are used in this project are in doc.

Requirements

Docker (Desktop on Windows)
Docker Compose
Instagram Account credentials

Usage

Clone the project repository:

git clone https://github.com/rosarioamantia/insTAP

Move to the producer folder and edit the producer.env file with your Instagram user credentials, the users, and the number of posts and comments you want to see.
Download spark-3.1.2-bin-hadoop2.7 into the spark/setup folder.
From the repository root (called insTAP), start all the Docker containers:

docker-compose up

Now, the producer will generate data.
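
What the producer fetches depends on the values set in producer.env. As a rough sketch of how such settings might be read from the environment inside the container, consider the following. Every variable name here (INSTA_USER, INSTA_PASS, TARGET_USERS, POSTS_LIMIT, COMMENTS_LIMIT) and every default is a hypothetical placeholder; check producer.env for the names the project actually uses.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass
class ProducerConfig:
    username: str
    password: str
    target_users: List[str]
    posts_limit: int
    comments_limit: int


def load_config(env: Mapping[str, str] = os.environ) -> ProducerConfig:
    """Build a config object from environment-style key/value pairs."""
    # All variable names below are hypothetical placeholders.
    raw_users = env.get("TARGET_USERS", "")
    users = [u.strip() for u in raw_users.split(",") if u.strip()]
    return ProducerConfig(
        username=env["INSTA_USER"],  # required: raises KeyError if missing
        password=env["INSTA_PASS"],  # required: raises KeyError if missing
        target_users=users,
        posts_limit=int(env.get("POSTS_LIMIT", "10")),
        comments_limit=int(env.get("COMMENTS_LIMIT", "20")),
    )
```

Failing fast on missing credentials, while defaulting the numeric limits, is one possible design choice; it surfaces a misconfigured file at startup rather than at the first Instagram request.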
Go to:

localhost:5601

and import the visualizations located in kibana/export.ndjson via the left hamburger menu > Management > Stack Management > Saved Objects > Import.
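
Kibana may not answer immediately after the containers start. A minimal sketch for polling localhost:5601 until it responds, before opening the browser, could look like this. The retry count and delay are arbitrary assumptions, and the helper names are illustrative only.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30, delay: float = 2.0) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    ready = wait_until_ready(lambda: http_ok("http://localhost:5601"))
    print("Kibana is reachable" if ready else "Kibana did not respond in time")
```

Passing the probe in as a callable keeps the retry loop independent of HTTP, so the same helper could wait on any other service in the stack.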

