This repository contains a set of components for collecting data from social media:
- iceberg-config: Configuration files for Kafka Connect, Nessie, and Iceberg.
- youtube-collector: A pipeline that, given a Video_ID, collects metadata, comments, thumbnails, transcriptions, and search results for YouTube videos.
- news-collector: A pipeline that collects news articles.