This project builds a data pipeline that scrapes podcast data into a PostgreSQL database managed by Google Cloud SQL. The pipeline, orchestrated with Airflow, also uploads the audio file of each podcast episode to a Google Cloud Storage bucket.
GCP resources are provisioned using Terraform.
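
As a rough illustration of the two pipeline steps described above, the sketch below extracts episode metadata and derives a storage object name for each audio file. It assumes the podcast data comes from an RSS feed, which this README does not state. The sample feed, the `Episode` fields, and the `episodes/` object prefix are all hypothetical and may not match the project's actual schema or bucket layout.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Hypothetical sample feed; the real source and its fields may differ.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Show</title>
  <item>
    <title>Episode 1</title>
    <guid>ep-001</guid>
    <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
    <enclosure url="https://example.com/audio/ep-001.mp3" type="audio/mpeg" length="123"/>
  </item>
</channel></rss>"""


@dataclass
class Episode:
    """One row of episode metadata (assumed columns, not the project's schema)."""
    guid: str
    title: str
    published: str
    audio_url: str


def parse_episodes(feed_xml: str) -> List[Episode]:
    """Extract one Episode per <item> that carries an audio <enclosure>."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # skip items without an audio file
        episodes.append(
            Episode(
                guid=item.findtext("guid", default=""),
                title=item.findtext("title", default=""),
                published=item.findtext("pubDate", default=""),
                audio_url=enclosure.get("url", ""),
            )
        )
    return episodes


def audio_object_name(episode: Episode) -> str:
    """Build a bucket object name for the episode audio (hypothetical layout)."""
    extension = episode.audio_url.rsplit(".", 1)[-1]
    return f"episodes/{episode.guid}.{extension}"
```

In a pipeline like this one, the parsed rows would be what gets inserted into the PostgreSQL database, and the object name would be the upload destination in the bucket. Both steps would typically run as separate Airflow tasks.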