- Get a Kaggle API key and a Google Cloud project, then download the project's credentials so the pipeline can connect to GCS and BigQuery via the API.
- Clone the repo and place the credentials file inside the project.
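Where these credentials end up is worth spelling out. The Kaggle API accepts either `KAGGLE_USERNAME`/`KAGGLE_KEY` environment variables (which the `.env` file below supplies) or a key file at `~/.kaggle/kaggle.json`, and Google's client libraries locate a service-account key through `GOOGLE_APPLICATION_CREDENTIALS`. A sketch with placeholder values — the file names here are illustrative, not the repo's actual paths:

```shell
# Kaggle's CLI/API looks for ~/.kaggle/kaggle.json when the env vars are not set.
# The JSON below is a placeholder; use the file downloaded from your Kaggle account.
mkdir -p "$HOME/.kaggle"
printf '{"username":"your-kaggle-username","key":"your-kaggle-key"}\n' \
  > "$HOME/.kaggle/kaggle.json"
chmod 600 "$HOME/.kaggle/kaggle.json"   # Kaggle warns about world-readable keys

# Google client libraries read the service-account key from this variable.
# "google_credentials.json" is an assumed name for the file you downloaded.
export GOOGLE_APPLICATION_CREDENTIALS="$PWD/google_credentials.json"
```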
- Inside the project, create a `.env` file with the following contents, filled in with your information:

```
PROJECT_NAME=your-project-name
GCS_BUCKET_NAME=your-bucket-name
GCLOUD_PROJECT_NAME=your-gcloud-project-name
KAGGLE_USERNAME=your-kaggle-username
KAGGLE_KEY=your-kaggle-key
```
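A missing variable here usually surfaces later as an opaque auth error, so a quick check before building the containers can save a rebuild. This sanity-check loop is a sketch, not part of the repo (it writes a placeholder `.env` only so the snippet is self-contained; point it at your real file):

```shell
# Hypothetical sanity check: confirm every variable the stack needs is in .env.
required="PROJECT_NAME GCS_BUCKET_NAME GCLOUD_PROJECT_NAME KAGGLE_USERNAME KAGGLE_KEY"

# Placeholder .env so this sketch runs standalone; skip if your real file exists.
[ -f .env ] || printf '%s=placeholder\n' $required > .env

missing=0
for key in $required; do
  grep -q "^${key}=" .env || { echo "missing ${key} in .env" >&2; missing=1; }
done
[ "$missing" -eq 0 ] && echo ".env looks complete"
```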
- In your terminal run `docker compose build`; let it build, then run `docker compose up`.
- In your browser, open http://localhost:6789/.
- Run the `load_all_raw_data_to_gcs` pipeline, then run the `load_all_raw_data_to_bq` pipeline.
- Open the project in dbt Cloud and run `dbt build`.
- The data should now be in BigQuery, and you can start building a report in Looker Studio.