How to replicate the project

  1. Get a Kaggle API key and create a project in Google Cloud, then download the project credentials used to connect to GCS and BigQuery through their APIs.
  2. Clone the repo and place the credentials file inside the project directory.
  3. Inside the project, create a .env file and paste the following, replacing each value with your own information:
PROJECT_NAME=your-project-name

GCS_BUCKET_NAME=your-bucket-name

GCLOUD_PROJECT_NAME=your-gcloud-project-name

KAGGLE_USERNAME=your-kaggle-username
KAGGLE_KEY=your-kaggle-key
  4. In your terminal, run docker compose build, wait for the build to finish, and then run docker compose up.
  5. In your browser, open http://localhost:6789/.
  6. Run the pipeline load_all_raw_data_to_gcs and then run the pipeline load_all_raw_data_to_bq.
  7. Open the project in dbt Cloud and run dbt build.
  8. The data should now be available in BigQuery, and you can start building a report in Looker Studio.
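
Before running docker compose build, it can help to confirm that the .env file from the earlier step is complete. The following is a minimal sketch, not part of this repository: the function names (parse_env, missing_keys, check_env_file) are hypothetical, the file is assumed to sit in the project root, and a value is treated as unfilled if it still starts with the "your-" prefix used in the template above.

```python
from pathlib import Path

# The five variable names listed in the .env template of this README.
REQUIRED_KEYS = (
    "PROJECT_NAME",
    "GCS_BUCKET_NAME",
    "GCLOUD_PROJECT_NAME",
    "KAGGLE_USERNAME",
    "KAGGLE_KEY",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still a template placeholder."""
    return [
        key
        for key in REQUIRED_KEYS
        if not values.get(key) or values[key].startswith("your-")
    ]


def check_env_file(path: str = ".env") -> list[str]:
    """Return the variables that still need attention in the given .env file."""
    env_path = Path(path)  # assumed location: the project root
    if not env_path.is_file():
        # No file at all means every required variable is missing.
        return list(REQUIRED_KEYS)
    return missing_keys(parse_env(env_path.read_text()))
```

Calling check_env_file() from the project root returns an empty list when all five variables are filled in, and otherwise the names that still need a value. The parser only handles plain KEY=value lines, which is all the template above uses.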