Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add staging CD jobs #23

Merged
merged 6 commits into from
Apr 8, 2024
Merged

Add staging CD jobs #23

merged 6 commits into from
Apr 8, 2024

Conversation

gwenwindflower
Copy link
Contributor

@gwenwindflower gwenwindflower commented Apr 5, 2024

Adds GHAs to call the Staging Deploy jobs for BQ and SF on merge to staging.

This supports the change to make Staging the default branch in Jaffle Shop (already done), and completes the process of getting us to the following state:

  • 🌉 Feature branches branch off Staging
  • 🌅 Feature branches are PR'd against Staging and run CI jobs via API call across all active adapters (currently SF and BQ) in the Staging Environment
  • 🆕 PRs are merged into Staging, kicking off a CD build job in Staging (our default build command is `dbt build --exclude "resource_type:seed" because seeds are depended on as a source, and running a regular build will create a race condition where some models beat their source seeds into the warehouse and cause failing tests)
  • 🚀 When we're ready for a Jaffle Shop Release we merge Staging into Main kicking off CD jobs via API call in the Prod Environment across all active adapters
  • 🤓 Dev Environments in dbt Cloud auto defer to Staging, as we've set that up as a Staging Environment in Cloud

gwenwindflower and others added 6 commits March 27, 2024 13:56
…or BQ (#18)

Testing CI jobs, we want both BQ and SF to run here.

---------

Co-authored-by: Winnie Winship <winnie@fishtownanalytics.com>
Co-authored-by: dave-connors-3 <73915542+dave-connors-3@users.noreply.github.com>
There is a race condition using multiple warehouses for CI jobs on the
same repo. This adds GitHub Actions to run the CI jobs so we can check
against all of them.

It also adds slim CD jobs that run on merge to main.
Closes #1 

This turned into pulling a sweater thread that unravelled some issues in
the source data, so I had to dig in to quite a bit! 

This updates some issues in the source data seed files in addition to fixing logic.
The Semantic Layer has an awesome feature to allow the construction and
materialization of queries based on your metrics. We've added those to
the Jaffle Shop and expanded the SL implementation a bit.
@joellabes
Copy link
Contributor

  • our default build command is `dbt build --exclude "resource_type:seed" because seeds are depended on as a source, and running a regular build will create a race condition where some models beat their source seeds into the warehouse and cause failing tests

Is this because there are models whose dependencies are not being forced? Wouldn't it be better to force the dependency than babysit which nodes get built outside of dbt itself?

@gwenwindflower
Copy link
Contributor Author

gwenwindflower commented Apr 7, 2024

  • our default build command is `dbt build --exclude "resource_type:seed" because seeds are depended on as a source, and running a regular build will create a race condition where some models beat their source seeds into the warehouse and cause failing tests

Is this because there are models whose dependencies are not being forced? Wouldn't it be better to force the dependency than babysit which nodes get built outside of dbt itself?

so @joellabes seeds run in parallel with everything else, if a source is pointed at a seed's output, and the seed is...seeding into the db still, bad stuff happens. there are no models to force the deps on, as they're sources rather than models. the Big Fix would be if seeds were treated as a proper part of the DAG, which there's an issue for and i've brought up with Doug and Jerco. i think that would make a lot of sense.

i have an alternative fix i'm toying with i'll send you as a Loom, would be interested in your thoughts on it, but it's more of a tradeoff than a fix.

@joellabes
Copy link
Contributor

Wouldn't this work?

#seeds/my_seed.csv
id,name
1,winnie
2,joel
#models/sources.yml
sources:
  - name: seeds
    tables: 
        - name: my_seed
          # etc
-- models/stg_my_seed.sql

-- depends_on: {{ ref('my_seed') }}
with seed as (
  select * from {{ source('seeds', 'my_seed') }}
)
-- etc

Maybe not a good idea as I write it out if this is meant to be a best practice demo

@gwenwindflower
Copy link
Contributor Author

gwenwindflower commented Apr 7, 2024

@joellabes oh yea i think you're right that should work, but yea maybe not ideal for beginners 🤔 hadn't thought of that though

@gwenwindflower gwenwindflower merged commit 9e622f7 into staging Apr 8, 2024
6 checks passed
@gwenwindflower gwenwindflower deleted the feat/add-staging-cd branch April 8, 2024 00:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants