Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAP-621] [Bug] impersonate_service_account and job_execution_timeout_seconds #769

Closed
2 tasks done
cmc333333 opened this issue Jun 13, 2023 · 7 comments · Fixed by #882
Closed
2 tasks done

[ADAP-621] [Bug] impersonate_service_account and job_execution_timeout_seconds #769

cmc333333 opened this issue Jun 13, 2023 · 7 comments · Fixed by #882
Labels
bug Something isn't working

Comments

@cmc333333
Copy link
Contributor

Is this a new bug in dbt-bigquery?

  • I believe this is a new bug in dbt-bigquery
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When setting impersonate_service_account along with a high (>1 hr) job_execution_timeout_seconds, the BQ API gives us an error:

Access tokens for service account [service account email here] can not have lifetimes of more than 1 hour. Use the Org Policy Constraint constraints/iam.allowServiceAccountCredentialLifetimeExtension to allow up to 12 hours.

The credentials shouldn't need to live the whole hour+, however, since they can be re-generated whenever making a new BQ call.

Expected Behavior

A high job_execution_timeout_seconds would be compatible with impersonate_service_account without needing to adjust org policy constraints.

Steps To Reproduce

  1. Set up service account impersonation per this doc
  2. Add a three hour timeout (using timeout_seconds rather than job_execution_timeout_seconds to match our actual config)
my-profile:
  target: airflow
  outputs:
    airflow:
      type: bigquery
      method: oauth
      impersonate_service_account: "{{ env_var('DBT_SA_EMAIL') }}"
      project: "{{ env_var('DBT_PROJECT') }}"
      dataset: "dbt"
      retries: 3
      threads: 10
      timeout_seconds: 10800 # 3 hours
      location: US
      priority: interactive
  1. Try to run that target. See an error

Relevant log output

No response

Environment

- OS: Debian GNU/Linux 11 (bullseye)
- Python: Python 3.9.16
- dbt-core: 1.5.1
- dbt-bigquery: 1.5.1

Additional Context

We're trying to switch away from keyfile_json, where we first specified the three hour limit.

@cmc333333 cmc333333 added bug Something isn't working triage labels Jun 13, 2023
@github-actions github-actions bot changed the title [Bug] impersonate_service_account and job_execution_timeout_seconds [ADAP-621] [Bug] impersonate_service_account and job_execution_timeout_seconds Jun 13, 2023
@Fleid
Copy link
Contributor

Fleid commented Jun 22, 2023

Hi @cmc333333,

I've just given a quick read to what we're doing with that timeout settings.
We're using it when fetching results from the query, as expected:

iterator = query_job.result(timeout=job_execution_timeout)

But we're also injecting it as lifetime of the impersonated_credentials:

Number of seconds the delegated credential should be valid for

lifetime=(profile_credentials.job_execution_timeout_seconds or 300),

I don't know why we are doing that to be honest. The parameter is not mandatory, and the default value is its max anyway (3600 sec = 1h).
I did some digging and the initial commit just wires it that way with no explanation that I can find.

Does anything come to mind on your side?

So we could just remove the wiring, or replace it with a new parameter - but I'm wondering if that's really necessary since it's already defaulting to its max.

@cmc333333
Copy link
Contributor Author

Hiya @Fleid ! I'll try to give you my understanding of the world and we can see how that matches up.

  • Impersonation is also known as "short lived access tokens" -- more or less allowing one Google account to do everything a particular service account can do for a limited period of time.
  • If the necessary permission is revoked the token is still valid, so that "short lived" part is pretty important. In fact, Google offers a mechanism for you to reduce the token lifetime when requesting it (from the default 1 hour).
  • Google reused this mechanism to allow the tokens to expire after up to twelve hours, if requested. The catch is that you need to set an "org policy" at the organization, folder, or project level. See "constraints/iam.allowServiceAccountCredentialLifetimeExtension" here for more details. My understanding is that this is for "weird" cases where you can't easily regenerate the auth token.
  • For most cases, we can instead rely on the google-auth Python library (which is used by all of the Google-provided Python SDKs). It knows how to fetch a new token when the earlier one expires. Notably: if the "hey is that BQ job done?" request uses an expired token, the google libraries should retry it for us on their own.

I think that all points to dbt-bigquery's use of "lifetime" as not quite right. I think the simplest fix would be to stop setting the value. If we really wanted, we could have a separate config, but I don't see a use case -- we're relying on Google's SDK throughout, yeah?

@Fleid
Copy link
Contributor

Fleid commented Jun 28, 2023

I'm aligned @cmc333333
Are you interested in opening a PR for that?

@cmc333333
Copy link
Contributor Author

Will give it a go soonish, @Fleid , thanks!

@dbeatty10
Copy link
Contributor

Thanks again for raising this and providing all your keen insights @cmc333333 🧠

Are you still interested in putting up a PR for this, by any chance?

cmc333333 added a commit to cmc333333/dbt-bigquery that referenced this issue Aug 14, 2023
Per dbt-labs#769 , we shouldn't
need to set the lifetime of an impersonated token. In some edge cases,
attempting to set it to the job execution timeout (> 300 seconds) can
result in authentication failures, even though the job will complete
successfully and authentication creds will be correctly refreshed.
@cmc333333
Copy link
Contributor Author

Thanks for the nudge @dbeatty10 ; I need to get a local setup going, but I created a PR and signed the CLA.

@dbeatty10
Copy link
Contributor

Awesome @cmc333333 !

@Fleid I marked the PR (#882) as ready_for_review

@dbeatty10 dbeatty10 removed the triage label Aug 17, 2023
colin-rogers-dbt pushed a commit that referenced this issue Oct 11, 2023
* Remove impersonation lifetime.

Per #769 , we shouldn't
need to set the lifetime of an impersonated token. In some edge cases,
attempting to set it to the job execution timeout (> 300 seconds) can
result in authentication failures, even though the job will complete
successfully and authentication creds will be correctly refreshed.

* Changie data.

---------

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Mike Alfare <13974384+mikealfare@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants