Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAP-882] [Feature] Allow base64-service-account-json key auth #923

Closed
3 tasks done
ismailsimsek opened this issue Sep 12, 2023 · 6 comments · Fixed by #1245
Closed
3 tasks done

[ADAP-882] [Feature] Allow base64-service-account-json key auth #923

ismailsimsek opened this issue Sep 12, 2023 · 6 comments · Fixed by #1245
Assignees
Labels
enhancement New feature or request good_first_issue Good for newcomers help_wanted Extra attention is needed

Comments

@ismailsimsek
Copy link

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt-bigquery functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Allow setting gcp service account as base64 value. currently it only supports setting file path. it should be possible to set base64 SA value using environment variable.

similar to dbt-labs/dbt-snowflake#436
implementation dbt-labs/dbt-snowflake#292
dbt-labs/dbt-core#6018

Describe alternatives you've considered

No response

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

@ismailsimsek ismailsimsek added enhancement New feature or request triage labels Sep 12, 2023
@github-actions github-actions bot changed the title [Feature] Allow base64-service-account-json key auth [ADAP-882] [Feature] Allow base64-service-account-json key auth Sep 12, 2023
@dbeatty10
Copy link
Contributor

Thanks for this suggestion @ismailsimsek !

Currently, there are two service account methods available in dbt-bigquery:

  1. Service Account File (via keyfile)
  2. Service Account JSON (via keyfile_json)

Did you already see the second method?

Assuming so, how would you imagine your suggestion being implemented?

Here's the relevant code within dbt-bigquery that interfaces with BigQuery:

elif method == BigQueryConnectionMethod.SERVICE_ACCOUNT:
keyfile = profile_credentials.keyfile
return creds.from_service_account_file(keyfile, scopes=profile_credentials.scopes)
elif method == BigQueryConnectionMethod.SERVICE_ACCOUNT_JSON:
details = profile_credentials.keyfile_json
return creds.from_service_account_info(details, scopes=profile_credentials.scopes)

Here are the BigQuery docs for the two different methods:

  • from_service_account_file
  • from_service_account_info

@ismailsimsek
Copy link
Author

@dbeatty10 from dbt side, this should work:

# new method 
 elif method == BigQueryConnectionMethod.SERVICE_ACCOUNT_BASE64_JSON: 
     sa_json_base64 = profile_credentials.keyfile_base64_json 
     sa_json_content = base64.b64decode(details_base64.encode("utf-8")).decode("utf-8")
     return creds.from_service_account_info(info=sa_json_content) 

from bigquery lib this method could be used

    def from_service_account_info(cls, info, **kwargs):

@dbeatty10
Copy link
Contributor

What do you think of this @dataders ?

The proposal is to add a third method named keyfile_base64_json that is related to BigQuery service account credentials stored in JSON:

Two alternatives to consider:

  1. Allow Base64-formatted JSON data within keyfile_json and/or keyfile (and then do fancy footwork within the implementation to recognize this case, decode it, and call the applicable method)
  2. Choose not to support Base64-formatted JSON data and direct folks to use pre-existing keyfile_json or keyfile methods instead

@dataders
Copy link
Contributor

@dbeatty10 I support this. In fact, I'd love to mainstream key handling into the default implementation. imo, a single implementation reduces risk. Happy to accept a PR adding support for keyfile_base64_json for the short- and medium-term, but long-term is that this lives in a common implementation

In a larger theme of authentication, I'd love to handle token handling as it related to cloud providers such as AWS and Azure. dbt-labs/dbt-postgres#14 is one example and MSFT adapter family token-handling (source) is another.

@dataders dataders added good_first_issue Good for newcomers help_wanted Extra attention is needed and removed triage labels Sep 25, 2023
@rbaker1
Copy link
Contributor

rbaker1 commented May 16, 2024

I saw this issue and was wondering if you would accept a reimplementation of BigQueryConnectionMethod.SERVICE_ACCOUNT_JSON?

I can implement an is_base64_encoded method that will check if the string is base64 encoded. If it returns True, then the input will be decoded.

    import base64
    import binascii
    
    def is_base64(s: str | bytes) -> bool:
        """
        Checks if the given string or bytes object is valid Base64 encoded.

        Args:
            s: The string or bytes object to check.

        Returns:
            True if the input is valid Base64, False otherwise.
        """

        if isinstance(s, str):
            # For strings, ensure they consist only of valid Base64 characters
            if not s.isascii():
                return False
            # Convert to bytes for decoding
            s = s.encode('ascii') 

        try:
            # Use the 'validate' parameter to enforce strict Base64 decoding rules
            base64.b64decode(s, validate=True)
            return True
        except binascii.Error:  # Catch specific errors from the base64 module
            return False

Then a simple if-else will decode the base64 or leave the string as is.

        elif method == BigQueryConnectionMethod.SERVICE_ACCOUNT_JSON:
            details = profile_credentials.keyfile_json
            if is_base64(profile_credentials.keyfile_json):
                details = base64ToString(details)
            return creds.from_service_account_info(details, scopes=profile_credentials.scopes)

@rbaker1
Copy link
Contributor

rbaker1 commented May 16, 2024

PR has been submitted for review. @dataders

colin-rogers-dbt pushed a commit that referenced this issue Jun 10, 2024
* added base64 functionality and basic testing

* Change log

* fix conftest to allow json

* Change method name from camel to snake case

* change type hinting to be py3.9 compatible

---------

Co-authored-by: Robele Baker <>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good_first_issue Good for newcomers help_wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants