Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passing a list variable to dbt_utils.surrogate_key in a generic test results in a compilation error. #847

Closed
1 task done
philippefutureboy opened this issue Oct 13, 2023 · 4 comments
Labels
bug Something isn't working Stale triage

Comments

@philippefutureboy
Copy link

philippefutureboy commented Oct 13, 2023

Describe the bug

Passing a list variable to dbt_utils.surrogate_key (currently running 0.9.6) in a generic test results in a compilation error.

Steps to reproduce

You'll find the whole code necessary to reproduce the error below:

test/generic/test_sk_is_client_scoped.sql

{% test sk_is_client_scoped(model, column_name, scope, members) %}
/*
Tests that the given sk column includes the client key
*/

{% set scope_field = {'client': 'client_key', 'franchise': 'franchise_key'}[scope] %}
{% set sk_members = [client_key_field] + members %}
{{ print(sk_members) }}

WITH
model_data AS (
    SELECT
        {{ column_name }} AS sk_field,
        {{ dbt_utils.surrogate_key(sk_members) }} AS sk_test,
        {%- for m in members %}
        {{m}} AS member_{{dbt_utils.slugify(m)}},
        {%- endfor %}
    FROM {{ model }}
),

validation_errors AS (
    SELECT *
    FROM model_data
    WHERE sk_field <> sk_test
)

SELECT *
FROM validation_errors

{% endtest %}

models/orders.yml

version: 2

models:
  - name: orders
    columns:
      # CLIENT ID
      - name: client_key
        data_type: STRING
        description: Client's unique identifier.
        tests:
          - not_null

      # PRIMARY KEY
      - name: sk_orders
        data_type: STRING
        description: 
          Surrogate key, composed from client_key, order_key
        tests:
          - not_null
          - unique
          - sk_is_client_scoped:
              scope: client
              members:
                - order_key

      # DIMENSIONS (FK)
      - name: order_key
        data_type: STRING
        description: >
          Identifier of the order of the row

models/orders.sql

WITH
stg_orders AS (
    SELECT 'cli111111111111111111111111111' AS client_key, 1 AS order_key
    UNION ALL
    SELECT 'cli111111111111111111111111111' AS client_key, 2 AS order_key
    UNION ALL
    SELECT 'cli222222222222222222222222222' AS client_key, 1 AS order_key
    UNION ALL
    SELECT 'cli222222222222222222222222222' AS client_key, 2 AS order_key
),

final AS (
    SELECT
        -- CLIENT ID
        client_key,

        -- PRIMARY KEY
        {{
            dbt_utils.surrogate_key([
                'client_key',
                'order_key'
            ])
        }}
        AS sk_orders,

        -- DIMENSIONS (FK)
        order_key
    FROM data
),

SELECT * FROM final

Expected results

{{ dbt_utils.surrogate_key(sk_members) }} AS sk_test,

should effectively be equal to

{{ dbt_utils.surrogate_key(['client_key', 'order_key']) }} AS sk_test,

which in principle should not raise this dict error.

Actual results

The failing call is the following (see in Steps to reproduce for full generic test):

{{ dbt_utils.surrogate_key(sk_members) }} AS sk_test,

Where sk_members is defined as

{% set scope_field = {'client': 'client_key', 'franchise': 'franchise_key'}[scope] %}
{% set sk_members = [client_key_field] + members %}

and scope, members are custom parameters in the macro signature:

{% test sk_is_client_scoped(model, column_name, scope, members) %}

which are respectively defined as a string, array in the YAML file:

       tests:
          - sk_is_tenant_scoped:
              scope: client
              members:
                - order_key

With this being said, the following statement:

{{ dbt_utils.surrogate_key(sk_members) }} AS sk_test,

raises

INFO - dbt.exceptions.CompilationException: Compilation Error in test sk_is_client_scoped (models/path/to/model.yml)
INFO -   'dict object' has no attribute 'default__hash'
INFO - 
INFO -   > in macro default__hash (macros/cross_db_utils/deprecated/hash.sql)
INFO -   > called by macro hash (macros/cross_db_utils/deprecated/hash.sql)
INFO -   > called by macro default__surrogate_key (macros/sql/surrogate_key.sql)
INFO -   > called by macro surrogate_key (macros/sql/surrogate_key.sql)
INFO -   > called by macro test_sk_is_tenant_scoped (tests/generic/sk_is_client_scoped.sql)
INFO -   > called by macro bigquery__hash (macros/utils/hash.sql)
INFO -   > called by test sk_is_client_scoped (models/path/to/model.yml)

Which is unexpected behaviour.

Furthermore, when printing at line 8 of the generic test:

{{ print(sk_members) }}

the printed result is the following:

INFO - ['client_key', 'order_key']

Which looks very array like to me :0

I've also attempted to change the failing call:

{{ dbt_utils.surrogate_key(sk_members) }} AS sk_test,

to the following variants:

{{ dbt_utils.surrogate_key([*sk_members]) }} AS sk_test, # compilation error, the * operator is not supported
{{ dbt_utils.surrogate_key([m for m in sk_members]) }} AS sk_test, # compilation error, list comprehension is not supported
{{ dbt_utils.surrogate_key(list(sk_members)) }} AS sk_test, # compilation error, list is not defined
{{ dbt_utils.surrogate_key([] + sk_members) }} AS sk_test, # same compilation error as the initial version

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 0.9.6

Which database are you using dbt with?

  • bigquery

The output of dbt --version:

Core:
  - installed: 1.2.6
  - latest:    1.6.6 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - bigquery: 1.2.1 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

I'd upgrade, but it requires that I upgrade a large swath of our dependencies OR that I migrate dbt as a Kubernetes pod outside of our current Airflow setup - either solutions waaay too involved for our team's capacity atm.

Additional context

N/A

Are you interested in contributing the fix?

No, but if you tell me that's a very easy fix I can consider!

@philippefutureboy philippefutureboy added bug Something isn't working triage labels Oct 13, 2023
@asarraf
Copy link
Contributor

asarraf commented Oct 18, 2023

Nit: It should be test/generic/test_sk_is_client_scoped.sql
rather than a .py file test/generic/test_sk_is_client_scoped.py
Right?

@philippefutureboy
Copy link
Author

philippefutureboy commented Nov 6, 2023

@anksu2024 Yes, nice catch! :)
Correction done.

Copy link

github-actions bot commented May 4, 2024

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 4, 2024
Copy link

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Stale triage
Projects
None yet
Development

No branches or pull requests

2 participants