utils 0.8.1 (#502)
* Fix/timestamp without timezone (#458)

* timestamp and changelog updates

* changelog fix

* Add context for why change to no timezone

Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>

* also ignore dbt_packages (#463)

* also ignore dbt_packages

* Update CHANGELOG.md

* Update CHANGELOG.md

Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>

* date_spine: transform comment to jinja (#462)

* Have union_relations raise exception when include parameter results in no columns (#473)

* Raise exception if no columns in column_superset

* Add relation names to compiler error message

* Add `union_relations` fix to changelog

* Added case for handling postgres foreign tables... (#476)

* Add link for fewer_rows_than schema test in docs (#465)

* Added case for handling postgres foreign tables (tables which are external to the current database and are imported into it from remote data stores using Foreign Data Wrapper functionality).

* Reworked retrieval of postgres table_type.

* Added needed changes to CHANGELOG.

Co-authored-by: José Coto <jlcoto@users.noreply.github.com>
Co-authored-by: Taras Stetsiak <tstetsiak@health-union.com>

* Enhance usability of star macro by only generating column aliases when prefix and/or suffix is specified (#468)

* The star macro should only produce column aliases when there is either a prefix or suffix specified.

* Enhanced the readme for the star macro.

* Add new integration test

Co-authored-by: Nick Perrott <nperrott@roiti.com>
Co-authored-by: Josh Elston-Green
Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>

* fix: extra brace typo in insert_by_period_materialization (#480)

* Support quoted column names in sequential_values test (#479)

* Add any value (#501)

* Add link for fewer_rows_than schema test in docs (#465)

* Update get_query_results_as_dict example to demonstrate accessing columnar results as dictionary values (#474)

* Update get_query_results_as_dict example to demonstrate accessing columnar results as dictionary values

* Use slugify in example

* Fix slugify example with dbt_utils. package prefix

Co-authored-by: Elize Papineau <elize.papineau@dbtlabs.com>

* Add note about not_null_where deprecation to Readme (#477)

* Add note about not_null_where deprecation to Readme

* Add docs to unique_where test

* Update pull_request_template.md to reference `main` vs `master` (#496)

* Correct coalesce -> concatenation typo (#495)

* add any_value cross-db macro

* Missing colon in test

* Update CHANGELOG.md

Co-authored-by: José Coto <jlcoto@users.noreply.github.com>
Co-authored-by: Elize Papineau <elizepapineau@gmail.com>
Co-authored-by: Elize Papineau <elize.papineau@dbtlabs.com>
Co-authored-by: Joe Ste.Marie <stemarie.joe@gmail.com>
Co-authored-by: Niall Woodward <niall@niallrees.com>

* Fix changelog

* Second take at fixing pivot to allow single quotes (#503)

* fix pivot: in pivoted column values, single quotes must be escaped (on postgresql),
else e.g. syntax error near: when color = 'blue's'

* patched expected

* single quote escape: added dispatched version of the macro to support bigquery & snowflake

* second backslash to escape in Jinja, change case of test file columns

Let's see if other databases allow this

* explicitly list columns to compare

* different tests for snowflake and others

* specific comparison seed

* Don't quote identifiers for apostrophe, to avoid BQ and SF problems

* Whitespace management for macros

* Update CHANGELOG.md

Co-authored-by: Marc Dutoo <marc.dutoo@gmail.com>

* Add bool or cross db (#504)

* Create bool_or cross-db func

* Forgot a comma

* Update CHANGELOG.md

* Code review tweaks

Co-authored-by: Joe Markiewicz <74217849+fivetran-joemarkiewicz@users.noreply.github.com>
Co-authored-by: Anders <swanson.anders@gmail.com>
Co-authored-by: Mikaël Simarik <mikael.simarik@gmail.com>
Co-authored-by: Graham Wetzler <graham@wetzler.dev>
Co-authored-by: Taras <32882370+Aesthet@users.noreply.github.com>
Co-authored-by: José Coto <jlcoto@users.noreply.github.com>
Co-authored-by: Taras Stetsiak <tstetsiak@health-union.com>
Co-authored-by: nickperrott <46330920+nickperrott@users.noreply.github.com>
Co-authored-by: Nick Perrott <nperrott@roiti.com>
Co-authored-by: Ted Conbeer <tconbeer@users.noreply.github.com>
Co-authored-by: Armand Duijn <armandduijn@users.noreply.github.com>
Co-authored-by: Elize Papineau <elizepapineau@gmail.com>
Co-authored-by: Elize Papineau <elize.papineau@dbtlabs.com>
Co-authored-by: Joe Ste.Marie <stemarie.joe@gmail.com>
Co-authored-by: Niall Woodward <niall@niallrees.com>
Co-authored-by: Marc Dutoo <marc.dutoo@gmail.com>
17 people authored Feb 22, 2022
1 parent 4ef456e commit 45bd4ad
Showing 30 changed files with 257 additions and 24 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,5 +1,6 @@

target/
dbt_modules/
dbt_packages/
logs/
venv/
venv/
28 changes: 28 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,31 @@
# dbt-utils v0.8.1

## New features
- A cross-database implementation of `any_value()` ([#497](https://github.com/dbt-labs/dbt-utils/issues/497), [#501](https://github.com/dbt-labs/dbt-utils/pull/501))
- A cross-database implementation of `bool_or()` ([#504](https://github.com/dbt-labs/dbt-utils/pull/504))

## Under the hood
- Also ignore the `dbt_packages/` directory [#463](https://github.com/dbt-labs/dbt-utils/pull/463)
- Remove block comments to make date_spine macro compatible with the Athena connector ([#462](https://github.com/dbt-labs/dbt-utils/pull/462))

## Fixes
- `type_timestamp` macro now explicitly casts postgres and redshift warehouse timestamp data types as `timestamp without time zone`, to be consistent with Snowflake behaviour (`timestamp_ntz`).
- `union_relations` macro will now raise an exception if the use of `include` or `exclude` results in no columns ([#473](https://github.com/dbt-labs/dbt-utils/pull/473), [#266](https://github.com/dbt-labs/dbt-utils/issues/266)).
- `get_relations_by_pattern()` works with foreign data wrappers on Postgres again. ([#357](https://github.com/dbt-labs/dbt-utils/issues/357), [#476](https://github.com/dbt-labs/dbt-utils/pull/476))
- `star()` will only alias columns if a prefix/suffix is provided, to allow the unmodified output to still be used in `group by` clauses etc. [#468](https://github.com/dbt-labs/dbt-utils/pull/468)
- The `sequential_values` test is now compatible with quoted columns [#479](https://github.com/dbt-labs/dbt-utils/pull/479)
- `pivot()` escapes values containing apostrophes [#503](https://github.com/dbt-labs/dbt-utils/pull/503)

## Contributors:
- [grahamwetzler](https://github.com/grahamwetzler) (#473)
- [Aesthet](https://github.com/Aesthet) (#476)
- [Kamitenshi](https://github.com/Kamitenshi) (#462)
- [nickperrott](https://github.com/nickperrott) (#468)
- [jelstongreen](https://github.com/jelstongreen) (#468)
- [armandduijn](https://github.com/armandduijn) (#479)
- [mdutoo](https://github.com/mdutoo) (#503)


# dbt-utils v0.8.0
## 🚨 Breaking changes
- dbt ONE POINT OH is here! This version of dbt-utils requires _any_ version (minor and patch) of v1, which means far less need for compatibility releases in the future.
4 changes: 3 additions & 1 deletion README.md
@@ -742,7 +742,9 @@ group by 1,2,3
```

#### star ([source](macros/sql/star.sql))
This macro generates a list of all fields that exist in the `from` relation, excluding any fields listed in the `except` argument. The construction is identical to `select * from {{ref('my_model')}}`, replacing star (`*`) with the star macro. This macro also has an optional `relation_alias` argument that will prefix all generated fields with an alias (`relation_alias`.`field_name`). The macro also has optional `prefix` and `suffix` arguments, which will be appropriately concatenated to each field name in the output (`prefix` ~ `field_name` ~ `suffix`).
This macro generates a comma-separated list of all fields that exist in the `from` relation, excluding any fields listed in the `except` argument. The construction is identical to `select * from {{ref('my_model')}}`, replacing star (`*`) with the star macro. This macro also has an optional `relation_alias` argument that will prefix all generated fields with an alias (`relation_alias`.`field_name`).

The macro also has optional `prefix` and `suffix` arguments. When one or both are provided, they will be concatenated onto each field's alias in the output (`prefix` ~ `field_name` ~ `suffix`). NB: This prevents the output from being used in any context other than a select statement.

**Usage:**
```sql
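The aggregation-safe behaviour described above can be sketched as follows (model and column names are illustrative, mirroring the new integration test):

```sql
-- Illustrative only: with no prefix/suffix, star() emits unaliased column names,
-- so the same expression can be reused verbatim in a group by clause
select
    {{ dbt_utils.star(from=ref('my_model'), except=['value_field']) }},
    sum(value_field) as value_field_sum
from {{ ref('my_model') }}
group by {{ dbt_utils.star(from=ref('my_model'), except=['value_field']) }}
```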
4 changes: 4 additions & 0 deletions integration_tests/data/cross_db/data_any_value_expected.csv
@@ -0,0 +1,4 @@
key_name,static_col,num_rows
abc,dbt,2
jkl,dbt,3
xyz,test,1
8 changes: 8 additions & 0 deletions integration_tests/data/cross_db/data_bool_or.csv
@@ -0,0 +1,8 @@
key,val1,val2
abc,1,1
abc,1,0
def,1,0
hij,1,1
hij,1,
klm,1,0
klm,1,
5 changes: 5 additions & 0 deletions integration_tests/data/cross_db/data_bool_or_expected.csv
@@ -0,0 +1,5 @@
key,value
abc,true
def,false
hij,true
klm,false
3 changes: 2 additions & 1 deletion integration_tests/data/sql/data_pivot.csv
@@ -1,4 +1,5 @@
size,color
S,red
S,blue
M,red
S,blue's
M,red
2 changes: 1 addition & 1 deletion integration_tests/data/sql/data_pivot_expected.csv
@@ -1,3 +1,3 @@
size,red,blue
S,1,1
M,1,0
M,1,0
3 changes: 3 additions & 0 deletions integration_tests/data/sql/data_pivot_expected_apostrophe.csv
@@ -0,0 +1,3 @@
size,red,blue,blues
S,1,1,1
M,1,0,0
5 changes: 5 additions & 0 deletions integration_tests/data/sql/data_star_aggregate.csv
@@ -0,0 +1,5 @@
group_field_1,group_field_2,value_field
a,b,1
a,b,2
c,d,3
c,e,4
4 changes: 4 additions & 0 deletions integration_tests/data/sql/data_star_aggregate_expected.csv
@@ -0,0 +1,4 @@
group_field_1,group_field_2,value_field_sum
a,b,3
c,d,3
c,e,4
1 change: 0 additions & 1 deletion integration_tests/dbt_project.yml
@@ -60,7 +60,6 @@ seeds:
# this.incorporate() to hardcode the node's type as otherwise dbt doesn't know it yet
+post-hook: "{% do adapter.drop_relation(this.incorporate(type='table')) %}"


schema_tests:
data_test_sequential_timestamps:
+column_types:
10 changes: 10 additions & 0 deletions integration_tests/models/cross_db_utils/schema.yml
@@ -1,6 +1,16 @@
version: 2

models:
- name: test_any_value
tests:
- dbt_utils.equality:
compare_model: ref('data_any_value_expected')

- name: test_bool_or
tests:
- dbt_utils.equality:
compare_model: ref('data_bool_or_expected')

- name: test_concat
tests:
- assert_equal:
19 changes: 19 additions & 0 deletions integration_tests/models/cross_db_utils/test_any_value.sql
@@ -0,0 +1,19 @@
with some_model as (
select 1 as id, 'abc' as key_name, 'dbt' as static_col union all
select 2 as id, 'abc' as key_name, 'dbt' as static_col union all
select 3 as id, 'jkl' as key_name, 'dbt' as static_col union all
select 4 as id, 'jkl' as key_name, 'dbt' as static_col union all
select 5 as id, 'jkl' as key_name, 'dbt' as static_col union all
select 6 as id, 'xyz' as key_name, 'test' as static_col
),

final as (
select
key_name,
{{ dbt_utils.any_value('static_col') }} as static_col,
count(id) as num_rows
from some_model
group by key_name
)

select * from final
5 changes: 5 additions & 0 deletions integration_tests/models/cross_db_utils/test_bool_or.sql
@@ -0,0 +1,5 @@
select
key,
{{ dbt_utils.bool_or('val1 = val2') }} as value
from {{ ref('data_bool_or' )}}
group by key
10 changes: 10 additions & 0 deletions integration_tests/models/sql/schema.yml
@@ -85,6 +85,11 @@ models:
tests:
- dbt_utils.equality:
compare_model: ref('data_pivot_expected')

- name: test_pivot_apostrophe
tests:
- dbt_utils.equality:
compare_model: ref('data_pivot_expected_apostrophe')

- name: test_unpivot_original_api
tests:
@@ -111,6 +116,11 @@
- dbt_utils.equality:
compare_model: ref('data_star_prefix_suffix_expected')

- name: test_star_aggregate
tests:
- dbt_utils.equality:
compare_model: ref('data_star_aggregate_expected')

- name: test_surrogate_key
tests:
- assert_equal:
17 changes: 17 additions & 0 deletions integration_tests/models/sql/test_pivot_apostrophe.sql
@@ -0,0 +1,17 @@

-- TODO: How do we make this work nicely on Snowflake too?

{% if target.type == 'snowflake' %}
{% set column_values = ['RED', 'BLUE', "BLUE'S"] %}
{% set cmp = 'ilike' %}
{% else %}
{% set column_values = ['red', 'blue', "blue's"] %}
{% set cmp = '=' %}
{% endif %}

select
size,
{{ dbt_utils.pivot('color', column_values, cmp=cmp, quote_identifiers=False) }}

from {{ ref('data_pivot') }}
group by size
16 changes: 16 additions & 0 deletions integration_tests/models/sql/test_star_aggregate.sql
@@ -0,0 +1,16 @@
/*This test checks that column aliases aren't applied unless a prefix/suffix is necessary, to ensure that GROUP BYs keep working*/

{% set selected_columns = dbt_utils.star(from=ref('data_star_aggregate'), except=['value_field']) %}

with data as (

select
{{ selected_columns }},
sum(value_field) as value_field_sum

from {{ ref('data_star_aggregate') }}
group by {{ selected_columns }}

)

select * from data
17 changes: 17 additions & 0 deletions macros/cross_db_utils/any_value.sql
@@ -0,0 +1,17 @@
{% macro any_value(expression) -%}
{{ return(adapter.dispatch('any_value', 'dbt_utils') (expression)) }}
{% endmacro %}


{% macro default__any_value(expression) -%}

any_value({{ expression }})

{%- endmacro %}


{% macro postgres__any_value(expression) -%}
{#- /*Postgres doesn't support any_value, so we're using min() to get the same result*/ -#}
min({{ expression }})

{%- endmacro %}
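A minimal usage sketch of the new macro (table and column names are illustrative, following the integration test); on Postgres the dispatched version compiles to `min()` instead:

```sql
select
    key_name,
    -- renders as any_value(static_col), or min(static_col) on Postgres
    {{ dbt_utils.any_value('static_col') }} as static_col
from {{ ref('some_model') }}
group by key_name
```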
24 changes: 24 additions & 0 deletions macros/cross_db_utils/bool_or.sql
@@ -0,0 +1,24 @@
{% macro bool_or(expression) -%}
{{ return(adapter.dispatch('bool_or', 'dbt_utils') (expression)) }}
{% endmacro %}


{% macro default__bool_or(expression) -%}

bool_or({{ expression }})

{%- endmacro %}


{% macro snowflake__bool_or(expression) -%}

boolor_agg({{ expression }})

{%- endmacro %}


{% macro bigquery__bool_or(expression) -%}

logical_or({{ expression }})

{%- endmacro %}
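An illustrative usage sketch (mirroring the integration test added in this commit); the argument can be any boolean SQL expression:

```sql
select
    key,
    -- renders as bool_or(val1 = val2) by default,
    -- boolor_agg(...) on Snowflake, logical_or(...) on BigQuery
    {{ dbt_utils.bool_or('val1 = val2') }} as value
from {{ ref('data_bool_or') }}
group by key
```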
4 changes: 4 additions & 0 deletions macros/cross_db_utils/datatypes.sql
@@ -32,6 +32,10 @@
timestamp
{% endmacro %}

{% macro postgres__type_timestamp() %}
timestamp without time zone
{% endmacro %}

{% macro snowflake__type_timestamp() %}
timestamp_ntz
{% endmacro %}
18 changes: 18 additions & 0 deletions macros/cross_db_utils/escape_single_quotes.sql
@@ -0,0 +1,18 @@
{% macro escape_single_quotes(expression) %}
{{ return(adapter.dispatch('escape_single_quotes', 'dbt_utils') (expression)) }}
{% endmacro %}

{# /*Default to replacing a single apostrophe with two apostrophes: they're -> they''re*/ #}
{% macro default__escape_single_quotes(expression) -%}
{{ expression | replace("'","''") }}
{%- endmacro %}

{# /*Snowflake uses a single backslash: they're -> they\'re. The second backslash is to escape it from Jinja */ #}
{% macro snowflake__escape_single_quotes(expression) -%}
{{ expression | replace("'", "\\'") }}
{%- endmacro %}

{# /*BigQuery uses a single backslash: they're -> they\'re. The second backslash is to escape it from Jinja */ #}
{% macro bigquery__escape_single_quotes(expression) -%}
{{ expression | replace("'", "\\'") }}
{%- endmacro %}
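As a sketch of how `pivot()` can use this macro (the surrounding `when` clause is illustrative, not the exact pivot internals):

```sql
-- Each pivoted column value is escaped before being compared:
when color = '{{ dbt_utils.escape_single_quotes("blue's") }}' then 1
-- renders as: when color = 'blue''s' then 1   (default, e.g. Postgres/Redshift)
--             when color = 'blue\'s' then 1   (Snowflake and BigQuery)
```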
@@ -53,7 +53,7 @@
{% materialization insert_by_period, default -%}
{%- set timestamp_field = config.require('timestamp_field') -%}
{%- set start_date = config.require('start_date') -%}
{%- set stop_date = config.get('stop_date') or '' -%}}
{%- set stop_date = config.get('stop_date') or '' -%}
{%- set period = config.get('period') or 'week' -%}

{%- if sql.find('__PERIOD_FILTER__') == -1 -%}
8 changes: 5 additions & 3 deletions macros/schema_tests/sequential_values.sql
@@ -6,13 +6,15 @@

{% macro default__test_sequential_values(model, column_name, interval=1, datepart=None) %}

{% set previous_column_name = "previous_" ~ dbt_utils.slugify(column_name) %}

with windowed as (

select
{{ column_name }},
lag({{ column_name }}) over (
order by {{ column_name }}
) as previous_{{ column_name }}
) as {{ previous_column_name }}
from {{ model }}
),

@@ -21,9 +23,9 @@ validation_errors as (
*
from windowed
{% if datepart %}
where not(cast({{ column_name }} as {{ dbt_utils.type_timestamp() }})= cast({{ dbt_utils.dateadd(datepart, interval, 'previous_' + column_name) }} as {{ dbt_utils.type_timestamp() }}))
where not(cast({{ column_name }} as {{ dbt_utils.type_timestamp() }})= cast({{ dbt_utils.dateadd(datepart, interval, previous_column_name) }} as {{ dbt_utils.type_timestamp() }}))
{% else %}
where not({{ column_name }} = previous_{{ column_name }} + {{ interval }})
where not({{ column_name }} = {{ previous_column_name }} + {{ interval }})
{% endif %}
)

7 changes: 3 additions & 4 deletions macros/sql/date_spine.sql
@@ -29,16 +29,15 @@

{% macro default__date_spine(datepart, start_date, end_date) %}

/*
call as follows:

{# call as follows:

date_spine(
"day",
"to_date('01/01/2016', 'mm/dd/yyyy')",
"dateadd(week, 1, current_date)"
)
) #}

*/

with rawdata as (

Expand Down
22 changes: 22 additions & 0 deletions macros/sql/get_table_types_sql.sql
@@ -0,0 +1,22 @@
{%- macro get_table_types_sql() -%}
{{ return(adapter.dispatch('get_table_types_sql', 'dbt_utils')()) }}
{%- endmacro -%}

{% macro default__get_table_types_sql() %}
case table_type
when 'BASE TABLE' then 'table'
when 'EXTERNAL TABLE' then 'external'
when 'MATERIALIZED VIEW' then 'materializedview'
else lower(table_type)
end as "table_type"
{% endmacro %}


{% macro postgres__get_table_types_sql() %}
case table_type
when 'BASE TABLE' then 'table'
when 'FOREIGN' then 'external'
when 'MATERIALIZED VIEW' then 'materializedview'
else lower(table_type)
end as "table_type"
{% endmacro %}
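For context (an illustrative sketch of why the Postgres override matters): `information_schema.tables` on Postgres reports tables imported through a foreign data wrapper with `table_type = 'FOREIGN'`, so the dispatched expression maps them to `external` rather than letting them fall through:

```sql
select
    table_name,
    case table_type
        when 'BASE TABLE' then 'table'
        when 'FOREIGN' then 'external'               -- FDW tables surface here on Postgres
        when 'MATERIALIZED VIEW' then 'materializedview'
        else lower(table_type)
    end as "table_type"
from information_schema.tables
```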
7 changes: 1 addition & 6 deletions macros/sql/get_tables_by_pattern_sql.sql
@@ -8,12 +8,7 @@
select distinct
table_schema as "table_schema",
table_name as "table_name",
case table_type
when 'BASE TABLE' then 'table'
when 'EXTERNAL TABLE' then 'external'
when 'MATERIALIZED VIEW' then 'materializedview'
else lower(table_type)
end as "table_type"
{{ dbt_utils.get_table_types_sql() }}
from {{ database }}.information_schema.tables
where table_schema ilike '{{ schema_pattern }}'
and table_name ilike '{{ table_pattern }}'