
[ADAP-453] [Feature] Overwrite api.Column.string_type #665

Open

rlh1994 opened this issue Apr 13, 2023 · 6 comments

Labels: enhancement (New feature or request), help_wanted (Extra attention is needed)

Comments

rlh1994 commented Apr 13, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt-bigquery functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Currently the core dbt api.Column.string_type is not overridden; however, the default value is not suitable for BigQuery data types, since I believe BigQuery only supports the single string type STRING.
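
As a rough sketch of the requested change (the base-class behavior below matches what dbt-core renders today, as the repro later in this thread confirms; the BigQueryColumn override is hypothetical, not actual dbt-bigquery source):

class Column:
    # Mirrors the dbt-core default, which renders a Postgres-style type
    # that BigQuery does not recognize.
    @classmethod
    def string_type(cls, size: int) -> str:
        return "character varying({})".format(size)


class BigQueryColumn(Column):
    # Hypothetical override: BigQuery's native string type is STRING, so
    # the size argument could simply be ignored here.
    @classmethod
    def string_type(cls, size: int) -> str:
        return "string"


print(Column.string_type(4000))          # character varying(4000) -> invalid in BigQuery
print(BigQueryColumn.string_type(4000))  # string                  -> valid in BigQuery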

Describe alternatives you've considered

Using dbt.type_string, but this isn't suitable in all use cases; see dbt-labs/dbt-core#7103.

Who will this benefit?

Cross-warehouse users.

Are you interested in contributing this feature?

No response

Anything else?

No response

rlh1994 added the enhancement and triage labels Apr 13, 2023
github-actions bot changed the title from [Feature] Overwrite api.Column.string_type to [ADAP-453] [Feature] Overwrite api.Column.string_type Apr 13, 2023
dbeatty10 (Contributor) commented Apr 13, 2023

Thanks for opening this issue, @rlh1994!

I know we discussed this briefly here, but could you provide step-by-step instructions to reproduce the error you are experiencing?

rlh1994 (Author) commented Apr 13, 2023

Yeah sure, sorry. As a heads-up, this issue is the same in Databricks (and Spark, I assume) as well. Do you want me to raise a separate issue in those adapters?

The easiest way to see it is with a basic model that just casts something to the string type:

{{
  config(
    materialized = 'table',
    )
}}

select
    cast(10 as {{api.Column.string_type(4000)}}) as test_col

Which, in the case of a BigQuery target, leads to the following compiled code:

select
    cast(10 as character varying(4000)) as test_col

and a dbt run gives the following error output:

 dbt run --target bigquery
17:51:06  Running with dbt=1.4.5
17:51:06  Found 1 model, 0 tests, 0 snapshots, 0 analyses, 337 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
17:51:06  
17:51:07  Concurrency: 1 threads (target='bigquery')
17:51:07  
17:51:07  1 of 1 START sql table model dbt_ryan.test_models .............................. [RUN]
17:51:09  BigQuery adapter: https://console.cloud.google.com/bigquery?project=***&page=queryresults
17:51:09  1 of 1 ERROR creating sql table model dbt_ryan.test_models ..................... [ERROR in 1.85s]
17:51:09  
17:51:09  Finished running 1 table model in 0 hours 0 minutes and 2.48 seconds (2.48s).
17:51:09  
17:51:09  Completed with 1 error and 0 warnings:
17:51:09  
17:51:09  Database Error in model test_models (models/test_models.sql)
17:51:09    Syntax error: Expected ")" but got identifier "varying" at [14:26]
17:51:09    compiled Code at target/run/dbt_demo/models/test_models.sql
17:51:09  
17:51:09  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

Because there is no overridden function, it uses the core class default, which is character varying; that type doesn't exist in BigQuery, so this API call simply errors on BigQuery targets.

I could swap out the api.Column.string_type(4000) for type_string(), since no one really uses length limits in BigQuery and the default string type is already max size. But, as in the other issue, type_string() doesn't give the option to specify the length, which generates a 256-length string on Redshift. So I can't build a model (or macro) that is easily cross-warehouse; instead I have to split the models or dispatch the macro (sketched below).
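
A minimal sketch of that dispatch workaround, using dbt's adapter.dispatch pattern (the macro names are illustrative, not from an actual project):

{% macro portable_string_type(size) %}
  {{ return(adapter.dispatch('portable_string_type')(size)) }}
{% endmacro %}

{% macro default__portable_string_type(size) %}
  {{ return(api.Column.string_type(size)) }}
{% endmacro %}

{% macro bigquery__portable_string_type(size) %}
  string
{% endmacro %}

A model can then call {{ portable_string_type(4000) }} everywhere, falling back to api.Column.string_type on adapters where that works, which is exactly the per-adapter splitting the feature request is trying to avoid.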

dbeatty10 (Contributor) commented Apr 13, 2023

Thanks @rlh1994 -- that is a perfect example 🏆

Based on the documentation for the string_type static class method here, we'd expect the following query to work across adapters:

select cast(10 as {{ api.Column.string_type(4000) }})

Suspected root cause

The primary testing for these two methods (string_type and numeric_type) appears tautological to me, rather than actually exercising platform-specific data types.

Next steps

On my side, I've created an issue in dbt-core to add functional tests for the string_type and numeric_type adapter class methods. From there, the implementation of the adapter-specific method(s) should inherit the (new) functional tests from dbt-core to confirm the fix.
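
For a flavor of what a non-tautological test could look like, here is a minimal sketch using dbt's pytest-based functional-testing utilities (the test class and model name are hypothetical, not the tests actually added in dbt-core):

import pytest
from dbt.tests.util import run_dbt

_MODEL_SQL = """
select cast(10 as {{ api.Column.string_type(4000) }}) as test_col
"""

class TestStringType:
    @pytest.fixture(scope="class")
    def models(self):
        return {"string_type_model.sql": _MODEL_SQL}

    def test_string_type_runs_on_this_adapter(self, project):
        # Rather than comparing string_type() to its own output, actually
        # build a model on the target platform, so an invalid type such as
        # "character varying" fails loudly on BigQuery.
        results = run_dbt(["run"])
        assert len(results) == 1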

Could you raise an issue in dbt-spark as well? Once the fix is merged, dbt-databricks will inherit it.

github-actions bot commented Oct 11, 2023

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions bot added the Stale label Oct 11, 2023
rlh1994 (Author) commented Oct 11, 2023

Nope

dbeatty10 removed the Stale label Oct 11, 2023
dbeatty10 (Contributor) commented Oct 11, 2023

@rlh1994 I just removed the Stale label.
