Releases: dbt-labs/dbt-spark

dbt-spark 0.18.0

18 Sep 17:09 · 7b3ac5b

This is a minor release to support new functionality in dbt v0.18.0 (Marian Anderson).

Quality of life

  • The dbt-spark plugin uses a new testing framework, dbt-adapter-tests. Integration tests now run on both Apache Spark and Databricks.
  • For first-time users, dbt init --adapter spark will create ~/.dbt/profiles.yml and populate it with Spark-appropriate targets (#98)

dbt-spark 0.17.2

05 Aug 21:11 · 0e93565

This is a bugfix release that tracks dbt v0.17.2. It includes no changes to plugin-specific functionality.

dbt-spark 0.17.1

20 Jul 19:13 · 1bf1ea9

This is a bugfix release that tracks dbt v0.17.1. It includes no changes to plugin-specific functionality.

Noteworthy changes

  • dbt-core: dbt native rendering is no longer enabled by default, as it was in v0.17.0; it now requires the opt-in filters as_native, as_bool, and as_number. This resolves a v0.17.0 regression affecting numeric values of the organization property when connecting to Azure Databricks. (#2612, #2618)
  • dbt-docs: Fix appearance of relation names with null databases (i.e. all Spark projects) (#96)

dbt-spark 0.17.0

10 Jun 22:40 · 21ea42d

This is a minor release that tracks dbt==0.17.0.

Breaking Changes

  • Always schema, never database. The plugin now disallows setting the database property in node configs or target profiles; all values of database are None. In Apache Spark, a relational object's namespace has only two components: the database/schema (the terms are interchangeable) in which it lives and the object identifier. The plugin uses the schema property exclusively to control where Spark objects are created, as sketched below. (#83, #92)
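
A minimal sketch of a model that sets only schema (the model name, schema name, and upstream ref are hypothetical, not from the release notes):

```sql
-- models/fct_orders.sql
-- schema controls which Spark database/schema the object is created in;
-- database must be left unset, since this release disallows it.
{{ config(schema='marts') }}

select * from {{ ref('stg_orders') }}
```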

Features

  • Add support for dbt snapshot (Delta Lake only) (#76)
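
A minimal snapshot sketch (the snapshot name, source, and columns are hypothetical; file_format='delta' reflects the Delta-only support noted above):

```sql
-- snapshots/orders_snapshot.sql
{% snapshot orders_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='order_id',
        strategy='timestamp',
        updated_at='updated_at',
        file_format='delta'   -- snapshots are supported on Delta Lake only
    )
}}

select * from {{ source('app', 'orders') }}

{% endsnapshot %}
```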

Fixes

  • The auto-generated docs site (#92):
    • Links the data catalog (information gleaned from the database) to the manifest of models, sources, seeds, and snapshots.
    • Includes metadata about object owner and table size (if available). The groundwork of this feature was originally included in the 0.15.3 release; this information is now visible in the docs site.

Quality of life

  • Prettify the README (#82)
  • Improved error message when running against an older, unsupported Spark version (#87)

dbt-spark 0.16.1

27 Apr 21:08 · 6d59203

This is a bugfix release that tracks dbt==0.16.1.

Fixes

  • dbt docs generate returned an error in dbt-spark==0.16.0 due to a breaking change in dbt-core. This release fixes docs generation by reimplementing the catalog methods. Docs generation is now an integration testing step, to catch breaking changes like this in future releases.

dbt-spark 0.16.0

15 Apr 18:27

This is a minor release that tracks dbt==0.16.0. It includes no changes to plugin-specific functionality.

Under the hood

  • Remove duplicated macros, which caused a compilation error in dbt v0.16.0 (#69)
  • Pin dbt-core==0.16.0 (#69)

dbt-spark 0.15.3

23 Mar 23:02 · b48e144

This release contains a wide array of features, fixes, and quality-of-life improvements. It brings the Spark plugin closer to parity with core dbt functionality. It tracks dbt==0.15.3.

Features

  • Add a merge strategy for incremental models stored in the Delta file format, as shown in the example after this list (#65)
  • Use create or replace view for atomic replacement of models materialized as views (#65)
  • Include object owner and table statistics in auto-generated docs site (#39, #41)
  • Add location, clustered_by, and persist_docs as model configs (#43)
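
A sketch combining the merge strategy with the new model configs (the source, columns, and location path are hypothetical; buckets is assumed here as the companion setting to clustered_by):

```sql
-- models/events_incremental.sql
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',      -- new: merge for Delta-format models
        file_format='delta',
        unique_key='event_id',             -- hypothetical merge key
        location='/mnt/lake/events',       -- new config; path is hypothetical
        clustered_by=['event_id'],         -- new config
        buckets=8,                         -- assumed companion to clustered_by
        persist_docs={'relation': true}    -- new config
    )
}}

select * from {{ source('app', 'events') }}  -- hypothetical source

{% if is_incremental() %}
  -- only pull rows newer than what is already in the target table
  where event_ts > (select max(event_ts) from {{ this }})
{% endif %}
```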

Fixes

  • Reimplement get_relation to support dropping and recreating objects with custom schema/database (#52)
  • Insert columns in same order as existing table for insert_overwrite incremental strategy (#60)

Quality of life

  • Add docker-compose environment for containerized local Spark. Reimplement integration tests to run in docker (#58, #62, #64)
  • Add support for creating and dropping target schema/database (#40)
  • Faster metadata for multiple relations using show table extended in [database] like '*' (#50, #54)
  • Add an organization config to support Azure Databricks (#34)
  • Allow overriding hard-coded Spark configs with a pre-hook in incremental models; see the sketch after this list (#37)
  • Clearer requirements and instructions for pip installation (#44)
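
For example, a pre-hook can set a session-level Spark config before the incremental insert runs (a sketch; the particular setting and model are illustrative only):

```sql
-- models/daily_events.sql
{{
    config(
        materialized='incremental',
        pre_hook='set spark.sql.sources.partitionOverwriteMode = DYNAMIC'
    )
}}

select * from {{ source('app', 'events') }}  -- hypothetical source
```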

Under the hood

  • Replace JSON Schemas with data classes (#65)
  • Specify upper bound for jinja2 (#56)

Contributors

Thank you to all members of the dbt + Spark community for your input, insight, and assistance with testing these changes.