Releases: dask-contrib/dask-deltatable

Dask-deltatable v0.3.3

02 Oct 09:32
3ff657b

Fix imports to work with deltalake 0.20 #84

Dask-deltatable v0.3.2

02 Oct 08:25
90f21c9

Bug fixes

Fix the argument order of the to_deltalake call in the example (#77)
Fix up some mypy errors (#76)
Sort filenames (#71)
Fix mypy error (#62)

Project hygiene

Synchronise with dask-expr, newer Dask and newer deltalake (#69)
Support auto-setting AWS credentials for storage options (#78)
Compatibility with latest dask, pyarrow and deltalake (#68)
Add path to tokenization (#67)
Clarify readme for reading in deltalake (#66)
Add conda installation instructions to README (#6)
Add URL to setuptools metadata (#60)

Dask-deltatable v0.3.1

24 Jul 17:16

This version contains a patch that fixes a problem when reading datasets on a distributed cluster.

Dask-deltatable v0.3

14 Jul 12:39

New Features and Enhancements

  • More efficient Dask Graph generation (#24)
  • Transactional write support for append-only write operations with to_deltalake (#29)
  • Reader now supports partition pruning to only load files that match the provided filters (#30)
  • DAT reader acceptance testing against spark generated data (#47)
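
The idea behind the partition pruning in #30 can be pictured with a small, library-independent sketch. It is not the dask-deltatable implementation: the helper names `partition_values` and `prune_files`, the hive-style `key=value` path layout, and the `(column, op, value)` filter tuples are all assumptions made for illustration. Only files whose partition values satisfy every filter are kept, so the remaining files never need to be opened.

```python
import operator

# Comparison operators supported by this sketch (an assumption, not the library's list).
OPS = {
    "=": operator.eq,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def partition_values(path):
    """Extract hive-style key=value segments from a file path (hypothetical helper)."""
    values = {}
    for segment in path.split("/"):
        if "=" in segment:
            key, _, value = segment.partition("=")
            values[key] = value
    return values


def prune_files(paths, filters):
    """Keep only paths whose partition values satisfy every (column, op, value) filter.

    Values are compared as strings, so ordering operators are lexicographic here.
    """
    kept = []
    for path in paths:
        parts = partition_values(path)
        if all(
            col in parts and OPS[op](parts[col], str(value))
            for col, op, value in filters
        ):
            kept.append(path)
    return kept


files = [
    "year=2022/month=12/part-0.parquet",
    "year=2023/month=01/part-0.parquet",
    "year=2023/month=02/part-0.parquet",
]
selected = prune_files(files, [("year", "=", 2023), ("month", "!=", "01")])
print(selected)  # ['year=2023/month=02/part-0.parquet']
```

A real reader would take partition values from the Delta transaction log rather than from path names; the path parsing here only keeps the sketch self-contained.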

Breaking changes

Dask and delta-rs integration

14 Oct 07:21
68dce7f

This release builds a wrapper around the Rust package called delta-rs and uses dask for parallel reading.

Features:

  1. Reads the Parquet files listed in the Delta log in parallel using Dask
  2. Supports three filesystems: s3, azurefs, and gcsfs
  3. Supports some delta features like
    • Time Travel
    • Schema evolution
    • parquet filters
      • row filter
      • partition filter
  4. Query Delta commit info - History
  5. Vacuum old or unused Parquet files
  6. Load different versions of the data by DateTime
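
Loading a version by DateTime amounts to resolving a timestamp to a table version. The following is a minimal sketch of that resolution step only, assuming a commit history given as `(version, commit_time)` pairs sorted by time; the function name `version_at` and the sample history are hypothetical and do not come from this package or from delta-rs.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def version_at(history, when):
    """Return the latest version committed at or before `when`.

    `history` is a list of (version, commit_time) pairs sorted by commit_time.
    """
    times = [commit_time for _, commit_time in history]
    # Index of the first commit strictly later than `when`.
    idx = bisect_right(times, when)
    if idx == 0:
        raise ValueError("no version exists at or before the requested time")
    return history[idx - 1][0]


# Hypothetical commit history, for illustration only.
history = [
    (0, datetime(2021, 10, 1, tzinfo=timezone.utc)),
    (1, datetime(2021, 10, 5, tzinfo=timezone.utc)),
    (2, datetime(2021, 10, 9, tzinfo=timezone.utc)),
]
resolved = version_at(history, datetime(2021, 10, 6, tzinfo=timezone.utc))
print(resolved)  # 1
```

Once a version number is resolved this way, reading proceeds exactly as for an explicitly requested version.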

DeltaTable reader using Dask

13 Sep 17:41
Pre-release

  1. Reads the Delta table in parallel using Dask
  2. Can read from different filesystems such as S3, Azurefs, and gcsfs
  3. Supports some delta features like
    - Time Travel
    - Schema evolution
    - parquet filters like row and partition filters.