Releases: legend-exp/legend-pydataobj

v1.5.1

01 Mar 10:36

Important

This patch version was released because of a technical problem with PyPI. Please refer to the release notes for v1.5.0.

v1.5.0

01 Mar 09:35
664881f

What's Changed

Warning

The LH5 I/O routines have been refactored! Some function names have changed and new methods for loading and viewing data have been added. Read the migration guide below for more details. This release is fully backward compatible, but deprecation warnings will show up when using the old methods. Upgrade to the new recommended syntax to suppress them.
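
To illustrate what "backward compatible, but with deprecation warnings" means in practice, here is a minimal sketch using only Python's standard `warnings` module. `old_read()` and `new_read()` are hypothetical stand-ins, not lgdo functions, and it is an assumption that the deprecated lgdo methods emit `DeprecationWarning`; check the API documentation for the actual warning category.

```python
import warnings


def new_read(name):
    """Hypothetical stand-in for a new-style routine."""
    return f"data:{name}"


def old_read(name):
    """Hypothetical stand-in for a deprecated routine that still works."""
    warnings.warn(
        "old_read() is deprecated, use new_read() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_read(name)


# Record warnings to check that the old call still works but warns.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = old_read("obj")

# The new call returns the same result without emitting a warning.
with warnings.catch_warnings(record=True) as caught_new:
    warnings.simplefilter("always")
    new_result = new_read("obj")
```

Running a test suite with `python -W error::DeprecationWarning` turns such warnings into errors, which can help locate remaining uses of the old syntax.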

NEW: the package now offers support for viewing LGDO data (Tables, in particular) as Awkward arrays through the LGDO.view_as() interface. Awkward Array is a library for nested, variable-sized data, including arbitrary-length lists, records, mixed types, and missing data, using NumPy-like idioms.

Please consult the API documentation at https://legend-pydataobj.readthedocs.io to learn about the new methods.

Migration Guide

Imports

LH5 I/O-related routines have been moved to a dedicated subpackage, lgdo.lh5.

Old syntax:

from lgdo.lh5_store import LH5Store, ls
store = LH5Store()
ls("file.lh5")

New recommended syntax:

from lgdo import lh5
store = lh5.LH5Store()
lh5.ls("file.lh5")

Read/write LGDOs to disk

Old syntax:

from lgdo.lh5_store import LH5Store

store = LH5Store()
obj, _ = store.read_object("obj", "file.lh5")  # deprecated name
store.write_object(obj, "obj", "file.lh5")  # deprecated name

New syntax:

from lgdo import lh5

store = lh5.LH5Store()
obj, _ = store.read("obj", "file.lh5")  # replaces read_object()
store.write(obj, "obj", "file.lh5")  # replaces write_object()

Convert LGDO to another format

LGDO.view_as() is the new recommended way to view LGDOs in alternative formats (Pandas, NumPy, Awkward...), i.e. without performing a copy.

Old syntax:

table = Table(...)
table.get_dataframe()

New syntax:

table.view_as("pd")

Old syntax:

from lgdo.lh5_store import load_nda, load_dfs
load_nda("file.lh5", ["obj"])
load_dfs("file.lh5", ["tbl"])

New syntax:

from lgdo import lh5
lh5.read_as("obj", "file.lh5", library="np")
lh5.read_as("tbl", "file.lh5", library="pd")

New syntax (longer alternative):

from lgdo import lh5
store = lh5.LH5Store()

obj, _ = store.read("obj", "file.lh5")
obj.view_as("np")

tbl, _ = store.read("tbl", "file.lh5")
tbl.view_as("pd")
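
When migrating a larger code base, it can help to locate the old names first. The following is a minimal sketch that uses only the standard library. The `RENAMES` table is built from the renames listed in this guide, and `find_deprecated()` is a hypothetical helper, not part of lgdo. It is a plain-text search, so it will also flag old names that appear in comments or strings.

```python
import re

# Old name -> suggested replacement, taken from the migration guide above.
RENAMES = {
    "lgdo.lh5_store": "lgdo.lh5",
    "read_object": "read",
    "write_object": "write",
    "get_dataframe": 'view_as("pd")',
    "load_nda": 'lh5.read_as(..., library="np")',
    "load_dfs": 'lh5.read_as(..., library="pd")',
}

# Match any old name as a whole word (dots in module paths are escaped).
_PATTERN = re.compile(r"\b(" + "|".join(re.escape(k) for k in RENAMES) + r")\b")


def find_deprecated(source):
    """Return (line_number, old_name, suggestion) for each old name found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits


example = '''from lgdo.lh5_store import LH5Store
store = LH5Store()
obj, _ = store.read_object("obj", "file.lh5")
'''

for lineno, old, new in find_deprecated(example):
    print(f"line {lineno}: {old} -> {new}")
```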

Full list of changes

  • Fixed bug in LH5Iterator when number of entries for file is zero by @iguinn in #39
  • Refactor of LH5 I/O routines, deprecation of existing methods by @MoritzNeuberger in #24
  • Support (environment) variables for tweaking Numba at runtime by @gipert in #44
  • Add vectorized operations to VectorOfVectors by @iguinn in #42
  • Add LGDO format conversion utilities by @MoritzNeuberger in #30
  • Added depth option to show and lh5ls by @iguinn in #52
  • Reimplement Table.eval(), now handling VectorOfVectors by @gipert in #53
  • Deprecate load_nda() and load_dfs() in favour of .view_as() by @gipert in #56
  • Support setting a fill value when "exploding" VectorOfVectors into NumPy arrays in .view_as("np") by @gipert in #57
  • Migrate to pyproject.toml, upgrade pre-commit config by @gipert in #59
  • Fix for reading just first row of VectorOfVectors by @ggmarshall in #63
  • Feature: lh5.read_as() to read LH5 data straight into third party data views by @gipert in #62
  • Added warning when adding a column to a table with different length by @MoritzNeuberger in #58
  • Add first version of CITATION.cff by @gipert in #64
  • Bug fix in LH5Store.read(): check for n_rows longer than idxs before dropping by @ggmarshall in #65
  • Bugfix for varlen error msgs and specify nda in view_as "ak" so dtype correctly inferred by @ggmarshall in #67
  • Add Patrick to CITATION.cff by @gipert in #68
  • Table.view_as() performance fixes by @gipert in #70
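
One of the items above mentions a fill value used when "exploding" a VectorOfVectors into a rectangular NumPy array. As a conceptual illustration only, the sketch below pads variable-length rows, assuming they are stored as flattened data plus cumulative lengths. It uses plain Python lists instead of NumPy, and `explode()` with its `fill_value` parameter is a hypothetical helper, not the lgdo implementation or its signature.

```python
def explode(flattened_data, cumulative_length, fill_value=float("nan")):
    """Pad variable-length rows into a rectangular list of lists."""
    rows = []
    start = 0
    # Each cumulative length marks where a row ends in the flattened data.
    for stop in cumulative_length:
        rows.append(list(flattened_data[start:stop]))
        start = stop
    # Pad every row up to the length of the longest one.
    width = max((len(row) for row in rows), default=0)
    return [row + [fill_value] * (width - len(row)) for row in rows]


# Three rows of lengths 2, 0 and 3, stored back to back.
flattened = [1, 2, 3, 4, 5]
cumlen = [2, 2, 5]
padded = explode(flattened, cumlen, fill_value=-1)
```

The choice of fill value matters downstream: a sentinel such as -1 can collide with real data, whereas NaN is only available for floating-point types.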

Full Changelog: v1.4.2...v1.5.0

v1.5.0a5

03 Feb 23:12
b83c718
Pre-release

What's Changed

  • Fixed bug in LH5Iterator when number of entries for file is zero by @iguinn in #39
  • Refactor of LH5 I/O routines, deprecation of existing methods by @MoritzNeuberger in #24
  • Support (environment) variables for tweaking Numba at runtime by @gipert in #44
  • Bump pypa/gh-action-pypi-publish from 1.8.10 to 1.8.11 by @dependabot in #43
  • Add vectorized operations to VectorOfVectors by @iguinn in #42
  • Bump actions/checkout from 2 to 4 by @dependabot in #46
  • Bump actions/setup-python from 2 to 5 by @dependabot in #47
  • Add LGDO format conversion utilities by @MoritzNeuberger in #30
  • Added depth option to show and lh5ls by @iguinn in #52
  • chore: update pre-commit hooks by @pre-commit-ci in #51
  • Bump actions/upload-artifact from 3 to 4 by @dependabot in #50
  • Bump actions/download-artifact from 3 to 4 by @dependabot in #49
  • Reimplement Table.eval(), now handling VectorOfVectors by @gipert in #53
  • Deprecate load_nda() and load_dfs() in favour of .view_as() by @gipert in #56
  • Support setting a fill value when "exploding" VectorOfVectors into NumPy arrays in .view_as("np") by @gipert in #57
  • Migrate to pyproject.toml, upgrade pre-commit config by @gipert in #59
  • Fix for reading just first row of VectorOfVectors by @ggmarshall in #63
  • Feature: lh5.read_as() to read LH5 data straight into third party data views by @gipert in #62
  • Added warning when adding a column to a table with different length by @MoritzNeuberger in #58
  • Add first version of CITATION.cff by @gipert in #64
  • Bug fix in LH5Store.read(): check for n_rows longer than idxs before dropping by @ggmarshall in #65
  • Bugfix for varlen error msgs and specify nda in view_as "ak" so dtype correctly inferred by @ggmarshall in #67
  • Bump codecov/codecov-action from 3 to 4 by @dependabot in #66

Full Changelog: v1.4.2...v1.5.0a5

v1.5.0a4

30 Jan 13:35
6637343
Pre-release

What's Changed

  • Support setting a fill value when "exploding" VectorOfVectors into NumPy arrays in .view_as("np") by @gipert in #57
  • Migrate to pyproject.toml, upgrade pre-commit config by @gipert in #59
  • Fix for reading just first row of VectorOfVectors by @ggmarshall in #63
  • Feature: lh5.read_as() to read LH5 data straight into third party data views by @gipert in #62
  • Added warning when adding a column to a table with different length by @MoritzNeuberger in #58
  • Add first version of CITATION.cff by @gipert in #64
  • Bug fix in LH5Store.read(): check for n_rows longer than idxs before dropping by @ggmarshall in #65

Full Changelog: v1.5.0a3...v1.5.0a4

v1.5.0a3

11 Jan 16:48
579cea6
Pre-release

What's Changed

  • Deprecate load_nda() and load_dfs() in favour of .view_as() by @gipert in #56

Full Changelog: v1.5.0a2...v1.5.0a3

v1.5.0a2

11 Jan 10:29
dd9fb90
Pre-release

What's Changed

  • Added depth option to show and lh5ls by @iguinn in #52
  • Reimplement Table.eval(), now handling VectorOfVectors by @gipert in #53

Full Changelog: v1.5.0a1...v1.5.0a2

v1.5.0a1

30 Dec 17:01
021f397
Pre-release

What's Changed

  • Fixed bug in LH5Iterator when number of entries for file is zero by @iguinn in #39
  • Refactor of LH5 I/O routines, deprecation of existing methods by @MoritzNeuberger in #24
  • Support (environment) variables for tweaking Numba at runtime by @gipert in #44
  • Add vectorized operations to VectorOfVectors by @iguinn in #42
  • Add LGDO format conversion utilities by @MoritzNeuberger in #30

Full Changelog: v1.4.1...v1.5.0a1

v1.4.2

01 Dec 16:36

Full Changelog: v1.4.1...v1.4.2

v1.4.1

13 Nov 12:49
ad51868

What's Changed

  • Bug fix: output object type check in zig-zag encoder, needed for loading multiple files by @ggmarshall in #38
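
For readers unfamiliar with the term, zig-zag encoding maps signed integers to unsigned ones so that values of small magnitude get small codes (0, -1, 1, -2 map to 0, 1, 2, 3). The sketch below shows the general technique for arbitrary-size Python integers. It is not the encoder shipped in this package, and the function names are illustrative.

```python
def zigzag_encode(n):
    """Map a signed integer to a non-negative one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (n << 1) if n >= 0 else ((-n) << 1) - 1


def zigzag_decode(z):
    """Invert zigzag_encode()."""
    # The lowest bit carries the sign; the remaining bits carry the magnitude.
    return (z >> 1) ^ -(z & 1)


encoded = [zigzag_encode(n) for n in (0, -1, 1, -2, 2)]
decoded = [zigzag_decode(z) for z in encoded]
```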

Full Changelog: v1.4.0...v1.4.1

v1.4.0

10 Nov 15:30
7dfaf79

What's Changed

  • Enable more fine grained control over h5py.create_dataset() options and set default HDF5 compression to shuffle + GZip by @gipert in #34
  • improve read_object speed when passing idx by @lvarriano in #35

Full Changelog: v1.3.0...v1.4.0