Releases: legend-exp/legend-pydataobj
v1.5.1
Important
Releasing this patch version because of a technical problem with PyPI. Please refer to release notes for v1.5.0.
v1.5.0
What's Changed
Warning
The LH5 I/O routines have been refactored! Some function names have changed and new methods for loading and viewing data have been added. Read the migration guide below for more details. This release is fully backward compatible, but deprecation warnings will show up when using the old methods. Upgrade to the new recommended syntax to suppress them.
NEW: the package now offers support for viewing LGDO data (Tables, in particular) as Awkward arrays through the LGDO.view_as() interface. Awkward Array is a library for nested, variable-sized data, including arbitrary-length lists, records, mixed types, and missing data, using NumPy-like idioms.
Please consult the API documentation on https://legend-pydataobj.readthedocs.io to learn about the new methods.
Migration Guide
Imports
LH5 I/O related routines have been moved to a dedicated subpackage: lgdo.lh5
Old syntax:
from lgdo.lh5_store import LH5Store, ls
store = LH5Store()
ls("file.lh5")
New recommended syntax:
from lgdo import lh5
store = lh5.LH5Store()
lh5.ls("file.lh5")
Read/write LGDOs to disk
Old syntax:
store = LH5Store()
obj, _ = store.read_object("obj", "file.lh5")
store.write_object(obj, "obj", "file.lh5")
New syntax:
store = lh5.LH5Store()
obj, _ = store.read("obj", "file.lh5")
store.write(obj, "obj", "file.lh5")
Convert LGDO to another format
LGDO.view_as() is the new recommended way to view LGDOs in alternative formats (Pandas, NumPy, Awkward...) without performing a copy.
Old syntax:
table = Table(...)
table.get_dataframe()
New syntax:
table.view_as("pd")
Old syntax:
from lgdo.lh5_store import load_nda, load_dfs
load_nda("file.lh5", ["obj"])
load_dfs("file.lh5", ["tbl"])
New syntax:
from lgdo import lh5
lh5.read_as("obj", "file.lh5", library="np")
lh5.read_as("tbl", "file.lh5", library="pd")
New syntax (longer alternative):
from lgdo import lh5
store = lh5.LH5Store()
obj, _ = store.read("obj", "file.lh5")
obj.view_as("np")
tbl, _ = store.read("tbl", "file.lh5")
tbl.view_as("pd")
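When migrating a larger code base, it can help to collect the deprecation warnings programmatically instead of reading them off the terminal. The following is a minimal sketch using only the standard library warnings module. It assumes the old methods emit DeprecationWarning or FutureWarning (the notes above do not say which category is used), and read_object_old_style() is a hypothetical stand-in for a call to a deprecated routine, not part of lgdo.

```python
import warnings


def read_object_old_style():
    """Hypothetical stand-in for a deprecated call such as store.read_object()."""
    warnings.warn(
        "read_object() is deprecated, use read() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return "data"


def find_deprecated_calls(func):
    """Run func and return the messages of any deprecation warnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        # Make sure no active filter hides the warnings we are looking for.
        warnings.simplefilter("always")
        func()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, (DeprecationWarning, FutureWarning))
    ]


messages = find_deprecated_calls(read_object_old_style)
print(messages)
```

Alternatively, running a script with python -W error::DeprecationWarning turns every such warning into an exception, so the traceback points at the call site that still uses the old syntax.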
Full list of changes
- Fixed bug in LH5Iterator when number of entries for file is zero by @iguinn in #39
- Refactor of LH5 I/O routines, deprecation of existing methods by @MoritzNeuberger in #24
- Support (environment) variables for tweaking Numba at runtime by @gipert in #44
- Add vectorized operations to VectorOfVectors by @iguinn in #42
- Add LGDO format conversion utilities by @MoritzNeuberger in #30
- Added depth option to show and lh5ls by @iguinn in #52
- Reimplement Table.eval(), now handling VectorOfVectors by @gipert in #53
- Deprecate load_nda() and load_dfs() in favour of .view_as() by @gipert in #56
- Support setting a fill value when "exploding" VectorOfVectors into NumPy arrays in .view_as("np") by @gipert in #57
- Migrate to pyproject.toml, upgrade pre-commit config by @gipert in #59
- Fix for reading just first row of VectorOfVectors by @ggmarshall in #63
- Feature: lh5.read_as() to read LH5 data straight into third party data views by @gipert in #62
- Added warning when adding a column to a table with different length by @MoritzNeuberger in #58
- Add first version of CITATION.cff by @gipert in #64
- Bug fix in LH5Store.read(): check for n_rows longer than idxs before dropping by @ggmarshall in #65
- Bugfix for varlen error msgs and specify nda in view_as "ak" so dtype correctly inferred by @ggmarshall in #67
- Add Patrick to CITATION.cff by @gipert in #68
- Table.view_as() performance fixes by @gipert in #70
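The fill value entry in the list above concerns turning a ragged VectorOfVectors into a rectangular array: shorter vectors have to be padded with something. The sketch below illustrates the idea in pure Python. It is a conceptual illustration, not the library's implementation. The function explode() and its parameter names are hypothetical, and the input layout (one flat list of values plus cumulative lengths marking where each vector ends) is an assumption made for the example.

```python
def explode(flattened_data, cumulative_length, fill_value=float("nan")):
    """Pad a ragged list of vectors into a rectangular list of rows."""
    rows = []
    start = 0
    for stop in cumulative_length:
        # Each vector spans flattened_data[start:stop].
        rows.append(list(flattened_data[start:stop]))
        start = stop
    # The rectangular width is the length of the longest vector.
    width = max((len(r) for r in rows), default=0)
    return [r + [fill_value] * (width - len(r)) for r in rows]


flat = [1, 2, 3, 4, 5, 6]
cumlen = [2, 3, 6]  # vectors: [1, 2], [3], [4, 5, 6]
padded = explode(flat, cumlen, fill_value=0)
print(padded)
```

A fill value such as NaN keeps padded entries distinguishable from real data in floating-point arrays, while integer data needs an explicit sentinel chosen by the user.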
New Contributors
- @MoritzNeuberger made their first contribution in #24
Full Changelog: v1.4.2...v1.5.0
v1.5.0a5
What's Changed
- Fixed bug in LH5Iterator when number of entries for file is zero by @iguinn in #39
- Refactor of LH5 I/O routines, deprecation of existing methods by @MoritzNeuberger in #24
- Support (environment) variables for tweaking Numba at runtime by @gipert in #44
- Bump pypa/gh-action-pypi-publish from 1.8.10 to 1.8.11 by @dependabot in #43
- Add vectorized operations to VectorOfVectors by @iguinn in #42
- Bump actions/checkout from 2 to 4 by @dependabot in #46
- Bump actions/setup-python from 2 to 5 by @dependabot in #47
- Add LGDO format conversion utilities by @MoritzNeuberger in #30
- Added depth option to show and lh5ls by @iguinn in #52
- chore: update pre-commit hooks by @pre-commit-ci in #51
- Bump actions/upload-artifact from 3 to 4 by @dependabot in #50
- Bump actions/download-artifact from 3 to 4 by @dependabot in #49
- Reimplement Table.eval(), now handling VectorOfVectors by @gipert in #53
- Deprecate load_nda() and load_dfs() in favour of .view_as() by @gipert in #56
- Support setting a fill value when "exploding" VectorOfVectors into NumPy arrays in .view_as("np") by @gipert in #57
- Migrate to pyproject.toml, upgrade pre-commit config by @gipert in #59
- Fix for reading just first row of VectorOfVectors by @ggmarshall in #63
- Feature: lh5.read_as() to read LH5 data straight into third party data views by @gipert in #62
- Added warning when adding a column to a table with different length by @MoritzNeuberger in #58
- Add first version of CITATION.cff by @gipert in #64
- Bug fix in LH5Store.read(): check for n_rows longer than idxs before dropping by @ggmarshall in #65
- Bugfix for varlen error msgs and specify nda in view_as "ak" so dtype correctly inferred by @ggmarshall in #67
- Bump codecov/codecov-action from 3 to 4 by @dependabot in #66
New Contributors
- @MoritzNeuberger made their first contribution in #24
Full Changelog: v1.4.2...v1.5.0a5
v1.5.0a4
What's Changed
- Support setting a fill value when "exploding" VectorOfVectors into NumPy arrays in .view_as("np") by @gipert in #57
- Migrate to pyproject.toml, upgrade pre-commit config by @gipert in #59
- Fix for reading just first row of VectorOfVectors by @ggmarshall in #63
- Feature: lh5.read_as() to read LH5 data straight into third party data views by @gipert in #62
- Added warning when adding a column to a table with different length by @MoritzNeuberger in #58
- Add first version of CITATION.cff by @gipert in #64
- Bug fix in LH5Store.read(): check for n_rows longer than idxs before dropping by @ggmarshall in #65
Full Changelog: v1.5.0a3...v1.5.0a4
v1.5.0a3
v1.5.0a2
v1.5.0a1
What's Changed
- Fixed bug in LH5Iterator when number of entries for file is zero by @iguinn in #39
- Refactor of LH5 I/O routines, deprecation of existing methods by @MoritzNeuberger in #24
- Support (environment) variables for tweaking Numba at runtime by @gipert in #44
- Add vectorized operations to VectorOfVectors by @iguinn in #42
- Add LGDO format conversion utilities by @MoritzNeuberger in #30
Minor changes
- Bump pypa/gh-action-pypi-publish from 1.8.10 to 1.8.11 by @dependabot in #43
- Bump actions/checkout from 2 to 4 by @dependabot in #46
- Bump actions/setup-python from 2 to 5 by @dependabot in #47
New Contributors
- @MoritzNeuberger made their first contribution in #24
Full Changelog: v1.4.1...v1.5.0a1
v1.4.2
Full Changelog: v1.4.1...v1.4.2
v1.4.1
What's Changed
- Bug fix: output object type check in zig-zag encoder, needed for loading multiple files by @ggmarshall in #38
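For readers unfamiliar with the term, zig-zag encoding maps signed integers onto unsigned ones so that values of small magnitude, positive or negative, become small unsigned numbers. The snippet below is a generic sketch of that mapping for Python integers. The function names are hypothetical and the code is not taken from the package's encoder.

```python
def zigzag_encode(n: int) -> int:
    """Map a signed integer to an unsigned one: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def zigzag_decode(z: int) -> int:
    """Invert zigzag_encode()."""
    return (z >> 1) ^ -(z & 1)


encoded = [zigzag_encode(n) for n in (0, -1, 1, -2, 2)]
decoded = [zigzag_decode(z) for z in encoded]
print(encoded, decoded)
```

The mapping pays off when combined with a variable-length integer format, since small unsigned values then occupy fewer bytes.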
New Contributors
- @ggmarshall made their first contribution in #38
Full Changelog: v1.4.0...v1.4.1
v1.4.0
What's Changed
- Enable more fine grained control over h5py.create_dataset() options and set default HDF5 compression to shuffle + GZip by @gipert in #34
- improve read_object speed when passing idx by @lvarriano in #35
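As background on the new default, a shuffle filter regroups the bytes of fixed-size elements so that all first bytes come first, then all second bytes, and so on. This often makes slowly varying numeric data easier for GZip to compress. The sketch below imitates the idea with the standard library only (zlib implements the same DEFLATE algorithm that GZip uses). It does not touch HDF5 or h5py, and shuffle() and unshuffle() are hypothetical helper names.

```python
import struct
import zlib


def shuffle(raw: bytes, itemsize: int) -> bytes:
    """Group the i-th byte of every element together (byte-shuffle idea)."""
    return b"".join(raw[i::itemsize] for i in range(itemsize))


def unshuffle(shuffled: bytes, itemsize: int) -> bytes:
    """Invert shuffle(); len(shuffled) must be a multiple of itemsize."""
    n_items = len(shuffled) // itemsize
    out = bytearray(len(shuffled))
    for i in range(itemsize):
        out[i::itemsize] = shuffled[i * n_items:(i + 1) * n_items]
    return bytes(out)


# Slowly varying 32-bit integers, packed little-endian.
values = list(range(1000, 5000))
raw = struct.pack(f"<{len(values)}i", *values)
shuffled = shuffle(raw, itemsize=4)

# Compare compressed sizes with and without the shuffle step.
plain_size = len(zlib.compress(raw))
shuffled_size = len(zlib.compress(shuffled))
print(plain_size, shuffled_size)
```

The shuffle step is lossless and cheap to invert, which is why it is commonly paired with a general-purpose compressor rather than used on its own.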
Full Changelog: v1.3.0...v1.4.0