
h5py group

Xu Gang edited this page Jan 21, 2021 · 7 revisions

Groups


Groups are the container mechanism by which HDF5 files are organized. From a Python perspective, they operate somewhat like dictionaries. In this case the “keys” are the names of group members, and the “values” are the members themselves (Group and Dataset objects).

Group objects also contain most of the machinery which makes HDF5 useful. The File object does double duty as the HDF5 root group, and serves as your entry point into the file:

>>> import h5py
>>> f = h5py.File('foo.hdf5','w')
>>> f.name
'/'
>>> list(f.keys())
[]
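
As a minimal sketch of the dictionary-like behaviour described above, the snippet below populates a file and inspects its root group. The file name and the member names `measurements` and `answer` are hypothetical, and the in-memory `core` driver is an assumption made so that nothing is written to disk.

```python
import h5py

# In-memory file: the "core" driver with backing_store=False writes nothing to disk.
with h5py.File("sketch_root.hdf5", "w", driver="core", backing_store=False) as f:
    # The File object is itself a Group (the root group).
    file_is_group = isinstance(f, h5py.Group)
    root_name = f.name

    f.create_group("measurements")   # hypothetical member names
    f["answer"] = 42                 # stored as a scalar dataset

    # keys() yields member names; items() yields (name, object) pairs.
    member_names = sorted(f.keys())
    member_kinds = {name: type(obj).__name__ for name, obj in f.items()}
```

Sorting the keys keeps the result independent of the order in which HDF5 happens to iterate over members.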

Creating groups

New groups are easy to create:

>>> grp = f.create_group("bar")
>>> grp.name
'/bar'
>>> subgrp = grp.create_group("baz")
>>> subgrp.name
'/bar/baz'

Multiple intermediate groups can also be created implicitly:

>>> grp2 = f.create_group("/some/long/path")
>>> grp2.name
'/some/long/path'
>>> grp3 = f['/some/long']
>>> grp3.name
'/some/long'
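
To illustrate the implicit creation of intermediate groups, here is a short sketch that checks each level of the path after a single `create_group` call. The file name is hypothetical and the in-memory `core` driver is an assumption used to avoid touching the disk.

```python
import h5py

with h5py.File("sketch_paths.hdf5", "w", driver="core", backing_store=False) as f:
    leaf = f.create_group("/some/long/path")

    # Every intermediate group exists and can be tested with "in".
    intermediates_exist = all(
        p in f for p in ("/some", "/some/long", "/some/long/path")
    )

    # A relative lookup from an intermediate group reaches the same object.
    same_object = f["/some/long"]["path"] == leaf

    # .parent walks one level back up the hierarchy.
    parent_name = leaf.parent.name
```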

Dict interface and links

Groups implement a subset of the Python dictionary convention. They have methods like keys(), values() and support iteration. Most importantly, they support the indexing syntax, and standard exceptions:

>>> subgrp["MyDS"] = 'MyDS'
>>> myds = subgrp["MyDS"]
>>> missing = subgrp["missing"]
KeyError: "Name doesn't exist (Symbol table: Object not found)"

Objects can be deleted from the file using the standard syntax:

>>> del subgrp["MyDS"]
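
The dictionary-style operations can be combined as in the following sketch, which also uses the `in` operator, `len()` and `Group.get()` with a default value. The file name and the fallback string are hypothetical, and the in-memory `core` driver is an assumption.

```python
import h5py

with h5py.File("sketch_dict.hdf5", "w", driver="core", backing_store=False) as f:
    subgrp = f.create_group("bar/baz")
    subgrp["MyDS"] = "MyDS"          # assigning data creates a dataset

    has_ds = "MyDS" in subgrp

    # get() returns the default instead of raising for a missing name.
    fallback = subgrp.get("missing", "no such member")

    # Plain indexing raises KeyError for a missing name.
    try:
        subgrp["missing"]
        lookup_failed = False
    except KeyError:
        lookup_failed = True

    count_before = len(subgrp)
    del subgrp["MyDS"]               # removes the link from the group
    count_after = len(subgrp)
    still_there = "MyDS" in subgrp
```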

Hard links

What happens when assigning an object to a name in the group? It depends on the type of object being assigned. For NumPy arrays or other data, the default is to create an HDF5 dataset:

>>> grp["name"] = 42
>>> out = grp["name"]
>>> out
<HDF5 dataset "name": shape (), type "<i8">

When the object being stored is an existing Group or Dataset, a new link is made to the object:

>>> grp["other name"] = out
>>> grp["other name"]
<HDF5 dataset "other name": shape (), type "<i8">

Note that this is not a copy of the dataset! Like hard links in a UNIX file system, objects in an HDF5 file can be stored in multiple groups:

>>> grp["other name"] == grp["name"]
True
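
One way to see that both names refer to a single object is to write through one name and read through the other, as in this sketch. The file name and the value 7 are hypothetical, and the in-memory `core` driver is an assumption.

```python
import h5py

with h5py.File("sketch_hardlink.hdf5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("bar")
    grp["name"] = 42
    grp["other name"] = grp["name"]      # second hard link, not a copy

    same_object = grp["other name"] == grp["name"]

    # Writing through one name is visible through the other.
    grp["other name"][()] = 7
    value_via_first = int(grp["name"][()])

    # Removing one link leaves the object reachable through the remaining one.
    del grp["name"]
    value_after_delete = int(grp["other name"][()])
```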

Soft links

Also like a UNIX filesystem, HDF5 groups can contain “soft” or symbolic links, which contain a text path instead of a pointer to the object itself. You can easily create these in h5py by using h5py.SoftLink:

>>> myfile = h5py.File('foo.hdf5','w')
>>> group = myfile.create_group("somegroup")
>>> myfile["alias"] = h5py.SoftLink('/somegroup')

If the target is removed, they will “dangle”:

>>> del myfile['somegroup']
>>> print(myfile['alias'])
KeyError: 'Component not found (Symbol table: Object not found)'
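
Because a soft link stores only a path, it can be inspected without being followed, and it resolves again once an object reappears at that path. The sketch below illustrates this using `Group.get(..., getlink=True)`; the file name is hypothetical and the in-memory `core` driver is an assumption.

```python
import h5py

with h5py.File("sketch_softlink.hdf5", "w", driver="core", backing_store=False) as myfile:
    myfile.create_group("somegroup")
    myfile["alias"] = h5py.SoftLink("/somegroup")

    # Following the link yields the target object.
    resolves = myfile["alias"] == myfile["somegroup"]

    # getlink=True returns the link itself; .path is the stored text path.
    link = myfile.get("alias", getlink=True)
    link_kind = type(link).__name__
    link_path = link.path

    # Removing the target leaves the link dangling.
    del myfile["somegroup"]
    try:
        myfile["alias"]
        dangling = False
    except KeyError:
        dangling = True

    # A new object at the same path makes the link usable again.
    myfile.create_group("somegroup")
    resolves_again = myfile["alias"] == myfile["somegroup"]
```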

External links

New in HDF5 1.8, external links are “soft links plus”, which allow you to specify the name of the file as well as the path to the desired object. You can refer to objects in any file you wish. The syntax is similar to that for soft links:

>>> myfile = h5py.File('foo.hdf5','w')
>>> myfile['ext link'] = h5py.ExternalLink("otherfile.hdf5", "/path/to/resource")

When the link is accessed, the file “otherfile.hdf5” is opened, and the object at “/path/to/resource” is returned.

Since the object retrieved is in a different file, its “.file” and “.parent” properties will refer to objects in that file, not the file in which the link resides.
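
A rough end-to-end sketch of this behaviour follows. It assumes the target file is created and closed before the link is followed, uses a temporary directory so that no files are left behind, and stores a hypothetical value of 42 at the target path.

```python
import os
import tempfile

import h5py

with tempfile.TemporaryDirectory() as tmpdir:
    other_path = os.path.join(tmpdir, "otherfile.hdf5")
    main_path = os.path.join(tmpdir, "foo.hdf5")

    # Create the target first and close it so the link can open it later.
    with h5py.File(other_path, "w") as other:
        other.create_group("/path/to")
        other["/path/to/resource"] = 42

    with h5py.File(main_path, "w") as myfile:
        myfile["ext link"] = h5py.ExternalLink(other_path, "/path/to/resource")

        # Accessing the link opens otherfile.hdf5 and returns the object there.
        resource = myfile["ext link"]
        value = int(resource[()])

        # .file refers to the file that holds the object, not the linking file.
        owning_file = os.path.basename(resource.file.filename)
        linking_file = os.path.basename(myfile.filename)

        # Drop the reference so the external file can be released.
        del resource
```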
