Uproot 3 → Uproot 4 migration guide

Jim Pivarski edited this page Jan 11, 2023 · 2 revisions

Collecting thoughts on a "Uproot 3 → Uproot 4" cheat sheet/migration guide

Everyone is welcome to contribute!

(I'll write that documentation, but anything you want to make sure gets into such a page, dump it here!)

Length of TTree/TBranchElements

In uproot3, the length of a branch (f["/a/path"].__len__(), or len(f["/a/path"])) was the length of the underlying data, i.e. the number of entries. In uproot4, the length is the number of direct sub-items (e.g. subbranches), similar to a Python dict, where len() returns the number of keys. To get the number of entries, use the .num_entries attribute.

See a more detailed explanation in https://github.com/scikit-hep/uproot4/issues/191#issuecomment-726889311

from skhep_testdata import data_path

import uproot as uproot3
import uproot4

branch = "E/Evt"

f3 = uproot3.open(data_path("uproot-issue431b.root"))
f3[branch]
# <TBranchElement b'Evt' at 0x00010f15f5e0>
len(f3[branch])
# 10
f3[branch]["id"].array()
# array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10], dtype=int32)


f4 = uproot4.open(data_path("uproot-issue431b.root"))
f4[branch]
# <TBranchElement 'Evt' (22 subbranches) at 0x00010f37a2b0>

len(f4[branch])
# 22

len(f4[branch].keys())  # note: keys() also lists nested subbranches, so they are counted too
# 118

f4[branch].keys()
# ['AAObject', 'AAObject/TObject', 'AAObject/TObject/fUniqueID', ...,
#  'mc_trks/mc_trks.hit_ids', 'mc_trks/mc_trks.error_matrix', ...]

len([k for k in f4[branch].keys() if "/" not in k])  # only top-level subbranches
# 22

f4[branch].num_entries
# 10
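
The dict analogy can be made concrete with a small toy class. This is only a sketch of the mapping semantics described above: ToyBranch and its contents are hypothetical stand-ins, not uproot classes, and the values are made up.

```python
from collections.abc import Mapping


class ToyBranch(Mapping):
    """Hypothetical stand-in for a branch with subbranches (not an uproot class)."""

    def __init__(self, subbranches, num_entries):
        self._subbranches = dict(subbranches)
        self.num_entries = num_entries  # number of entries in the data

    def __getitem__(self, name):
        return self._subbranches[name]

    def __iter__(self):
        return iter(self._subbranches)

    def __len__(self):
        # Mapping semantics: count the direct children, not the entries.
        return len(self._subbranches)


# Two subbranches, each holding three entries (made-up values).
evt = ToyBranch({"id": [1, 2, 3], "det_id": [7, 7, 7]}, num_entries=3)

print(len(evt))         # 2, the number of subbranches (uproot4-style)
print(evt.num_entries)  # 3, the number of entries (what len() meant in uproot3)
```

Code that relied on len() to count entries in uproot3 should therefore switch to .num_entries when moving to uproot4.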

Caching (TODO)

TODO: The caching has also changed in uproot4.

Custom interpretations

Whenever uproot's automatic deserialisation fails for whatever reason, a custom interpretation comes to the rescue. The concept remains the same, but a few details have changed. Here is an example of how a custom jagged array interpretation was used in uproot3:

uproot3

import uproot  # version 3
from skhep_testdata import data_path

f = uproot.open(data_path("uproot-issue124.root"))

tree = f["KM3NET_EVENT"]

snapshot_hits = tree["snapshotHits"].array(
    uproot.asjagged(
        uproot.astable(
            uproot.asdtype(
                [
                    ("dom_id", ">i4"),
                    ("channel_id", "u1"),
                    ("time", "<u4"),
                    ("tot", "u1"),
                ]
            )
        ),
        skipbytes=10,
    )
)

This will return a JaggedArray:

>>> snapshot_hits
<JaggedArray [[<Row 0> <Row 1> <Row 2> ... <Row 50> <Row 51> <Row 52>]  ...  [<Row 849> <Row 850> <Row 851> ... <Row 887> <Row 888> <Row 889>] [<Row 890> <Row 891> <Row 892> ... <Row 920> <Row 921> <Row 922>]] at 0x7f9b8e6c89d0>

>>> snapshot_hits.dom_id
<JaggedArray [[808432835 808432835 808432835 ... 809526097 809526097 809526097]   ...  [808432835 808488997 808488997 ... 809526097 809526097 809544061] [808432835 808432835 808432835 ... 809526097 809526097 809544061]] at 0x7f9bc99b05e0>

uproot4

import uproot4 as uproot  # version 4
from skhep_testdata import data_path

f = uproot.open(data_path("uproot-issue124.root"))

tree = f["KM3NET_EVENT"]

snapshot_hits = tree["snapshotHits"].array(
    # uproot4 was imported under the alias "uproot" above
    uproot.interpretation.jagged.AsJagged(
        uproot.interpretation.numerical.AsDtype(
            [
                ("dom_id", ">i4"),
                ("channel_id", "u1"),
                ("time", "<u4"),
                ("tot", "u1"),
            ]
        ),
        header_bytes=10,
    )
)

This will return an awkward1.Array:

>>> snapshot_hits
<Array [[{dom_id: 808432835, ... tot: 30}]] type='23 * var * {"dom_id": int32, "...'>
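
The list of (name, format) pairs passed to the dtype interpretation uses NumPy dtype strings and describes a fixed-size record per hit: ">i4" is a big-endian 4-byte signed integer, "u1" an unsigned byte, and "<u4" a little-endian 4-byte unsigned integer, for 10 bytes in total. The following standard-library sketch decodes one such record by hand to show what that layout means. It is an illustration only, not how uproot reads the data: decode_hit is a hypothetical helper, and the channel_id and time values are made up.

```python
import struct

# Field sizes from the dtype list: dom_id (4) + channel_id (1) + time (4) + tot (1)
RECORD_SIZE = 4 + 1 + 4 + 1  # 10 bytes per hit


def decode_hit(raw):
    """Decode one 10-byte hit record (hypothetical helper, not part of uproot)."""
    if len(raw) != RECORD_SIZE:
        raise ValueError(f"expected {RECORD_SIZE} bytes, got {len(raw)}")
    # ">iB": big-endian int32 followed by an unsigned byte (dom_id, channel_id)
    dom_id, channel_id = struct.unpack_from(">iB", raw, 0)
    # "<IB": little-endian uint32 followed by an unsigned byte (time, tot)
    time, tot = struct.unpack_from("<IB", raw, 5)
    return {"dom_id": dom_id, "channel_id": channel_id, "time": time, "tot": tot}


# Build a sample record with made-up channel_id and time values.
raw = struct.pack(">iB", 808432835, 3) + struct.pack("<IB", 1000, 30)
hit = decode_hit(raw)
print(hit)
```

Note the mixed byte order: dom_id is big-endian while time is little-endian, which is why each field carries its own byte-order prefix in the dtype list.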