Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: update to latest awkward==2.5.0 #407

Merged
merged 15 commits into from
Nov 16, 2023
Merged

Conversation

agoose77
Copy link
Collaborator

@agoose77 agoose77 commented Nov 8, 2023

This PR adds support for awkward==2.5.0's latest features, including attrs and a public typetracer API.

lists,
meta=typetracer_array(ak.Array(lists[0])),
meta=typetracer_array(ak.Array(lists[0], attrs=attrs, behavior=behavior)),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@douglasdavis hmm we might need to add a mechanism for passing a type object and/or enforcing this meta - lists could have any type in lists[1], etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I see what you're getting at here. from_lists is meant to be a pretty simple entry point to instantiating a dask-awkward array from non-disk data. I originally wrote it as a simple testing ground when adding the from_map interface and I felt like it wouldn't be a bad addition to the IO API. My thoughts here are to keep it simple and perhaps add a note to the docstring describing the function as expecting the same type for each list.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can make it a runtime error if the subsequent partitions don't have the anticipated meta?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems reasonable to me

@agoose77
Copy link
Collaborator Author

agoose77 commented Nov 8, 2023

There's a change in Awkward 2.4.9 that strictly requires np.dtype instances in TypeTracerArray._new. I suspect that will break dask-awkward, and I've included a fix here. TODO: fix main if we decide to make a 2.4.9-supporting fix.

@agoose77 agoose77 changed the title feat: support new attrs feat: update to latest awkward==2.5.0 Nov 13, 2023
@agoose77 agoose77 closed this Nov 15, 2023
@agoose77 agoose77 reopened this Nov 15, 2023
@douglasdavis
Copy link
Collaborator

The conda test will fail until conda-forge's bot grabs the latest awkward so we can ignore that workflow

@@ -47,6 +47,7 @@ def test_list_with_ints_raise(daa: dak.Array) -> None:


def test_single_int(daa: dak.Array, caa: ak.Array) -> None:
daa.eager_compute_divisions()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@douglasdavis can you confirm this is a reasonable fix for the changes to implicit len computation?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to avoid eager_compute_divisions on the fixture since it's long-lived, perhaps add a

daa = copy.copy(daa)

above the divisions compute

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, lovely idea.

@codecov-commenter
Copy link

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (5bd87d0) 93.85% compared to head (e96564d) 93.91%.

Files Patch % Lines
src/dask_awkward/lib/structure.py 92.59% 2 Missing ⚠️
src/dask_awkward/lib/core.py 90.00% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #407      +/-   ##
==========================================
+ Coverage   93.85%   93.91%   +0.05%     
==========================================
  Files          23       23              
  Lines        3125     3153      +28     
==========================================
+ Hits         2933     2961      +28     
  Misses        192      192              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@agoose77
Copy link
Collaborator Author

@douglasdavis I think this is ready to go. I haven't done a pass of adding new _OPFn classes, but that's a behind-the-scenes change.

@agoose77
Copy link
Collaborator Author

We need to add tests, but that's a bigger task. I am thinking towards us merging this to fix things, and do a patch release if we spot anything in our later tests.

@agoose77
Copy link
Collaborator Author

Just doing some last checks actually.

@agoose77
Copy link
Collaborator Author

OK, let's hold this, found some strange behavior.

@douglasdavis douglasdavis self-requested a review November 16, 2023 21:36
Copy link
Collaborator

@douglasdavis douglasdavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this another scan- tests are passing and everything looks good to me. When you think it's ready to merge I say go for it.

@agoose77
Copy link
Collaborator Author

Thanks @douglasdavis. I found a bug in Awkward, so we can't fix that here. Yet, it's not a show-stopper. I've bumped our minimum version of awkward to 2.5.0.

@agoose77 agoose77 merged commit 5b2580b into main Nov 16, 2023
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants