Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add new dask-behavior protocol #409

Merged
merged 10 commits into from
Nov 30, 2023

Conversation

agoose77
Copy link
Collaborator

This PR is a suggestion for replacing _dask_array_ with an extension to the descriptor protocol, namely _dask_get, _dask_set, etc.

The motivations for this are to avoid reserving a name within the function signatures, keep property descriptors looking like property descriptors, and to avoid the need to compute the function signature.

Unfortunately, if the fallback case of using map_partitions were implemented naively, we'd need to support a collection type that represents an Any type. As this isn't feasible, this PR uses a heuristic (as we already do) to determine whether the descriptor represents a method (in which case, delaying the call to map_partitions). The heuristic that I chose is to test whether the resolved attribute is callable.

If we are interested in this PR, then I'd probably want to add a fallback for coffea, unless they want to sync up again with coffea's latest release.

@lgray @douglasdavis

Comment on lines 96 to 97
_dask_set: Callable | None = None
_dask_del: Callable | None = None
Copy link
Collaborator Author

@agoose77 agoose77 Nov 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defined __selattr__ and __delattr__ although these wouldn't make sense in a functional API. We could come up with a dak.setattr and dak.delattr function that always returns the input array, i.e. _dask_set actually returns self, but I doubt we actually need __delattr__ support.

As such, we probably want to remove these other descriptor methods.

return "this is a non-dask property"

def non_dask_method(self, _dask_array_=None):
return _dask_array_
@some_property.dask_getter
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we drop __setattr__ support, then this can become @some_property.dask

@lgray
Copy link
Collaborator

lgray commented Nov 11, 2023

I'll need to poke at this in coffea to make sure it covers what's needed.

@douglasdavis
Copy link
Collaborator

douglasdavis commented Nov 13, 2023

I'm fine with making a change like this (and I'll add that beyond behaviors "just working", I'd consider @lgray's opinion a more important one than mine w.r.t. the ergonomics, since he originally wrote this as a coffea need)

@lgray
Copy link
Collaborator

lgray commented Nov 13, 2023

should I wait until all this is green to give it a check?

@agoose77
Copy link
Collaborator Author

@lgray should be ready to go now!

@lgray
Copy link
Collaborator

lgray commented Nov 13, 2023

OK - there's something I need to make work first (the multi-return stuff that @iasonkrom found) and then I'll give this a proper looking over.

@lgray
Copy link
Collaborator

lgray commented Nov 13, 2023

@agoose77
Copy link
Collaborator Author

agoose77 commented Nov 13, 2023

https://github.com/dask-contrib/dask-awkward/actions/runs/6853762042/job/18635289824?pr=409#step:5:692 a problem?

I also noticed this! It's unrelated (main) c.f. #413

@agoose77 agoose77 closed this Nov 16, 2023
@agoose77 agoose77 reopened this Nov 16, 2023
@codecov-commenter
Copy link

codecov-commenter commented Nov 16, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (e094ac1) 93.81% compared to head (bb463d8) 93.82%.
Report is 3 commits behind head on main.

Files Patch % Lines
src/dask_awkward/lib/core.py 98.36% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #409   +/-   ##
=======================================
  Coverage   93.81%   93.82%           
=======================================
  Files          23       23           
  Lines        3153     3156    +3     
=======================================
+ Hits         2958     2961    +3     
  Misses        195      195           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@agoose77
Copy link
Collaborator Author

@lgray any thoughts on this? :)

… of github.com:dask-contrib/dask-awkward into agoose77/refactor-dask-behavior-interface
@agoose77
Copy link
Collaborator Author

@lgray I fixed a bug that caused your regression. Would you be happy to test this PR again?

@agoose77
Copy link
Collaborator Author

This passes in coffea, so we can look to merging. @lgray would you be happy to pin >= to the latest dask-awkward?

@lgray
Copy link
Collaborator

lgray commented Nov 29, 2023

Yes - I've already pinned that PR to the next release of dask-awkward. Please go ahead and merge / release!

@agoose77
Copy link
Collaborator Author

@douglasdavis I already pinged you for review but realised I didn't add context: this is good to go, are you happy to sign off on it?

Copy link
Collaborator

@douglasdavis douglasdavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a nice API improvement to me!

@agoose77 agoose77 merged commit e78d420 into main Nov 30, 2023
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants