Map function to slices with index labels #158

Closed
Lincoln-Hannah opened this issue May 16, 2024 · 8 comments

@Lincoln-Hannah

Lincoln-Hannah commented May 16, 2024

Feature request: the ability to map a function over slices of a KeyedArray while also passing it the axis keys (index labels) that identify each slice.
The proposed syntax below passes the slice and its keys to the function as a named tuple.

function my_func(  (; slice, I, X )  )

    "$I  $X    $(slice(:a))"

end

K  =   KeyedArray(   [1;2;;3;4;;;5;6;;7;8],   A=[:a,:b],   I=[:i,:j],   X=[:x,:y]   )



# proposed syntax
mapslices( my_func,  K;   dims=(:A) )

#would give the same result as 
@tullio Z[i,x] := my_func( (; slice=K[:,i,x], I=K.I[i], X=K.X[x])  )
KeyedArray( Z, I=K.I, X=K.X)
@aplavin
Collaborator

aplavin commented May 16, 2024

That's a useful piece of functionality indeed, aside from the fact that mapslices is already defined in Julia and does a slightly different thing.
A proper general interface for such a function should be carefully thought out, ideally one that works for both regular arrays (passing indices) and keyed arrays (passing axiskeys values). Are you interested in trying this out in a separate package? This would be a new function anyway, so there are no piracy concerns.
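
A minimal sketch of what such an interface might look like for regular arrays, assuming Julia 1.9 or later (where `eachslice` accepts a tuple of dimensions). `map_indexed_slices` is a hypothetical name, not an existing function in Base or AxisKeys; it passes each slice's index along the kept dimensions together with the slice itself:

```julia
# Hypothetical helper (not part of Base or AxisKeys): call `f(index, slice)` for
# every slice of a plain array, where `index` is the slice's position along `dims`.
function map_indexed_slices(f, A::AbstractArray; dims)
    slices = eachslice(A; dims)                  # array of views, one per slice
    map(CartesianIndices(slices), slices) do idx, slice
        f(Tuple(idx), slice)                     # pass the index as a plain tuple
    end
end

A = reshape(1:8, 2, 2, 2)                        # same values as K above, without keys

totals = map_indexed_slices(A; dims=(2, 3)) do (i, x), slice
    (; i, x, total = sum(slice))
end
```

A keyed version would pass the axis keys of each slice in place of the integer indices.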

@Lincoln-Hannah
Author

I don't think I'm smart enough to build it myself. Hence the request.

I previously requested similar functionality, though based on the DataFramesMeta @rtransform macro. See my last post in #154
(reading it now, it's not very clear, sorry)

In both requests the idea is to select a dimension or dimensions to slice across, then have access to both the underlying data and index labels both within the slice and orthogonal to the slice.

@aplavin
Collaborator

aplavin commented May 16, 2024

the idea is to select a dimension or dimensions to slice across ...

That's eachslice(K, dims=(:I, :X)).

... have access to both the underlying data and index labels both within the slice ...

Each of eachslice() elements is a KeyedArray, so they provide access to axiskeys within the slice.

... and orthogonal to the slice.

This part requires an extra step, but is still reasonably possible:

julia> using AxisKeysExtra, RectiGrids

julia> map(with_axiskeys(eachslice(K, dims=(:I,:X)))) do (keys, slice)
       (;total=sum(slice), keys)
       end
2×2 StructArray(::Matrix{Int64}, ::Matrix{@NamedTuple{I::Symbol, X::Symbol}}) with eltype @NamedTuple{total::Int64, keys::@NamedTuple{I::Symbol, X::Symbol}}:
 (total = 3, keys = (I = :i, X = :x))  (total = 11, keys = (I = :i, X = :y))
 (total = 7, keys = (I = :j, X = :x))  (total = 15, keys = (I = :j, X = :y))

@Lincoln-Hannah
Author

Genius thank you :)
I'll probably lay it out like this.

eachSliceKeys = with_axiskeys ∘ eachslice 

function my_func( ( (;I,X),  slice)::Pair{<:NamedTuple,<:KeyedArray} )  

    "$I  $X    $(slice(:a))"

end


K  =   KeyedArray(   [1;2;;3;4;;;5;6;;7;8],   A=[:a,:b],   I=[:i,:j],   X=[:x,:y]   )


@chain begin 
    K 
    eachSliceKeys(dims=(:I,:X))
    my_func.()
end

@aplavin
Collaborator

aplavin commented May 17, 2024

I received another question of yours from this thread as an email from GitHub. The message is no longer here, but see the solution below:

If the slice function returns a KeyedArray, so the result is a nested KeyedArray, is there an easy way to flatten it ?
i.e. convert it to a single level 3D KeyedArray.

Julia has stack() for exactly this purpose, also authored by @mcabbott btw :)
"Just works" with KeyedArrays-of-KeyedArrays:

julia> using DataPipes

# ... all definitions from above...

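julia> function my_func2( ( (;I,X),  slice)::Pair{<:NamedTuple,<:KeyedArray} )  
    # I and X are Symbols, so they cannot be multiplied into the data.
    # The third entry (2 * slice(:b)) is an assumption chosen to match the :n3 row of the output.
    KeyedArray(  [slice(:a), slice(:b), 2 * slice(:b)] , New=[:n1,:n2,:n3]  )
end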

julia> @p K |> eachSliceKeys(dims=(:I,:X)) |> my_func2.() |> stack
3-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   New ∈ 3-element Vector{Symbol}
→   I ∈ 2-element Vector{Symbol}
◪   X ∈ 2-element Vector{Symbol}
And data, 3×2×2 Array{Int64, 3}:
[:, :, 1] ~ (:, :, :x):
         (:i)  (:j)
  (:n1)   1     3
  (:n2)   2     4
  (:n3)   4     8

[:, :, 2] ~ (:, :, :y):
         (:i)  (:j)
  (:n1)   5     7
  (:n2)   6     8
  (:n3)  12    16
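
For comparison, the same flattening works on plain nested arrays. A small Base-only sketch, assuming Julia 1.9 or later; the array `A` and the per-slice function are made up for illustration and are not the `my_func2` from this thread:

```julia
A = reshape(1:8, 2, 2, 2)

# Map every slice along dims (2, 3) to a 3-element vector.
# The result is a 2×2 matrix whose elements are vectors.
nested = map(eachslice(A; dims=(2, 3))) do slice
    [sum(slice), maximum(slice), minimum(slice)]
end

# `stack` puts the inner dimension first, followed by the outer dimensions.
flat = stack(nested)
size(flat)    # (3, 2, 2)
```

With keyed arrays the axis keys are carried along as well, as in the output above.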

@lincolnhannah

Thanks:)
I saw stack after I'd posted the question.
Amazing. One word solution!

@Lincoln-Hannah
Author

Is there any shorter way to do this ?

allMissing = isempty ∘ skipmissing 

X = KeyedArray( [1 2 3;1 missing 3; missing missing missing], a=[:a,:b,:c], x=[:x,:y,:z])


@chain begin

     X

    eachslice( dims = :a )

    filter( !allMissing, _ )

    AxisKeys.stack

end

Something like filterslices( X, !allMissing, dims = :a )

@aplavin
Collaborator

aplavin commented May 20, 2024

Well, IMO it's not that long, and is easily built out of composable blocks. Note that you don't even need to define the allMissing function separately:

@p let
       X
       eachslice(dims=:a)
       filter(!all(ismissing, _))
       stack
end
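
The same eachslice / filter / stack steps could be wrapped into the requested helper. A rough sketch for plain arrays, assuming Julia 1.9 or later and an integer `dims`; `filterslices` is a hypothetical function that does not exist in Base or AxisKeys, and the matrix `M` only mirrors the data of the example above without its keys:

```julia
# Hypothetical helper: keep only the slices along `dims` for which `pred` is true,
# then reassemble the kept slices along that same dimension.
# Assumes at least one slice is kept, since `stack` rejects an empty collection.
function filterslices(A::AbstractArray, pred; dims::Integer)
    kept = filter(pred, collect(eachslice(A; dims)))
    return stack(kept; dims)
end

M = [1 2 3; 1 missing 3; missing missing missing]

allmissing(v) = all(ismissing, v)

# Drops the third row, which contains only missing values.
result = filterslices(M, !allmissing; dims=1)
```

Whether a dedicated function is worth it over the three-step pipeline is a design choice; the pipeline stays more flexible when the steps need to change.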

A potential alternative I've been thinking about is

using Accessors

@modify(X |> eachslice(_, dims=:a)) do slices
    filter(!allMissing, slices)
end

This isn't implemented for now, but is possible. The idea of Accessors is that you can "modify" any part of the object and conveniently reassemble back – here, we are modifying eachslice(_, dims=:a) of X.
Not sure how useful it is in this specific case though...
