Replies: 3 comments 3 replies
-
Agree that example 1 is the easiest for me to follow. I like keeping 1, with the option to do 3,4,5 if that is better for your specific workflow. Is there a performance downside keeping 1? |
Beta Was this translation helpful? Give feedback.
-
I agree that most of our issues are rooted in our intention to force the compatibility of equivalent sources with Verde workflows. By nature they are different and this is biting us. Therefore I think we should think about the Verde and the Harmonica decisions as separate, and also make Harmonica's equivalent sources not based in Verde's gridders. I think we have a lot of examples where trying to force this class inheritance was causing issues (inherited methods that didn't work, rewrite of docstrings because of the For Verde I agree with @ThomasMGeo and like having Example 1 as an option. I also like having the possibility to pass the grid coordinates into the method, so we could keep them. For Harmonica I wouldn't like to remove the current way of passing grid coordinates (as 1d linspace arrays) and avoiding the need of the meshgrid. I can see myself overexplaining why we need to do the meshgrid when giving tutorials. So if isn't a good reason to ditch the |
Beta Was this translation helpful? Give feedback.
-
For reference, we discussed this in this week's call: https://github.com/fatiando/community/blob/main/development-calls/2022.md#2022-10-26 The decision was to keep |
Beta Was this translation helpful? Give feedback.
-
The issue
It's been almost a year since fatiando/verde#326 introduced the deprecation that Verde gridders wouldn't generate the grid coordinates inside of the
grid
method. Instead, these should be generated by the user and then passed to the method as inputs.I've now used the new way of doing things quite a bit through the equivalent-sources in Harmonica. To be honest, I don't like having to generate the grid coordinates outside of the method. This code ends up a being a worse experience most of the time in Verde:
versus the new way of doing it:
This is such a common use case for Verde and the first code is so much nicer. The developer in me likes the idea of dependency injection to keep things separate. But when I just want to make a grid, that is getting in the way a bit.
Even then, we're not going all the way since
Spline.grid
still callsverde.make_xarray_grid
. Keeping with that mentality would result in:Even with the projection part, it's not that much harder than the above:
In fact, the above is probably easier to understand than the description of the
projection
argument inverde.Spline.grid
.There is also the case of not generating coordinates but taking them from another grid. The
grid
method takes care of callingnumpy.meshgrid
before callingpredict
:This is the only use case where having the
grid
method in Harmonica makes a bit more sense because we probably don't want tomeshgrid
the upward coordinate as well (to make a volume instead of grid). And that requires a bit more code.Possible ways around this
I get the necessity of passing in coordinates explicitly for the equivalent sources in Harmonica since they don't fit the model Verde was made for because of the extra coordinate. I think the real mistake we made was trying to make the equivalent sources fit the Verde model and have it be usable with something like
verde.Chain
, which is a bit pointless since that's not that common a use case for equivalent sources anyway (notice that it's not used in Harmonica).Here are some ideas of how to move forward with this.
Leave
verde.Spline.grid
as a convenience methodIt's there for convenience to avoid having to generate coordinates, predict, and figure out how to make an
xarray.Dataset
. So let's make it as convenient as possible. Having thecoordinates
argument is still useful since it allows predicting onto an existing grid. This means leaving the optionalcoordinates
argument but undoing the deprecation and keepingExample 1
above a valid way to use Verde.It would be helpful to have an example of doing
Example 3
in the Verde docs to illustrate what thegrid
method does under the hood and in case people prefer that method or want to modify it.In Harmonica, this behaviour shouldn't and currently already isn't allowed. That's fine. They are different packages doing different but related things.
Ditch the
grid
method altogether in HarmonicaI don't feel like it's worth writing code like
Example 2
whenExample 3
already works and is just as verbose. Writinggrid(grid_coordinates, ...)
just feels wrong. For equivalent sources, more thought needs to be put into the coordinates so this makes more sense. The projection case is really not more complicated and could be solved with documentation instead of code.The meshgrid use case is a bit trickier. But maybe we also leave it for an example instead of trying to make it work seamlessly?
This is not the most common use case so we don't need to make it a single line? Having people call
meshgrid
is not ideal but our users are smart and we can provide the example in the docs of how to do it right.Stop inheriting from Verde base classes in Harmonica
For Harmonica, the equivalent sources shouldn't inherit from the Verde base class and don't need to implement the
filter
method. They currently define their own base class based on the Verde one but I'd argue that even this should stop and just inherit directly from scikit-learn. The only thing that would really need to be reimplemented is thescore
method, which is simple.This should still make equivalent sources compatible with
verde.cross_val_score
but it won't be in the future when we have equivalent sources for specific potential fields instead of the generic one we have. It won't be able to handle passing in magnetic field directions, multiple components at different locations, etc. So maybe we just need to make sure the cross validation is more generally usable outside of Verde.Ditch the
grid
method entirely in VerdeThis is the radical approach. Basically, make Examples 3, 4, and 5 the only way to do it.
If it can be done with 2 function calls, then maybe we just let users write the code and provide examples instead of messing with this complication. The methods of gridders would be greatly simplified. They would basically just have
fit
,predict
, andfilter
. Evenscore
is just a wrapper aroundpredict
and a scikit-learn function.This does make things easier to explain at the sacrifice of a bit more code written in the simplest use cases. The general approach to our interpolations is "fit a model and then predict elsewhere", which is then used that way (with the word "predict" instead of "grid") in the code. I kind of like the consistency of this approach. Plus, deleting code is always a good feeling.
With Verde 2.0 in the works, this is the perfect time for a radical departure. Having users write code is not necessarily a bad thing (as long as we provide examples of what they should be doing).
Final thoughts
I'm really not sure where to go and wrote this down mostly to work out the thoughts in my own head. The more I wrote, the more I felt inclined towards the radical options of getting rid of
grid
in both Verde and Harmonica. This is a trade-off between a bit less convenience for a more straight forward code base and documentation.It would be great to have some other views points on this, particularly if it involves workflows we overlooked.
Beta Was this translation helpful? Give feedback.
All reactions