Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAS-2232 - Functionality added to support SMAP L3 products #15
DAS-2232 - Functionality added to support SMAP L3 products #15
Changes from 6 commits
79b4c3f
789cc6b
c4abf26
be280f0
2a48f93
2c93a4b
dfb1e15
e2ff61f
756f7c0
53f1660
f9f5e8b
f836ee6
c35c8ee
ffe035a
9c9f190
7eda980
f07b544
3b453e5
681b20d
5d609c9
91c51c0
2296a35
2efc4c7
f628166
ebac2a0
3b6d605
802fe0e
60fb22a
631dc24
1e7bc35
36e15c7
5c5eb85
80c2fb2
30eccd0
16872b7
822758f
7883465
dd98e81
de350b6
0666248
e07099a
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You do not need this function. The equivalent check is just
if len(variable.shape) == 1
(wherevariable
is anetCDF4.Variable
instance).Generally, there are a bunch of nice
numpy
array things that are syntactic sugar you can use for array shape, size, etc. ThenetCDF4
library does a pretty good job of exposing a lot of them onnetCDF4.Variable
objects.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah right I did not need that. I could just get the ndim in the calling method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
756f7c0
removed the method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you want to iterate through all
required_dimensions
, just the ones that are identified as overriding things from thecoordinates
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only the override dimensions are passed to the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the names to make it clear
c35c8ee
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the variable is 1-D then you are going to end up with an error because
row_size
,col_size
andcrs
are undefined, and the code from L167 onwards will still be executed. Unit tests would likely have caught this.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated -(7eda980)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this methodology is pretty complicated, and doing some maths it doesn't need to. It might be simpler to:
3a) If the coordinate is in the same projection as the grid, grab a row slice and a column slice.
3b) If the coordinate is a different projection to the grid (e.g., one is geographic and the other isn't), then project the coordinate variables to grid variables. Then grab a column and a row slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. will change it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code block above is likely to cause errors. There is no check for if you manage to find a latitude variable but not a longitude variable (or vice versa). It cannot be assumed that they both will be present. My gut instinct is that this function isn't needed, but if it is, it would be better to catch a missing lat/lon variable here and then raise a human readable error, rather than letting Harmony spit out a complicated and unintuitive stack trace to the end user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes are in commit 756f7c0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to check for where the array values are not the fill value for the variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had a ticket to address all the fill value cases. https://bugs.earthdata.nasa.gov/browse/DAS-2236
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am mainly suggesting that your
np.where
condition would be better to look fornp.where(lon_row != fill_value)[0]
, wherefill_value
is derived from the metadata of the variable (via varinfo). As it is now, this check would indicate good values for any element that is at least -180.0. If a fill value is a large positive number, then that's a problem.Generally, I think this indicates the work has been split a bit of a weird way. Instead of implementing a first pass of everything, with know issues in several places, it might be better to implement one piece at a time making those pieces correct in one go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the coordinate variables do not have fill value attributes (at least in the test case I am using - SPL3SMP). It may need to be added to the json file.. will check into it. I can add a check for -180 to +180 as w a valid range maybe till we resolve DAS-2236
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the better alternative is to see if there is a fill value, if so use a check based around that, otherwise fallback on this alternative check.
I still generally think it's not the best approach to write a function with known issues, and then come back to it later. The more I think about it, the more attractive a feature branch is sounding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this function: READ THIS COMMENT FIRST:
I think this is really overcomplicated. Fundamentally, all this function is doing is getting a list of dimensions for a variable. Then it is looking in a cache to see if there is an index range for any of the dimension names, before using that cache value to tack something on the end of a string.
Maybe I'm being naive, but it feels like really all you need is something to determine the correct list of variable dimensions, then all the rest of the logic (looking in the cache and appending strings to the end of the variable name) is the same. That latter stuff 100% does not need to be in the duplicated in multiple condition branches. It's making this function unnecessarily hard to read.
The other, really important comment: I am super suspicious of the bit where you are needing to
reverse
the order of the dimensions list. However that is derived, it should be reliably in the ordering of the variable axes. I'm wary that what this means is that you have found that for your particular use-case the "random" ordering in a set is predictable for the pseudo-dimensions you have for SMAP L3, and you can coincidentally impose the order you need coincidentally by reversing the set. I really think dimensions should not be passed around in sets, because you need that ordering. I strongly suspect this is the root cause of your issues with 3-D variables.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ordering and the shape are things I need to get from varinfo. I dont have that information. Updated DAS-2241 for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rephrasing a different comment: I'm very uncomfortable with the thought of merging code with known problems into
main
. Going back to one of the things mentioned in the TRT breakout last week:Instead of making a massive PR that does 80% of a lot of stuff, this probably should be submitted piecemeal, with each piece being made 100% solid in a series of smaller PRs.
If you and David feel we're at a point that we can't break the changes down in that way, then a compromise might be to update this PR to merge into a feature branch. Then once all of the multiple PRs are merged into the feature branch, the feature branch can be merged into
main
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think terminology has confused some of this discussion. The question of reversing is not one of reversing dimensions, but the lat/lon variables coming from the Coordinates attribute. The recommendation here is that the get_coordinates method itself should return in a standard order (reversed in case), based upon the variable name - and not using reverse as shown here.
I don't think this is a case of "Known Issue".
We are planning to release a version for NSIDC to start testing with, which may not handle the 3D cases or other issues, but this release should not break any existing use cases, and should not truly break, but simply not handle as desired. Incremental feature release is a tenet of agile development.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes are in commit 756f7c0
I have simplified the method. The order for SMAP is 'projected_y', 'projected_x'. The override section of the code is only used by SMAP at the moment. It can be generalized if we can get that order of dimensions from varinfo. I am not sure if the order of dimensions is used for other products currently handled by hoss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of sets for required_dimensions is based on how it is returned by varinfo and how it was originally in hoss before my changes. the bounds update requires the dimensions to be a set , It fails for a list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a difference between a set of all dimensions for all required variables (as used in the prefetch function), which aggregates all
Variable.dimensions
attributes (which individually are lists), and the dimensions on an individual variable.Variable.dimensions
is not a set for ordering purposes, it is a list.With regards to
bounds
- we know that the pseudo-dimensions won't have abounds
attribute, so you might be better to not try to find bounds variables for them. Then you'll avoid any compatibility issues there.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The add_index_range is a lot simpler now