Add updated SMAP python notebooks #57

Closed

Conversation

jroebuck932
Collaborator

I have updated the SMAP python notebooks as follows:

  • Switched to earthaccess for authentication and for searching for and downloading data
  • Moved the three notebooks into the tutorials template
  • Updated to use the latest version of SPL3SMP (version 8)
  • Replaced basemap with cartopy so we can use the nsidc-tutorials environment

Some other minor things still need adding; see the Jira ticket for details. Also, I created this branch before pulling the latest version of main, so that may come up when trying to merge.


github-actions bot commented Aug 15, 2023

Binder 👈 Launch a binder notebook on this branch for commit 6c65c6c

I will automatically update this comment whenever this PR is modified

Binder 👈 Launch a binder notebook on this branch for commit e581a85


@@ -0,0 +1,244 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

This could also be achieved using !mkdir <path-to-data>. For an interactive notebook, are three calls to os overkill?

We should also show how to change this to some other path.
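
A minimal sketch of one way to do this from Python with a single call, while making the path easy to change. The default directory and the function name `make_data_dir` are illustrative assumptions, not names taken from the notebooks:

```python
from pathlib import Path

# Hypothetical default location; change this to any directory you prefer.
DATA_DIR = Path.cwd() / "data" / "L3_SM_P"


def make_data_dir(path=DATA_DIR):
    """Create the download directory (and any missing parents) if needed."""
    path = Path(path).expanduser()
    # parents=True builds intermediate directories; exist_ok=True makes
    # re-running the notebook cell harmless.
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `make_data_dir()` uses the default, and `make_data_dir("~/smap_data")` shows how a user could point the notebook at another location.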



@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Maybe break these up into how we use them: read data, plotting etc.

@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Line #5.    %matplotlib inline

I don't think this "magic" is required. We should check.

@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Replace "folders" with "Groups" to be consistent with how the documentation describes the HDF5 data structure.

@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Suggest splitting this into three cells: 1) open the file to return a File Object (Lines 1 and 3); 2) list Groups (Lines 6 to 9); 3) list Datasets (Lines 14 and 15).

I wonder if looping through the keys is necessary. https://myhdf5.hdfgroup.org/ is a web-based HDF5 viewer and is a much more convenient way to see the Groups and Datasets.

h5dump is another command line tool we could use.

@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Line #3.    print(type(ret_flag_L3_P))

I am not sure this is necessary. All Dataset data arrays are numpy.ndarray and this is covered when you read the soil moisture variable.

@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

This is a "bandaid" for insufficient information in the SMAP HDF5 files. I also do not think it is the correct approach. The data are projected, so I think we need to provide the projected coordinates, not the geographic coordinates (latitude and longitude), and show how these can be used to plot the data. This is important because CF-Conventions no longer require latitude and longitude, but files should have projected coordinates and a CRS.

If we use latitude and longitude there should be a url for these datasets.
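
A minimal sketch of how projected cell-centre coordinates can be derived from a regular grid definition. The grid parameters below are assumed values for the EASE-Grid 2.0 global 36 km grid (EPSG:6933) and should be verified against the NSIDC grid documentation; the function name `cell_center_coords` is hypothetical:

```python
# Assumed EASE-Grid 2.0 global 36 km grid parameters (EPSG:6933).
# Verify these against the NSIDC EASE-Grid 2.0 documentation before use.
NCOLS, NROWS = 964, 406
CELL_SIZE = 36032.22   # cell width and height in metres
X_MIN = -17367530.45   # left edge of the grid in metres
Y_MAX = 7314540.83     # top edge of the grid in metres


def cell_center_coords(ncols=NCOLS, nrows=NROWS, cell=CELL_SIZE,
                       x_min=X_MIN, y_max=Y_MAX):
    """Return 1-D lists of projected x and y coordinates of cell centres."""
    # x increases left to right; y decreases from the top row downwards.
    x = [x_min + (col + 0.5) * cell for col in range(ncols)]
    y = [y_max - (row + 0.5) * cell for row in range(nrows)]
    return x, y


x, y = cell_center_coords()
```

Coordinates like these, together with the CRS, are what a plotting library needs to place the data in its native projection without a separate latitude and longitude lookup.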



@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Why the Robinson projection? It is not equal area and is not often used to display geophysical data. It is more of a general purpose projection.

For soil moisture, an equal area projection is probably the better projection to use.

@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

This could be a separate tutorial.

@@ -0,0 +1,549 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

This could be streamlined by using xarray.

@@ -0,0 +1,276 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

I think it is useful to split the imports by function: plotting, reading, etc.

@@ -0,0 +1,276 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Line #4.    %matplotlib inline

This magic command should not be required in modern notebooks

@@ -0,0 +1,276 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

from pathlib import Path

# Path.cwd is a method, so it must be called to get the current directory
L3_SM_P_dir = Path.cwd() / 'data' / 'L3_SM_P'
# sorted() turns the glob generator into a reusable, ordered list
flist = sorted(L3_SM_P_dir.glob('*.h5'))

@@ -0,0 +1,276 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Use dataset paths instead of keys and indexed lists:

import h5py

# Open read-only and address each Dataset by its full path within the file
f = h5py.File(filename, 'r')
sm_data = f['Soil_Moisture_Retrieval_Data_AM/soil_moisture'][:]
surf_flag = f['Soil_Moisture_Retrieval_Data_AM/surface_flag'][:]

[:] is enough to read a complete data array.

@@ -0,0 +1,276 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Line #2.    ret_flag_L3_P = f[group_id]['retrieval_qual_flag'][:,:]

As in the comment above.

@@ -0,0 +1,276 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

Line #4.    ret_flags = {

This should be retrieved from the Dataset attributes. Note that the flag meanings given in this notebook are different from those in the attributes.
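
A minimal sketch of decoding a bit-flag value from masks and meanings rather than a hand-written dictionary. It assumes the Dataset carries CF-style `flag_masks` and `flag_meanings` attributes; the actual attribute names and their encoding in the SMAP files should be checked, and the values below are placeholders, not the real SMAP flag definitions:

```python
def decode_flags(value, flag_masks, flag_meanings):
    """Return the meanings whose bit mask is set in an integer flag value."""
    return [
        meaning
        for mask, meaning in zip(flag_masks, flag_meanings)
        if value & mask
    ]


# Placeholder attribute values for illustration only. In a notebook these
# would be read from the Dataset attributes instead of being typed in.
flag_masks = [1, 2, 4, 8]
flag_meanings = "flag_a flag_b flag_c flag_d".split()

# 5 is binary 0101, so the first and third bits are set.
print(decode_flags(5, flag_masks, flag_meanings))
```

Reading the masks and meanings from the file keeps the notebook in step with the product, which avoids the mismatch noted in this comment.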



@@ -0,0 +1,276 @@
{
Collaborator

@andypbarrett andypbarrett Aug 22, 2023

As with the quality flags, I suggest we get these from the Dataset attributes.

@asteiker asteiker self-assigned this Aug 29, 2023
@asteiker asteiker self-requested a review August 29, 2023 15:16
@asteiker asteiker removed their assignment Aug 29, 2023

In this set of three tutorials we demonstrate how to search for, download and plot SMAP data. Tutorial 1 demonstrates how to search for and download SMAP data using the `earthaccess` library. The second tutorial demonstrates how to read in and plot the data downloaded in Tutorial 1. And Tutorial 3 provides information on the surface quality and retrieval quality flags.

We use the [SMAP L3 Radiometer Global Daily 36 km EASE-Grid Soil Moisture, Version 8](https://nsidc.org/data/SPL3SMP/versions/8) data set as an example
Member

Suggested change
We use the [SMAP L3 Radiometer Global Daily 36 km EASE-Grid Soil Moisture, Version 8](https://nsidc.org/data/SPL3SMP/versions/8) data set as an example
We use the [SMAP L3 Radiometer Global Daily 36 km EASE-Grid Soil Moisture, Version 8](https://nsidc.org/data/SPL3SMP/versions/8) data set as an example.


We use the [SMAP L3 Radiometer Global Daily 36 km EASE-Grid Soil Moisture, Version 8](https://nsidc.org/data/SPL3SMP/versions/8) data set as an example

**NOTE** these notebooks are an updated version of the notebooks originally published in this [repo](https://github.com/nsidc/smap_python_notebooks/tree/main). The notebooks are based on notebooks originally provided to NSIDC by Adam Purdy. Jennifer Roebuck of NSIDC applied the following updates:
Member

@andypbarrett Thoughts on how we want to talk across these repos once this is merged into NSIDC-Data-Tutorials? Do we retain smap_python_notebooks with a note that these are no longer supported, or remove/redirect altogether?

Member

@betolink The Linux Test Notebooks did not pass: https://github.com/nsidc/NSIDC-Data-Tutorials/pull/57/checks. Do you know if this is due to a library discrepancy?

Member

@andypbarrett This is an excellent deep dive into the problems and work involved to properly geolocate and plot SMAP data. A few overall thoughts for us to consider:

  1. Dividing into at least 2 parts: 1) Search, access using earthaccess; read using h5py 2) Read, geolocate, plot using xarray, pyproj, etc.
  2. The opinions you have expressed are extremely valuable. I see two audiences, however: We need to communicate this back to the SMAP science team to make recommendations, and then there is our end-user audience whom we need to educate and provide guidance on how to work around these deficiencies. Can we separate or refine the two audiences and goals?
  3. Related to 2., for end-user guidance, I'm wondering whether we ought to more heavily promote the option to reformat to CF-compliant netcdf4 using our on-prem API. I would be happy to add that guidance, perhaps as a separate notebook?

Member

To do: Merge with Andy's search/access guidance in smap_tutorial_test.ipynb: Utilize mkdir guidance for download path

Member

To Do: As with 1.0, merge with smap_tutorial_test guidance: Utilize xarray for plotting, and Andy's other guidance for geolocation.

@asteiker asteiker closed this Jan 22, 2024
@asteiker asteiker deleted the cryo-84 branch January 22, 2024 21:47