Init dataset/layer config for seal tag data pulled from ADC #808

Draft: wants to merge 2 commits into main

@trey-stafford trey-stafford commented Mar 21, 2024

Description

This PR adds a vector dataset of seal tag measurement locations (water temperature, salinity, depth, etc.) from the Arctic Data Center (ADC), for demonstration purposes. The layer is relatively small and "easy" to transform into a QGreenland-ready dataset (a CSV with latitude and longitude columns, converted to a GeoPackage via ogr2ogr), so it may be a good test case for QGreenland-Net work.

Command to generate an (unzipped, -Z option) QGreenland package with the background layer and this new layer:

$ ./scripts/cli.sh run -Z --include="background" --include "seal_tag_measurements"


Note that one aspect of the dataset may not be so "easy" to deal with: some rows have no "Cruise", "Station", "Type", or "Date" values (and their "Longitude" and "Latitude" are NA as well). These rows come in chunks following a row that does have those values. I wonder whether the values are meant to stay "constant" until the next row with updated values for those attributes. E.g.:

"Cruise","Station","Type","Date","Longitude","Latitude","depth_from_surface","Pressure","Temperature","Salinity","Fluorescence","Oxygen"
...
"ct71-01-10",1,"B","2010-08-31 12:00",-43.3587,60.0247,NA,40,-0.475,32.8837,999,999
"",NA,"","",NA,NA,NA,50,-0.6026,32.9236,999,999
"",NA,"","",NA,NA,NA,60,-0.6664,32.9769,999,999
"",NA,"","",NA,NA,NA,80,-0.743,33.1467,999,999
"",NA,"","",NA,NA,NA,100,-0.743,33.2,999,999
"",NA,"","",NA,NA,NA,113,-0.743,33.2,999,999
"ct71-01-10",2,"B","2010-08-31 17:00",-43.2995,60.0515,0,2,5.258,29.966,999,999
"",NA,"","",NA,NA,NA,10,5.1107,30.0879,999,999
"",NA,"","",NA,NA,NA,12,4.9477,30.1562,999,999
"",NA,"","",NA,NA,NA,14,4.8846,30.2049,999,999
"",NA,"","",NA,NA,NA,16,4.7479,30.3,999,999
...

If we decide to actually include this data in QGreenland, we'll need to investigate this further and fix it if needed.
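
If the "constant until the next populated row" guess turns out to be right, the fix could be a simple forward fill. The following is a minimal sketch under that unconfirmed assumption, using a few rows from the excerpt above; the function name, the list of carried-over columns, and the treatment of "" and NA as missing are illustrative choices, not part of this PR.

```python
import csv
import io

# Columns assumed (not confirmed) to carry over from the last populated row.
CARRIED_COLUMNS = ["Cruise", "Station", "Type", "Date", "Longitude", "Latitude"]
# Markers treated as "missing" in this sketch.
MISSING = {"", "NA"}


def forward_fill(rows, columns=CARRIED_COLUMNS):
    """Yield copies of rows, replacing missing values in `columns`
    with the most recent non-missing value seen for that column."""
    last_seen = {}
    for row in rows:
        filled = dict(row)
        for col in columns:
            if filled[col] in MISSING:
                if col in last_seen:
                    filled[col] = last_seen[col]
            else:
                last_seen[col] = filled[col]
        yield filled


# A few rows copied from the excerpt in the PR description.
SAMPLE = '''"Cruise","Station","Type","Date","Longitude","Latitude","depth_from_surface","Pressure","Temperature","Salinity","Fluorescence","Oxygen"
"ct71-01-10",1,"B","2010-08-31 12:00",-43.3587,60.0247,NA,40,-0.475,32.8837,999,999
"",NA,"","",NA,NA,NA,50,-0.6026,32.9236,999,999
"ct71-01-10",2,"B","2010-08-31 17:00",-43.2995,60.0515,0,2,5.258,29.966,999,999
"",NA,"","",NA,NA,NA,10,5.1107,30.0879,999,999
'''

filled_rows = list(forward_fill(csv.DictReader(io.StringIO(SAMPLE))))
for row in filled_rows:
    print(row["Station"], row["Date"], row["Longitude"], row["Latitude"], row["Pressure"])
```

Measurement columns such as depth_from_surface are deliberately left untouched, since nothing suggests they should carry over between rows.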

Checklist

If an item on this list is done or not needed, check it with [x] or click the
checkbox.

  • The PR description links to issues that it resolves with closes #{issue_number}
  • Config lockfile updated (inv config.export > qgreenland/config/cfg-lock.json)
  • Environment lockfile updated if needed (conda-lock)
  • Version bumped if needed (bumpversion (major|minor|patch|prerelease|build))
  • CHANGELOG.md updated (for user-facing changes)
  • Documentation updated if needed
  • New unit tests if needed

This lockfile provides a JSON representation of the config, which is defined in Python. It lets us quickly assess the results of config changes against the previous state.
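
To illustrate the idea (this is not how `inv config.export` is implemented), here is a minimal sketch of serializing Python-defined config objects to deterministic JSON and diffing two states. `ExampleLayer`, `export_config`, and the layer titles are hypothetical stand-ins, not QGreenland classes or values.

```python
import difflib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ExampleLayer:
    """Hypothetical stand-in for a Python-defined config object."""
    id: str
    title: str
    steps: list = field(default_factory=list)


def export_config(layers):
    """Serialize config objects to JSON with stable key order,
    so that diffs only show real config changes."""
    return json.dumps([asdict(layer) for layer in layers], indent=2, sort_keys=True)


before = export_config([ExampleLayer("background", "Background")])
after = export_config(
    [
        ExampleLayer("background", "Background"),
        ExampleLayer("seal_tag_measurements", "Seal tag measurements"),
    ]
)

# A line-based diff of the two JSON documents shows what the change added.
diff = list(
    difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="before",
        tofile="after",
        lineterm="",
    )
)
print("\n".join(diff))
```

Because the JSON is committed alongside the Python config, a reviewer can read the effect of a change in the lockfile diff without evaluating the config code.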

Here we define a new dataset for the layer we're adding. I manually pulled this information directly from the ADC dataset landing page.

For our QGreenland-Net processing, this information can be extracted programmatically from the metadata.
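
As a rough sketch of what "extracted programmatically" could look like, the snippet below pulls a title and abstract out of an XML metadata document. The XML layout, element names, and placeholder values here are simplified assumptions for illustration only; the real metadata schema and how it is fetched are not specified in this PR.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified metadata document with placeholder values.
SAMPLE_METADATA = """<metadata>
  <dataset>
    <title>Placeholder dataset title</title>
    <abstract>Placeholder abstract text.</abstract>
  </dataset>
</metadata>"""


def extract_dataset_fields(xml_text):
    """Return the title and abstract found under the <dataset> element."""
    root = ET.fromstring(xml_text)
    dataset = root.find("dataset")
    if dataset is None:
        raise ValueError("no <dataset> element found in metadata")
    return {
        "title": dataset.findtext("title", default="").strip(),
        "abstract": dataset.findtext("abstract", default="").strip(),
    }


fields = extract_dataset_fields(SAMPLE_METADATA)
print(fields)
```

The resulting dictionary could then feed the same fields that were filled in by hand for this dataset definition.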

dataset = Dataset(
    id="seal_tag_measurements",
    assets=[
        HttpAsset(

This dataset has just one file (asset) that we're interested in pulling: ct71_0DV.csv. The ADC landing page lists other files associated with the dataset (e.g., a browse .png image), but we don't need those in QGreenland.

This is the beginning of the layer configuration. Each subdirectory under layers has a __settings__.py that defines the ordering of layers and subgroups. More info here.

Here is the configuration for the layer itself. It includes the layer title (the name shown in the QGIS Layers panel), a description, the style (not defined here yet, but normally we add a style file for each dataset so it looks good by default), the dataset/asset used as input to create the layer, and the steps required to make it QGreenland-ready.

        asset=dataset.assets["only"],
    ),
    steps=[
        *ogr2ogr(

This helper function runs ogr2ogr under the hood and includes the default options that we normally want for all of our vector layers (e.g., reprojecting to the project's projection). The implementation of this helper is here: https://github.com/nsidc/qgreenland/blob/main/qgreenland/config/helpers/steps/ogr2ogr.py#L18-L46
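
For readers unfamiliar with ogr2ogr, here is a minimal sketch of the kind of command such a step might boil down to for this CSV. It is not the helper linked above: `build_ogr2ogr_command` is a hypothetical function, the output filename is made up, and the source and target projections (EPSG:4326 and EPSG:3413) are assumptions. The `-oo X_POSSIBLE_NAMES` / `Y_POSSIBLE_NAMES` open options belong to GDAL's CSV driver and tell it which columns hold the point coordinates.

```python
import shlex


def build_ogr2ogr_command(
    input_csv,
    output_gpkg,
    *,
    x_col="Longitude",
    y_col="Latitude",
    source_crs="EPSG:4326",  # assumed: coordinates are plain lon/lat
    target_crs="EPSG:3413",  # assumed project projection
):
    """Build an ogr2ogr argument list that converts a CSV with
    coordinate columns into a reprojected GeoPackage."""
    return [
        "ogr2ogr",
        "-f", "GPKG",
        "-oo", f"X_POSSIBLE_NAMES={x_col}",
        "-oo", f"Y_POSSIBLE_NAMES={y_col}",
        "-s_srs", source_crs,
        "-t_srs", target_crs,
        output_gpkg,  # ogr2ogr takes the destination first...
        input_csv,    # ...and the source second
    ]


command = build_ogr2ogr_command("ct71_0DV.csv", "seal_tag_measurements.gpkg")
print(shlex.join(command))
```

The sketch only builds and prints the command; running it (for example with `subprocess.run(command, check=True)`) requires GDAL to be installed. Rows whose coordinate columns are NA, as in the excerpt in the description, would need handling before or during conversion.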
