API Phase 1: Data Retrieval MVP #41

Open
4 tasks
regetz opened this issue Nov 26, 2024 · 1 comment
Labels: 📘 Epic (Targeted capability, feature, or finding), D-2.1 (Refactored data model and design plans)

regetz commented Nov 26, 2024

Outcome

Working implementation of API functionality for retrieving "Plot with Observation" record(s) from the VegBank DB deployed to K8s.

Details

Phased approach:

  1. Get a single plot observation using observation accession code (see open questions below)
  2. Get all plot observations
  3. Get a set of plot observations using a list of observation accession codes
  4. Get a set of plot observations using a query
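To make the four phases concrete, here is a rough sketch of the retrieval modes against an in-memory stand-in for the DB. Everything named here is an assumption for illustration: the function names, the sample accession codes, the `state` field, and modeling a "query" as a Python predicate. None of it is a decided API.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, str]

# Hypothetical in-memory stand-in for the VegBank DB, keyed by observation
# accession code. Codes and the "state" field are invented for illustration.
_STORE: Dict[str, Record] = {
    "OB.1": {"observationAccessionCode": "OB.1", "plotAccessionCode": "PL.1", "state": "CA"},
    "OB.2": {"observationAccessionCode": "OB.2", "plotAccessionCode": "PL.1", "state": "CA"},
    "OB.3": {"observationAccessionCode": "OB.3", "plotAccessionCode": "PL.2", "state": "NC"},
}


def get_one(code: str) -> Optional[Record]:
    """Phase 1: a single plot observation by observation accession code."""
    return _STORE.get(code)


def get_all() -> List[Record]:
    """Phase 2: all plot observations."""
    return list(_STORE.values())


def get_many(codes: Iterable[str]) -> List[Record]:
    """Phase 3: plot observations for a list of codes, skipping unknown codes."""
    return [_STORE[c] for c in codes if c in _STORE]


def get_by_query(predicate: Callable[[Record], bool]) -> List[Record]:
    """Phase 4: plot observations matching a query (modeled as a predicate)."""
    return [r for r in _STORE.values() if predicate(r)]
```

Whether unknown codes in phase 3 should be skipped silently (as here) or reported back to the caller is a design choice this sketch does not settle.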

Scope of information to include with each record:

  • An illustrative and informative subset of fields (TBD) from the plot table, including accessionCode
    • Do not return confidentialityReason, realLatitude, realLongitude
  • Associated place(s) as a nested array of objects, with placeName and accessionCode from namedPlace
  • An illustrative and informative subset of fields (TBD) from the observation table, including accessionCode
  • Associated disturbanceObs as a nested array, with disturbanceType, disturbanceIntensity, and disturbanceComment
  • Associated soilTaxon as a nested object, with soilName, soilLevel, and accessionCode
  • Associated soilObs as a nested array, with soilHorizon, soilTexture, and soilDescription
  • Associated project as a nested object, with projectName and accessionCode
    • Don't need to get associated party information for now (deferring to later)
  • Associated coverMethod as a nested object, with coverType and accessionCode
    • Don't need to get associated coverTypes for now (deferring to later)
  • Associated stratumMethod as a nested object, with stratumMethodName and accessionCode
    • Don't need to get associated stratumTypes for now (deferring to later)
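As a possible illustration of the record scope above, the sketch below drops the three confidential plot fields and attaches places as a nested array and project as a nested object. The nesting keys (`places`, `project`), the helper names, and the sample values are assumptions; only the field names inside them come from the list above.

```python
from typing import Any, Dict, List

# Plot fields that must never be returned, per the scope list.
CONFIDENTIAL_PLOT_FIELDS = {"confidentialityReason", "realLatitude", "realLongitude"}


def public_plot_fields(plot_row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the plot row without confidential fields."""
    return {k: v for k, v in plot_row.items() if k not in CONFIDENTIAL_PLOT_FIELDS}


def build_record(
    plot_row: Dict[str, Any],
    places: List[Dict[str, Any]],
    project: Dict[str, Any],
) -> Dict[str, Any]:
    """Assemble a record with nested places (array) and project (object).

    The keys "places" and "project" are hypothetical names for the nesting.
    """
    record = public_plot_fields(plot_row)
    record["places"] = [
        {"placeName": p["placeName"], "accessionCode": p["accessionCode"]}
        for p in places
    ]
    record["project"] = {
        "projectName": project["projectName"],
        "accessionCode": project["accessionCode"],
    }
    return record
```

The other associated resources (disturbanceObs, soilTaxon, soilObs, coverMethod, stratumMethod) would presumably follow the same pattern: arrays for one-to-many, objects for one-to-one.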

Open questions

  • VegBank maintains accessionCode for both plots and observations. If we are exposing plot observation resources, then presumably we are working with observation accession codes. If we also flatten the returned records in cases where a plot has multiple associated observations (i.e., plot and observation fields are both top-level fields in the JSON record, and plot information is repeated across each associated observation record), then we will need to disambiguate plot.accessionCode from observation.accessionCode.
  • Extending the previous bullet: do we anticipate providing a separate set of endpoints for plots by themselves? Implementation is out of scope for this ticket regardless, but we need to discuss it. If the answer is yes, we should probably implement the current ticket's functionality via a plot_observations (or similarly named) endpoint rather than just plots, because we'll need to reserve the latter for interacting with pure plot resources.
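One way the flattening and disambiguation might look, purely as a sketch: each observation becomes its own record, plot fields are repeated, and the two accession codes are renamed so they no longer collide. The output key names `plotAccessionCode` and `observationAccessionCode`, and the function name, are assumptions and not a settled naming decision.

```python
from typing import Any, Dict, List


def flatten_plot_observations(
    plot: Dict[str, Any], observations: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return one flat record per observation, repeating the plot fields.

    Both inputs carry an "accessionCode" key, so each is renamed with a
    hypothetical prefix. This assumes no other field names collide.
    """
    plot_fields = {k: v for k, v in plot.items() if k != "accessionCode"}
    records = []
    for obs in observations:
        obs_fields = {k: v for k, v in obs.items() if k != "accessionCode"}
        records.append(
            {
                "plotAccessionCode": plot["accessionCode"],
                "observationAccessionCode": obs["accessionCode"],
                **plot_fields,
                **obs_fields,
            }
        )
    return records
```

With a plot that has two observations, this yields two records sharing the same `plotAccessionCode` but carrying distinct `observationAccessionCode` values.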

Out of scope?

  • Retrieving other associated resources beyond what's listed above; this includes taxon observations, taxon interpretations, community classifications, parties, references, etc.
  • Handling embargo information
  • Retrieving associated plot observation graphic information
  • Retrieving associated observationSynonym information
  • TBD - Authentication?
  • TBD - Logging?
  • TBD - Performance testing?
  • TBD - Full testing
  • TBD - Limiting/paging results when many records are returned

Acceptance criteria

  • Running service that accepts requests and returns data as specified above
  • Use of JSON as the serialization format all around
  • Testing of extended characters in requests (queries) and returned records
  • Functionality can be demonstrated using a basic client (e.g. curl or Postman)
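For the extended-character criterion, a test could check that non-ASCII text survives both URL encoding (requests) and JSON serialization (responses). The sketch below uses only the Python standard library; the helper names and the sample strings are invented for illustration.

```python
import json
from urllib.parse import quote, unquote


def encode_query_value(value: str) -> str:
    """Percent-encode a query value as UTF-8, leaving no characters unescaped."""
    return quote(value, safe="")


def roundtrip_json(record: dict) -> dict:
    """Serialize to UTF-8 JSON bytes and parse back, as a client would."""
    payload = json.dumps(record, ensure_ascii=False).encode("utf-8")
    return json.loads(payload.decode("utf-8"))


# Invented sample values covering accented Latin and CJK characters.
samples = ["Peña Blanca", "Forêt d'Écouves", "高山植物"]
for text in samples:
    assert unquote(encode_query_value(text)) == text
    assert roundtrip_json({"placeName": text}) == {"placeName": text}
```

The same strings could then be sent to the running service with curl or Postman to confirm the end-to-end behavior matches.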

Related issues

regetz converted this from a draft issue Nov 26, 2024
regetz added the D-2.1 (Refactored data model and design plans) and 📘 Epic (Targeted capability, feature, or finding) labels Nov 26, 2024

regetz commented Nov 27, 2024

Couple of thoughts as we further groom this.

First, I'm leaning toward sticking with JSON for both requests and responses, including returned data. I think it'll be a simpler starting point for this MVP, and maybe even what we do in the end. Some reasoning:

  1. JSON support is mature, so it'll be easier to get moving quickly
  2. It will naturally accommodate nesting (like species observations within plot observations, and plot observations within plots)
  3. Responses will be simpler because service-related metadata (including pagination info) and veg data can all be in the same JSON response
  4. It will be more human readable, especially when the data are sparse
  5. Similarly, we can validate the data much more easily than if we used something like CSV
  6. It'll be simpler to add fields, during dev/testing time if not later too
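To illustrate point 3, here is a minimal sketch of a single JSON response carrying both service metadata and veg data. The envelope keys (`meta`, `data`, `total`, `offset`, `limit`, `returned`) and the function name are assumptions, not an agreed response format.

```python
import json
from typing import Any, Dict, List


def make_response(
    records: List[Dict[str, Any]], offset: int = 0, limit: int = 100
) -> Dict[str, Any]:
    """Wrap one page of records with pagination metadata in one JSON-ready dict."""
    page = records[offset : offset + limit]
    return {
        "meta": {
            "total": len(records),
            "offset": offset,
            "limit": limit,
            "returned": len(page),
        },
        "data": page,
    }


# Usage: three invented records, second page of size two.
records = [{"observationAccessionCode": f"OB.{i}"} for i in range(1, 4)]
body = json.dumps(make_response(records, offset=2, limit=2))
```

A CSV response would need the pagination details carried somewhere outside the payload (headers, for example), which is part of what makes the JSON envelope simpler.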

Even looking ahead beyond the MVPs, I could imagine keeping JSON everywhere, but selectively adding a few designated "bulk" API actions that support another format for upload and/or export as an option.

Second, after thinking a bit more, I'm comfortable with punting on auth for the data retrieval MVP work, and potentially even for upcoming data upload MVP work as long as we're only deploying to our internal dev environment.
