-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Samples data #741
base: develop
Are you sure you want to change the base?
Samples data #741
Conversation
seeing how readme gets rendered without image (we'll try to get it back though!)
…gered manually. I think there are some rate-limiting features going on in the CI job
Fix WQP siteInfo attribute bug
Merge branch 'main' of https://github.com/DOI-USGS/dataRetrieval into samples_data # Conflicts: # R/readWQPdata.R
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool MR! Left some comments, but generally find the new functions and vignette easy to follow and use.
Weird observation, but I ran this:
test <- read_USGS_samples(pointLocationLatitude = 43.1206415, pointLocationLongitude = -89.3691616, pointLocationWithinMiles = 15, activityStartDateLower = "2024-10-01")
And it failed. Then I looked at the vignette and matched the vignette's example:
test <- read_USGS_samples(pointLocationLatitude = 43.1206415, pointLocationLongitude = -89.3691616, pointLocationWithinMiles = 15, activityStartDateLower = "2024-10-01", dataType = "Monitoring locations", dataProfile = "Site")
And it worked. Then I tried the first query again, and it worked. Not sure if the ordering of the queries matters at all, but I found it peculiar enough to share. Could have totally been coincidence.
tests/testthat/tests_samples.R
Outdated
|
||
|
||
context("samples-data samples") | ||
test_that("samples-data samples working", { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_that("samples-data samples working", { | |
test_that("samples-data activities working", { |
|
||
expect_true(all(c( | ||
"Org_Identifier", | ||
"Org_FormalName" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming you pick these columns because they are the first ones in the returned df if completed successfully/as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Could be more thorough, but that's it.
# no_sf = TRUE, | ||
# warn = TRUE) | ||
# ) | ||
# }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why commented out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NLDI recently switched to api.gov (something like that at least). That means there's now rate limited...which means if I hit the same request a bunch of times, it errors out. We're going to have to figure out how to deal with this eventually (both for NLDI and all future new services)....but that's a problem for another day. In the meantime, Dave recommended commenting this out for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ldecicco-USGS can we create an Issue to track this? Just to make sure future selves remember.
R/readUSGSsample.R
Outdated
#' dataType = "Projects") | ||
#' | ||
#' } | ||
read_USGS_samples <- function(monitoringLocationIdentifier = NA, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but suggest naming file read_USGS_samples.R
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, decided on the snake case after starting the work. Renaming...
R/readUSGSsample.R
Outdated
#' See also: \url{https://api.waterdata.usgs.gov/samples-data/docs}. | ||
#' | ||
#' @param monitoringLocationIdentifier A monitoring location identifier has two parts: the agency code | ||
#' and the location number. Location identifiers should be separated with commas, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#' and the location number. Location identifiers should be separated with commas, | |
#' and the location number, separated by a dash (-). Location identifiers should be separated with commas, |
vignettes/samples_data.Rmd
Outdated
|
||
## Return data types | ||
|
||
There are 2 arguments that dictate what kind of data is returned: dataType and dataProfile. The "dataType" argument define what kind of data comes back, and the "dataProfile" define what columns from that type come back. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 2 arguments that dictate what kind of data is returned: dataType and dataProfile. The "dataType" argument define what kind of data comes back, and the "dataProfile" define what columns from that type come back. | |
There are 2 arguments that dictate what kind of data is returned: dataType and dataProfile. The "dataType" argument defines what kind of data comes back, and the "dataProfile" defines what columns from that type come back. |
R/readUSGSsample.R
Outdated
#' with pointLocationLatitude and pointLocationLongitude | ||
#' @param dataType Options include: "Results", "Monitoring locations", "Activities", | ||
#' "Projects", and "Organizations". | ||
#' @param dataProfile Profile depends on type. Options for "Results" dataType are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing some? From Samples API: Available values: fullphyschem, basicphyschem, fullbio, basicbio, narrow, resultdetectionquantitationlimit, labsampleprep, count
I guess you're matching what's here: https://waterdata.usgs.gov/download-samples/#dataProfile=fullphyschem ?
R/readUSGSsample.R
Outdated
#' "Site" and "Count". Options for "Activities" are "Sample Activities", | ||
#' "Activity Metrics", "Activity Groups", "Count". Options for "Projects" are: | ||
#' "Project" and "Project Monitoring Location Weight". Options for "Organizations" are: | ||
#' "Organization" and "Count". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate matching the services and profiles with the Samples download interface, but I also would prefer to simply match the actual service names as outlined here: https://api.waterdata.usgs.gov/samples-data/docs#/ It's sometimes onerous to remember what is capitalized, what is not, where there are spaces, etc. I admit some impatience here, but I'd prefer the inputs to match what actually shows up in the URL generated at the bottom of the Samples GUI. That also saves you a little bit of effort in the code as well.
R/readUSGSsample.R
Outdated
#' @param characteristic Characteristic is a specific category describing the sample. | ||
#' See available options by running | ||
#' \code{check_param("characteristics")$characteristicName}. | ||
#' @param stateFips State query parameter. To get a list of available state fips, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have a nice example of this in the vignette, but would be helpful to show the general format here as well: "FIPs codes for states take the format: CountryAbbrev:StateNumber, like US:55 for Wisconsin." I got a little turned around trying to write my own function query. First I went to check_params("state")
and then wasn't sure why I'd also need to do stateCdLookup()
, but then realized I needed the correct fips format, which can be extracted using stateCdLookup()
, but I needed to look at the help to understand that the correct function parameterization is stateCdLookup("WI", "fips")
.
R/readUSGSsample.R
Outdated
#' \code{stateCdLookup}. | ||
#' @param countyFips County query parameter. To get a list of available counties, | ||
#' run \code{check_param("counties")}. The "Fips" can be created using the function | ||
#' \code{countyCdLookup}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have a nice example of this in the vignette, but same general comment as above, in case someone does not go to the vignette.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
p.s. in all my muddling through this, the errors were very helpful, and I eventually got to where I needed. So that is definitely a nice aspect of this whole system of helper functions and lookup tables!
No description provided.