This package contains functions to perform geographic queries of several biodiversity databases based on U.S. Fish and Wildlife Service (USFWS) administrative or acquisition property boundaries (plus optional buffers). At present, the package queries the Global Biodiversity Information Facility (GBIF), Biodiversity Information Serving Our Nation (BISON), Integrated Digitized Biocollections (iDigBio), and VertNet. It draws heavily from the outstanding work by the rOpenSci group and their suite of species occurrence packages.
We provide options to:
- scrub records to reduce the number of returned records for each refuge (see scrubbing details below);
- update scientific names to their current accepted or valid form based on the Integrated Taxonomic Information System (ITIS) (see taxonomic details below).
The `fwspp` package requires R (>= 3.4) and Rtools. Both will
require administrative privileges, but the subsequent installation of
packages will not.
With R and Rtools installed, it’s relatively painless to install and
load the `fwspp` package. If you receive an SSL or CA Certificate error,
take the extra steps documented below.
# If devtools package is not installed
install.packages("devtools", dependencies = TRUE)
# Now install fwspp
devtools::install_github("git@github.com:USFWS/fwspp.git", ref = "main")
# Now load the functionality
library("fwspp")
Prior to your first use of `fwspp`, you’ll need to install the USFWS
Cadastral Geodatabase. You can do this now, using the
`install_fws_cadastral` function:
install_fws_cadastral()
or later, when you’ll be prompted by the `fws_occ` function. The process
takes 10 minutes or so to complete.
The `fwspp` package exists strictly to extract occurrence data on USFWS
properties, usually National Wildlife Refuges. Attempts to estimate or
infer relative abundance from these data are strongly discouraged; the
results would almost certainly be meaningless.
Extracting species observation data is essentially a three-step process:
- Decide on the USFWS properties to query
- Decide on the boundary, scrubbing, and output parameters of that query
- Run the query

Okay, four steps:

- Wait… probably a long while…
The easiest way to generate a list (actually a `data.frame`) of USFWS
properties to query is to use the `find_fws` function. With `find_fws`
you can restrict your search of USFWS properties to a particular USFWS
region, particular USFWS property types (see below), and use regular
expressions to make the string matching as strict as necessary. Here are
a few examples:
# Get all National Wildlife Refuges (527 as of 2017-11-10)
# Issues a warning because of two identically-named, but distinct, refuges
all_refs <- find_fws()
# Search across all refuges with a partial name match
# Only 1 refuge contains 'longleaf'; Mountain Longleaf NWR
(ml <- find_fws("longleaf"))
#> ORGNAME FWSREGION RSL_TYPE
#> 1 MOUNTAIN LONGLEAF NATIONAL WILDLIFE REFUGE 4 NWR
#> geom
#> 1 POLYGON ((-85.74307 33.6580...
# Search across all refuges matching multiple strings
(multi <- find_fws(c("longleaf", "romain")))
#> ORGNAME FWSREGION RSL_TYPE
#> 1 CAPE ROMAIN NATIONAL WILDLIFE REFUGE 4 NWR
#> 2 MOUNTAIN LONGLEAF NATIONAL WILDLIFE REFUGE 4 NWR
#> geom
#> 1 MULTIPOLYGON (((-79.64054 3...
#> 2 POLYGON ((-85.74307 33.6580...
# Could also have used regular expression
(multi <- find_fws("longleaf|romain"))
#> ORGNAME FWSREGION RSL_TYPE
#> 1 CAPE ROMAIN NATIONAL WILDLIFE REFUGE 4 NWR
#> 2 MOUNTAIN LONGLEAF NATIONAL WILDLIFE REFUGE 4 NWR
#> geom
#> 1 MULTIPOLYGON (((-79.64054 3...
#> 2 POLYGON ((-85.74307 33.6580...
# Here's a tricky one - Hatchie NWR
# Simple search returns three refuges with 'hatchie' in name
(hatch <- find_fws("hatchie"))
#> ORGNAME FWSREGION RSL_TYPE
#> 1 HATCHIE NATIONAL WILDLIFE REFUGE 4 NWR
#> 2 LOWER HATCHIE NATIONAL WILDLIFE REFUGE 4 NWR
#> 3 TALLAHATCHIE NATIONAL WILDLIFE REFUGE 4 NWR
#> geom
#> 1 MULTIPOLYGON (((-89.25235 3...
#> 2 MULTIPOLYGON (((-89.59588 3...
#> 3 MULTIPOLYGON (((-89.86355 3...
# Option 1: select the one you need ad hoc
(hatch <- hatch[1, ])
#> ORGNAME FWSREGION RSL_TYPE
#> 1 HATCHIE NATIONAL WILDLIFE REFUGE 4 NWR
#> geom
#> 1 MULTIPOLYGON (((-89.25235 3...
# Option 2: regular expressions
(hatch <- find_fws("^hatchie")) # Make name start with 'hatchie'
#> ORGNAME FWSREGION RSL_TYPE
#> 1 HATCHIE NATIONAL WILDLIFE REFUGE 4 NWR
#> geom
#> 1 MULTIPOLYGON (((-89.25235 3...
# Return all southeast (region 4) refuges
r4_refs <- find_fws(region = 4)
nrow(r4_refs)
#> [1] 130
# Return all mountain-prairie (region 6) refuges and waterfowl production areas
r6_all <- find_fws(ptype = c("NWR", "WPA"), region = 6)
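The name matching shown in the examples above can be mimicked in base R without installing anything. The following is only a sketch of the idea, not the package's implementation: `match_names` is a hypothetical helper, and it assumes case-insensitive regular-expression matching, which the lowercase search strings and uppercase results above suggest.

```r
# Toy vector of property names copied from the example output above
orgnames <- c(
  "HATCHIE NATIONAL WILDLIFE REFUGE",
  "LOWER HATCHIE NATIONAL WILDLIFE REFUGE",
  "TALLAHATCHIE NATIONAL WILDLIFE REFUGE",
  "CAPE ROMAIN NATIONAL WILDLIFE REFUGE",
  "MOUNTAIN LONGLEAF NATIONAL WILDLIFE REFUGE"
)

# Hypothetical helper (not part of fwspp): case-insensitive regular
# expression match on names. Multiple patterns are joined with "|",
# so c("a", "b") selects the same names as "a|b".
match_names <- function(patterns, x) {
  x[grepl(paste(patterns, collapse = "|"), x, ignore.case = TRUE)]
}

# An unanchored pattern matches anywhere in the name
hatchie_any <- match_names("hatchie", orgnames)

# Anchoring with "^" keeps only names that start with the pattern
hatchie_start <- match_names("^hatchie", orgnames)

# A vector of patterns and a single alternation give the same result
multi_vec <- match_names(c("longleaf", "romain"), orgnames)
multi_regex <- match_names("longleaf|romain", orgnames)
```

Trying a pattern against a handful of names like this is a quick way to check how strict it is before running a real search.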
The `find_fws` function gives you a `ptype` argument to search for
several USFWS property types. The most common and default property type
is the National Wildlife Refuge (NWR), but other options include
Waterfowl Production Areas (WPA), Wildlife Management Areas (WMA),
National Fish Hatcheries (NFH), and Farm Service Agency indices (FSA).
You can specify multiple options as illustrated in the last example
above.
We offer two options for querying the boundaries of refuges and other
USFWS properties via the `bnd` argument to the `fws_occ` function. The
default `bnd = "admin"` queries those lands and waters administered
by the USFWS in North America, U.S. Trust Territories and Possessions.
It may also include inholdings that are not administered by the USFWS.
The primary source for this information is the USFWS Realty program. See
https://ecos.fws.gov/ServCat/Reference/Profile/82894 for more
information. Using `bnd = "acq"` queries the external boundaries of
lands and waters that are approved for acquisition by the USFWS in
North America, U.S. Trust Territories and Possessions. See
https://ecos.fws.gov/ServCat/Reference/Profile/82893 for more
information.
By default, we scrub a lot of records (`scrub = "strict"`).
Specifically, we endeavor to retain, for a given geometry, a single
record for each species. We attempt to preferentially retain
observations with a URL that best substantiates the observation (i.e.,
the “best” evidence). We rank evidence in the following order: (1) URL
to observation with media (photo, audio, video) or the media itself, (2)
URL to the observation in the original collection, (3) URL of the
collection, with catalog number, or (4) URL of the institution housing
the collection. We do not retain any records for which evidence was not
available (i.e., no associated collection or catalog number). We
optionally offer a less restrictive scrubbing option
(`scrub = "moderate"`) that attempts only to eliminate records sharing
the same catalog number and redundant observations (i.e., multiple
individuals of the same species recorded on the same date at a single
location). Users can also disable scrubbing altogether
(`scrub = "none"`).
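To make the difference between the scrubbing levels concrete, here is a minimal base R sketch on a toy table for a single property. It is an illustration under stated assumptions, not the package's code: the column names (`sci_name`, `catalog_no`, `obs_date`, `loc`, `evidence_rank`) and both helper functions are hypothetical, and `evidence_rank` simply encodes the 1 to 4 ordering described above, with `NA` meaning no evidence.

```r
# Toy occurrence records for one property (all values are made up)
occ <- data.frame(
  sci_name      = c("Species A", "Species A", "Species A",
                    "Species A", "Species B", "Species C"),
  catalog_no    = c("C1", "C1", "C2", "C3", "C4", NA),
  obs_date      = c("2017-05-01", "2017-05-01", "2017-05-01",
                    "2017-06-10", "2017-05-01", "2017-05-01"),
  loc           = c("site1", "site1", "site1", "site1", "site2", "site2"),
  evidence_rank = c(3L, 3L, 1L, 2L, 4L, NA),
  stringsAsFactors = FALSE
)

# Hypothetical "strict" scrub: drop records without evidence, then keep
# the single best-ranked record (lowest rank number) per species
scrub_strict <- function(recs) {
  recs <- recs[!is.na(recs$evidence_rank), , drop = FALSE]
  recs <- recs[order(recs$sci_name, recs$evidence_rank), , drop = FALSE]
  recs <- recs[!duplicated(recs$sci_name), , drop = FALSE]
  rownames(recs) <- NULL
  recs
}

# Hypothetical "moderate" scrub: drop repeated catalog numbers, then drop
# repeats of the same species on the same date at the same location
# (the first record encountered is the one kept)
scrub_moderate <- function(recs) {
  dup_catalog <- !is.na(recs$catalog_no) & duplicated(recs$catalog_no)
  recs <- recs[!dup_catalog, , drop = FALSE]
  key <- recs[c("sci_name", "obs_date", "loc")]
  recs <- recs[!duplicated(key), , drop = FALSE]
  rownames(recs) <- NULL
  recs
}

strict <- scrub_strict(occ)
moderate <- scrub_moderate(occ)
```

In this toy table the strict scrub leaves one record each for Species A and Species B and drops Species C for lack of evidence, while the moderate scrub removes only the two redundant Species A records.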
By default (`taxonomy = TRUE`), we attempt to validate scientific names
against the Integrated Taxonomic Information System (ITIS). The package
does this not by connecting to ITIS directly, but by requesting
information from a REST web service maintained by the National Park
Service (NPS) as part of their NPSpecies database. Note that this
means if taxonomy information is requested, and an ITIS match found, the
scientific name will be converted to the “accepted” ITIS scientific name
(if it wasn’t already), and the corresponding ITIS Taxonomic Serial
Number, common names, NPS-specific taxon code, and a general taxa
“category” (e.g., Mammals, Birds, Fungi) designated by the NPS are
returned. Modifications to observation taxonomy can be suppressed with
`taxonomy = FALSE`.
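The effect of a name update can be pictured as a simple lookup: matched names are replaced by their accepted form and unmatched names pass through unchanged. The sketch below shows only that idea in base R. The lookup table is hand-written for illustration and is not ITIS or NPSpecies output, `update_names` is a hypothetical helper, and no web service is contacted.

```r
# Hand-written toy lookup (assumption for illustration only; this is
# NOT retrieved from ITIS or the NPS web service)
lookup <- data.frame(
  submitted = "Dendroica coronata",
  accepted  = "Setophaga coronata",
  stringsAsFactors = FALSE
)

# Hypothetical helper: swap in the accepted name where a match exists,
# otherwise leave the submitted name untouched
update_names <- function(sci_names, lookup) {
  idx <- match(sci_names, lookup$submitted)
  ifelse(is.na(idx), sci_names, lookup$accepted[idx])
}

obs <- c("Dendroica coronata", "Setophaga coronata", "Unmatched name")
updated <- update_names(obs, lookup)
```

Here the first name is updated, the second is already in its accepted form, and the third has no match, so it is returned as submitted.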
By default, the `fws_occ` function makes the query using the actual
property boundary, either administrative or acquisition. However, the
`buffer` argument provides the option to expand the query of occurrence
records beyond the property boundary into a user-specified buffer (in
kilometers). This may be useful for very small properties in which
species observations in the adjacent areas may provide a good indication
of what is present, or expected to be present, within the boundary. By
design, `fws_occ` generates a lot of messaging as properties are
processed; we like to see that things are moving along. If this annoys
you, specify `verbose = FALSE` and enjoy a very slowly updating progress
bar and limited messaging.
With USFWS properties and query options identified, all that’s left is
to run the `fws_occ` function, passing the object containing the
properties to query generated by `find_fws` and specifying any changes
from the defaults to the `bnd`, `buffer`, `scrub`, `taxonomy`, and
`verbose` options.
Some examples:
# Query all southeast refuges (NWR only), using defaults
r4 <- find_fws(region = 4)
r4_occ <- fws_occ(r4)
# Query mountain-prairie refuges and waterfowl production areas
# Suppress most messaging & replace with progress bar
r6 <- find_fws(region = 6, ptype = c("NWR", "WPA"))
r6_occ <- fws_occ(r6, verbose = FALSE)
# Query Key West NWR with 10 km buffer; suppressing taxonomy changes
kw <- find_fws("key west")
kw_occ <- fws_occ(kw, buffer = 10, taxonomy = FALSE)
Querying many properties can take hours, particularly if they are
relatively large or contain hundreds of thousands of records. Typically
the best option is to set `fws_occ` off and running in the background
(or overnight) and do something more productive with yourself…
Contact the project maintainer for help with this repository.
Contact the project maintainer for information about contributing to this repository. Submit a GitHub Issue to report a bug or request a feature or enhancement.