-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
22 changed files
with
332 additions
and
78 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
{ | ||
"activeTab": 0 | ||
"activeTab": -1 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1 @@ | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/.github/workflows/R-CMD-check-all.yml="1A84B01C" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/.github/workflows/R-CMD-check-month-test.yml="D4BBA76F" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/.gitignore="ECF04EF5" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/NAMESPACE="79356337" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/R/conv_function.R="99CB140E" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/README.md="F2158441" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/inst/examples/get_range_help.R="C765AE8E" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/inst/examples/make_ecoregion_help.R="DBD0841D" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/man/get_gbif.Rd="5FF8208F" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/man/get_range.Rd="D136AD48" | ||
C:/Users/am92guke/Documents/iDiv/code/gbif_range/gbif.range/man/make_ecoregion.Rd="D4D98882" | ||
D:/OneDrive - Eidg. Forschungsanstalt WSL/gitlab/gbif.range/DESCRIPTION="D306FE16" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,180 @@ | ||
### ========================================================================= | ||
### cv blocks | ||
### ========================================================================= | ||
#' Block-wise split data into training and testing | ||
#' | ||
#' Creates a stratum vector based on a data.frame with n columns. If the data.frame | ||
#' has one column, strata vector is created based on clusters separated by quantiles. If | ||
#' the data.frame has two or more columns, strata vector is created based on 'Clustering | ||
#' Large Applications' (function 'clara' from package cluster). Also, instead of a data.frame | ||
#' the argument 'npoints' can be provided, which create groups by random sampling. An | ||
#' opitimization algorithm (function 'gridSearch' from package NMOF) optimizes for equal stratum sizes. | ||
#' | ||
#' @param nfolds Numeric. Number of approximately equal-sized classes (folds) to separate groups in block-cross validation | ||
#' @param df Object of class 'data.frame' with n columns containing critera for cluster building. | ||
#' Not necessary if argument npoints is supplied | ||
#' @param nblocks Number of clusters (blocks) based on the number of folds that should be built. | ||
#' Minimum is the same number as 'nFolds'. Maximum is nrow(df)/10. | ||
#' @param pres Binary vector. Optional argument. If 'df' is supplied, this argument can be used to | ||
#' save processing time. '1' stands for the points on which CLARA is appplied (most likely the species | ||
#' observations), and '0' stands for the points on which K-nearest neighbors is applied relative to | ||
#' the '1' (most likely the absences, background points...). If 'df' is not supplied, for which points | ||
#' should random sampling be made? | ||
#' @param npoints Optional argument if 'df' is not supplied. For how many points should random sampling be made? | ||
#' @return Object of class 'vector' of length nrow(df) or 'npoints', with integers defining | ||
#' different folds | ||
#' @references | ||
#' Brun, P., Thuiller, W., Chauvier, Y., Pellissier, L., Wüest, R. O., Wang, Z., & Zimmermann, N. E. (2020). | ||
#' Model complexity affects species distribution projections under climate change. | ||
#' Journal of Biogeography, 47(1), 130-142. | ||
#' @author Philipp Brun | ||
#' @example inst/examples/make_blocks_help.R | ||
#' @export | ||
#' @importFrom cluster clara | ||
#' @importFrom class knn | ||
#' @importFrom NMOF gridSearch | ||
make_blocks<-function(nfolds = 4, | ||
df = data.frame(), | ||
nblocks = nfolds*5, | ||
npoints = NA, | ||
pres = numeric()){ | ||
|
||
### ------------------------ | ||
### check input data | ||
### ------------------------ | ||
|
||
if (nrow(df)==0 & is.na(npoints)){ | ||
stop("Please supply number of points if no data.frame is supplied") | ||
} | ||
|
||
### ------------------------ | ||
### generate clusters | ||
### ------------------------ | ||
|
||
if (nrow(df)==0){ | ||
|
||
### do ordinary sampling if no strata are supplied | ||
out.strat=sample(rep(1:nfolds,ceiling(npoints/nfolds)),size=npoints) | ||
|
||
} else { | ||
|
||
# check for reasonable number of boxes | ||
if (nrow(df)<4*nblocks){ | ||
stop("Too many boxes required!") | ||
} | ||
|
||
if (ncol(df)==1){ | ||
|
||
### do quantile-based clustering if df contains only one column | ||
rngi = quantile(df[,1],probs=0:(nblocks)/(nblocks)) | ||
rngi[1] = rngi[1]-1 | ||
rngi[length(rngi)] = rngi[length(rngi)]+1 | ||
clist = as.numeric(cut(df[,1],breaks=rngi,right=TRUE)) | ||
|
||
} else { | ||
|
||
# Scale input data | ||
scd = apply(df,2,scale) | ||
|
||
if (length(pres)==0){ | ||
|
||
# do kmedoid clustering for 2 or more columns in df | ||
kmed = clara(scd,k=nblocks,metric="euclidean") | ||
|
||
# get clusters | ||
clist = kmed$clustering | ||
|
||
} else { | ||
|
||
# do kmedoid clustering for 2 or more columns in df | ||
kmed = clara(scd[which(pres==1),],k=nblocks,metric="manhattan") | ||
knnab = knn(train=scd[which(pres==1),],test=scd[which(pres==0),],cl=kmed$clustering) | ||
clist = kmed$clustering | ||
cliful = rep(NA,nrow(scd)) | ||
cliful[which(pres==1)] = kmed$clustering | ||
cliful[which(pres==0)] = as.numeric(knnab) | ||
} | ||
} | ||
|
||
# sort obtained clusters | ||
tbl = sort(table(clist),decreasing = TRUE) | ||
|
||
### ------------------------ | ||
### regularly assign clusters to strata | ||
### ------------------------ | ||
|
||
if (nblocks != nfolds){ | ||
|
||
# prepare strata layers | ||
grps = rep(list(numeric()),nfolds) | ||
|
||
# for the clusters with many observations | ||
# distribute them regularly among strata but keep | ||
# last six clusters for estimating most | ||
# regular distribution | ||
|
||
if (length(tbl)>6){ | ||
|
||
for (i in 1:(length(tbl)-6)){ | ||
|
||
# determine to which stratum the cluster should be | ||
# added | ||
fl=(floor((i-1)/nfolds)) | ||
if (fl%%2==0){ | ||
|
||
j = round(1+nfolds*((i-1)/nfolds-(floor((i-1)/nfolds)))) | ||
|
||
} else { | ||
|
||
j = (nfolds+1)-round(1+nfolds*((i-1)/nfolds-(floor((i-1)/nfolds)))) | ||
} | ||
# add cluster | ||
grps[[j]] = append(grps[[j]],tbl[i]) | ||
} | ||
} | ||
|
||
# prepare for optimal distribution of last 6 clusters | ||
vlis = factor(1:nfolds,levels=1:nfolds) | ||
prs = rep(list(vlis),min(length(tbl),6)) | ||
sstab = tbl[max(1,(length(tbl)-5)):length(tbl)] | ||
|
||
# Run brute-forcing gridSearch obtimization | ||
srch = gridSearch(levels=prs,fun=optme,nms=as.vector(sstab),grps=grps,tot=sum(tbl)) | ||
|
||
# pull out results | ||
wi = as.numeric(as.character(srch$minlevels)) | ||
|
||
# combine results with predistributed clusters | ||
for (i in 1:length(grps)){ | ||
grps[[i]] = append(grps[[i]],sstab[wi==i]) | ||
} | ||
|
||
# define vector with output strata | ||
out.strat = rep(NA,nrow(df)) | ||
for (i in 1:length(grps)){ | ||
|
||
if (length(pres)==0){ | ||
|
||
out.strat[which(as.character(clist)%in%names(grps[[i]]))] = i | ||
|
||
} else { | ||
|
||
out.strat[which(as.character(cliful)%in%names(grps[[i]]))] = i | ||
} | ||
} | ||
|
||
} else { | ||
|
||
# if as many strata as clusters are required, simply return clusters | ||
if (length(pres)==0){ | ||
|
||
out.strat = clist | ||
} else{ | ||
|
||
out.strat = cliful | ||
} | ||
} | ||
} | ||
# return result | ||
return(out.strat) | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
# Get the taxonomy and IUCN status of Panthera tigris (with and without related names) | ||
get_status("Cypripedium calceolus",all=FALSE) | ||
get_status("Cypripedium calceolus",all=TRUE) | ||
get_status("Cypripedium calceolus", all = FALSE) | ||
get_status("Cypripedium calceolus", all = TRUE) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
\dontrun{ | ||
# Downloading worldwide the observations of Panthera tigris | ||
obs.pt <- get_gbif("Panthera tigris", | ||
basis = c("OBSERVATION", | ||
"HUMAN_OBSERVATION", | ||
"MACHINE_OBSERVATION")) | ||
|
||
# Create a vector of folds (n = 5) spatially blocked (n = 10) | ||
block.pt <- make_blocks(nfolds = 5, df = obs.pt[,c("decimalLatitude","decimalLongitude")], nblocks = 10) | ||
|
||
# Plot one colour per fold | ||
countries <- terra::vect(rnaturalearth::ne_countries(type = "countries", returnclass = "sf")) | ||
countries.focus <- crop(countries, ext(60,100,0,40)) | ||
terra::plot(countries.focus, col = "#bcbddc") | ||
graphic::points(obs.pt[,c("decimalLongitude","decimalLatitude")], pch = 20, col = block.pt, cex = 1) | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,22 +1,21 @@ | ||
#TODO transfer to data() | ||
# Open | ||
rst.path <- paste0(system.file(package = "gbif.range"),"/extdata/rst_enl.tif") | ||
rst <- terra::rast(rst.path) | ||
shp.path <- paste0(system.file(package = "gbif.range"),"/extdata/shp_lonlat.shp") | ||
shp.lonlat <- terra::vect(shp.path) | ||
rst <- terra::crop(rst, shp.lonlat) | ||
#plot(crp) | ||
|
||
# Apply the function by infering 50 classes of environments | ||
my.eco <- make_ecoregion(rst,50) | ||
terra::plot(my.eco) | ||
|
||
# Downloading in the European Alps the observations of one plant species | ||
obs.arcto <- get_gbif("Arctostaphylos alpinus",geo=shp.lonlat) | ||
obs.arcto <- get_gbif("Arctostaphylos alpinus", geo = shp.lonlat) | ||
|
||
# Create the range map based on our custom ecoregion | ||
range.arcto <- get_range(obs.arcto, my.eco, "EcoRegion", res=20) | ||
range.arcto <- get_range(obs.arcto, my.eco, "EcoRegion", res = 20) | ||
|
||
# Plot | ||
terra::plot(shp.lonlat, col="grey") | ||
terra::plot(range.arcto,add=TRUE,col=rgb(0.2,1,0.2,0.5,1)) | ||
graphics::points(obs.arcto[,c("decimalLongitude","decimalLatitude")],pch=20,col="orange1",cex=1) | ||
terra::plot(shp.lonlat, col = "grey") | ||
terra::plot(range.arcto, add = TRUE, col = rgb(0.2,1,0.2,0.5,1)) | ||
graphics::points(obs.arcto[,c("decimalLongitude","decimalLatitude")], pch = 20, col = "orange1", cex = 1) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.