Skip to content

Combine acoustic and bottom trawl data

Cole Monnahan edited this page May 19, 2021 · 3 revisions

It is possible to use VAST to fit models that include samples of different categories (species, sizes, vertical layers, etc), but where some samples are "unlabeled" or "partially labeled" (i.e., apply to the sum of density for more than one individual category). This arises naturally in many situations, e.g.:

  • where individuals are sometimes identified to species and other times to genus (such that the density for each genus is the sum of densities across modeled species);
  • when ages are subsampled for some bottom-trawl tows but not others;
  • when animal sex is known for some animals and not others in a two-sex model, etc

This method is demonstrated in:

Cole C Monnahan, James T Thorson, Stan Kotwicki, Nathan Lauffenburger, James N Ianelli, Andre E Punt, Incorporating vertical distribution in index standardization accounts for spatiotemporal availability to acoustic and bottom trawl gear for semi-pelagic species, ICES Journal of Marine Science, 2021;, fsab085, https://doi.org/10.1093/icesjms/fsab085

The demo below is adapted from that paper and shows fits to simulated data for fish biomass in three vertical layers:

  1. The acoustic dead zone (i.e., 0 to 0.5/3 meters above substrate)
  2. Above ADZ and below effective fishing height for vertical herding into a bottom trawl;
  3. Effective fishing height to ocean surface.

The scenario involves data from a bottom trawl (which samples the total biomass in layers 1-2) and an echsounder acoustic survey (which samples biomass in layer 2 and separately in layer 3). The data are simulated based loosely on eastern Bering Sea pollock.

library(VAST)

## Read in prepared data which comes from a simulated example
## loosely conditioned on EBS pollock.

### !! THIS IS NOT REAL DATA !! ###
data(acoustic_and_trawl, package = "FishStatsUtils" )
dat <- subset(acoustic_and_trawl, Year<2012)
str(dat)       # note Gear reflects different data sets
## Make default settings for index standardization
settings <- make_settings(
         n_x = 100,
         Region = "eastern_bering_sea",
         purpose = "index2",
         bias.correct = FALSE,
         fine_scale = TRUE)
## settings$Options$treat_nonencounter_as_zero <- FALSE
settings$FieldConfig[2,2] <- 0 # turn off spatiotemporal LP2

### Setup the "combined" part of the model
## Associate depth strata (0 = <0.5m, 1 = 0.5-16m, 2 = >16m) with each
## observation (row). This is two columns b/c there are
## observations from at most 2 strata.
c_iz <- matrix( c(1,2, 2,NA, 3,NA),
     byrow = TRUE,
     nrow = 3,
     ncol = 2)[as.numeric(dat$Gear)] - 1
unique(c_iz)
## [1,]    0    1  # bottom trawl
## [2,]    1   NA  # acoustic 0.5-16m
## [3,]    2   NA  # acoustic >16m
### Note that NA means to not sum across it. So the bottom trawl
### is the summation over the first two strata, and the two
### acoustic data sets are only from one stratum.

## Run model, with getReportCovariance = TRUE so the Delta method
## can be used below
fit <- fit_model(
    settings = settings,
    Lat_i = dat$Lat,
    Lon_i = dat$Lon,
    t_i = dat$Year,
    c_i = c_iz,
    b_i = dat$Catch_KG,
    a_i = dat$AreaSwept_km2,
    getReportCovariance = TRUE)

## Default plots
plot(fit,
    category_names = c("Acoustic dead zone","ADZ to Effective fishing height","EFH to surface") )

## Can grab index SEs directly from VAST output in the case of
## the depth strata (VAST categories)
years <- sort(unique(dat$Year))
nyr <- length(years)
SD <- fit$parameter_estimates$SD
tmp <- which(names(SD$value) %in% 'Index_ctl')
est <- fit$Report$Index_ctl[,,1]
index.strata <- data.frame(
             year = rep(years, each = 3)+c(-.05, 0, .05),
             stratum = c('<0.5 m', '0.5-16 m', '>16 m'),
             index = SD$value[tmp],
             se = SD$sd[tmp])


## But need to manually calculate SE for the total biomass index
## and two gear indices. Since it's a sum of the three the
## derivatives are all 1 and so the SE is the sqrt of the sum of
## all of the variances and covariances. This feature is not
## coded into VAST yet so have to do it manually. also note that
## the order of the Index_cyl matrix in vector form is Index_11,
## Index_21, Index_31, Index_12,.. etc. This effects the
## subsetting below
cov.index <- SD$cov[tmp,tmp]
## combined is all three strata summed
index1 <- data.frame( year = years-.05,
          gear = 'total',
          index = apply(est[1:3,], 2, sum),
          se = sqrt(sapply(1:nyr, function(i) {j = 1:3+3*(i-1); sum(cov.index[j,j])})))
## sum the top two to get what the ATS sees
index2 <- data.frame( year = years,
          gear = 'AT',
          index = apply(est[2:3,], 2, sum),
          se = sqrt(sapply(1:nyr, function(i) {j = (1:3+3*(i-1))[-1]; sum(cov.index[j,j])})))
## likewise the BTS is just the first strata
index3 <- data.frame( year = years+.05,
          gear = 'BT',
          index = apply(est[1:2,], 2, sum),
          se = sqrt(sapply(1:nyr, function(i) {j = (1:3+3*(i-1))[-3]; sum(cov.index[j,j])})))
index.gear <- rbind(index1,index2, index3)

if(require(ggplot2)){
  ggplot(index.strata,
    aes(year, index, color = stratum, ymin = index-1.96*se,
                           ymax = index+1.96*se)) + geom_linerange(lwd = 1) +
    geom_line(lwd = 2)
  ggplot(index.gear,
    aes(year, index, color = gear, ymin = pmax(0,index-1.96*se), ymax = index+1.96*se)) +
    geom_linerange(lwd = 1) + geom_line(lwd = 2)
}

## Further applications of the Delta method could be used to get
## other derived quantities in a similar manner.
Clone this wiki locally