Let’s perform index standardization with the built-in data for Pacific cod.

  • The density units should be kg/km2.
  • Here, X and Y are coordinates in UTM zone 9.
glimpse(pcod)
#> Rows: 2,143
#> Columns: 12
#> $ year          <int> 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 20…
#> $ X             <dbl> 446.4752, 446.4594, 448.5987, 436.9157, 420.6101, 417.71…
#> $ Y             <dbl> 5793.426, 5800.136, 5801.687, 5802.305, 5771.055, 5772.2…
#> $ depth         <dbl> 201, 212, 220, 197, 256, 293, 410, 387, 285, 270, 381, 1…
#> $ density       <dbl> 113.138476, 41.704922, 0.000000, 15.706138, 0.000000, 0.…
#> $ present       <dbl> 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
#> $ lat           <dbl> 52.28858, 52.34890, 52.36305, 52.36738, 52.08437, 52.094…
#> $ lon           <dbl> -129.7847, -129.7860, -129.7549, -129.9265, -130.1586, -…
#> $ depth_mean    <dbl> 5.155194, 5.155194, 5.155194, 5.155194, 5.155194, 5.1551…
#> $ depth_sd      <dbl> 0.4448783, 0.4448783, 0.4448783, 0.4448783, 0.4448783, 0…
#> $ depth_scaled  <dbl> 0.3329252, 0.4526914, 0.5359529, 0.2877417, 0.8766077, 1…
#> $ depth_scaled2 <dbl> 0.11083919, 0.20492947, 0.28724555, 0.08279527, 0.768440…

First we will create our SPDE mesh. We will use 100 knots for a balance between speed and accuracy in this vignette. You will likely want to use more for applied scenarios. You will want to make sure that increasing the number of knots does not change the conclusions.

pcod_spde <- make_mesh(pcod, c("X", "Y"), cutoff = 10)
plot(pcod_spde)

Let’s fit a GLMM. Note that if we want to use this model for index standardization then we need to include 0 + as.factor(year) or -1 + as.factor(year) so that we have a factor predictor that represents the mean estimate for each time slice.

m <- sdmTMB(
  data = pcod, 
  formula = density ~ 0 + as.factor(year),
  time = "year", spde = pcod_spde, family = tweedie(link = "log"))

We can inspect randomized quantile residuals::

pcod$resids <- residuals(m) # randomized quantile residuals
hist(pcod$resids)

qqnorm(pcod$resids)
abline(a = 0, b = 1)

ggplot(pcod, aes(X, Y, col = resids)) + scale_colour_gradient2() +
  geom_point() + facet_wrap(~year) + coord_fixed()

Now we want to predict on a fine-scale grid on the entire survey domain. There is a grid built into the package for Queen Charlotte Sound named qcs_grid. Our prediction grid also needs to have all the covariates that we used in the model above.

glimpse(qcs_grid)
#> Rows: 65,826
#> Columns: 6
#> $ X             <dbl> 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 4…
#> $ Y             <dbl> 5636, 5636, 5636, 5636, 5636, 5636, 5636, 5636, 5636, 56…
#> $ depth         <dbl> 347.08345, 223.33479, 203.74085, 183.29868, 182.99983, 1…
#> $ depth_scaled  <dbl> 1.56081222, 0.56976987, 0.36336929, 0.12570465, 0.122036…
#> $ depth_scaled2 <dbl> 2.436134789, 0.324637708, 0.132037240, 0.015801658, 0.01…
#> $ year          <int> 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 20…

Now make the predictions on new data.

We will set the area argument to 4 km2 since our grid cells are 2 km x 2 km. If some grid cells were not fully in the survey domain (or were on land), we could feed a vector of grid areas to the area argument that matched the number of grid cells.

predictions <- predict(m, newdata = qcs_grid, return_tmb_object = TRUE, area = 4)

Let’s make a small function to make maps.

plot_map <- function(dat, column) {
  ggplot(dat, aes_string("X", "Y", fill = column)) +
    geom_raster() +
    facet_wrap(~year) +
    coord_fixed()
}

There are four kinds of predictions that we get out of the model. First we will show the predictions that incorporate all fixed effects and random effects:

plot_map(predictions$data, "exp(est)") +
  scale_fill_viridis_c(trans = "sqrt") +
  ggtitle("Prediction (fixed effects + all random effects)")

We can also look at just the fixed effects, here year:

plot_map(predictions$data, "exp(est_non_rf)") +
  ggtitle("Prediction (fixed effects only)") +
  scale_fill_viridis_c(trans = "sqrt")

We can look at the spatial random effects that represent consistent deviations in space through time that are not accounted for by our fixed effects. In other words, these deviations represent consistent biotic and abiotic factors that are affecting biomass density but are not accounted for in the model.

plot_map(predictions$data, "omega_s") +
  ggtitle("Spatial random effects only") +
  scale_fill_gradient2()

And finally we can look at the spatiotemporal random effects that represent deviation from the fixed effect predictions and the spatial random effect deviations. These represent biotic and abiotic factors that are changing through time and are not accounted for in the model.

plot_map(predictions$data, "epsilon_st") +
  ggtitle("Spatiotemporal random effects only") +
  scale_fill_gradient2()

When we ran our predict.sdmTBM() function, it also returned a report from TMB in the output because we included return_tmb_object = TRUE. We can then run our get_index() function to extract the total biomass calculations and standard errors.

# not bias correcting for vignette-building speed:
index <- get_index(predictions, bias_correct = FALSE)
ggplot(index, aes(year, est)) + geom_line() +
  geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.4) +
  xlab('Year') + ylab('Biomass estimate (metric tonnes)')

These are our biomass estimates:

mutate(index, cv = sqrt(exp(se^2) - 1)) %>% 
  select(-log_est, -max_gradient, -bad_eig, -se) %>%
  knitr::kable(format = "pandoc", digits = c(0, 0, 0, 0, 2))
year est lwr upr cv
2003 660609 461278 946075 0.18
2004 1352136 1002971 1822855 0.15
2005 1331649 927702 1911486 0.19
2007 317351 230784 436389 0.16
2009 499248 358191 695853 0.17
2011 992276 751832 1309617 0.14
2013 1015516 740745 1392210 0.16
2015 1073664 805749 1430662 0.15
2017 528919 383349 729767 0.17

We can also calculate an index for part of the survey domain. We’ll make an index for everything south of UTM 5700 by subsetting our prediction grid. For more complicated spatial polygons you could intersect the polygon on the prediction grid using something like sf::st_intersects().

qcs_grid_south <- qcs_grid[qcs_grid$Y < 5700, ]
predictions_south <- predict(m, newdata = qcs_grid_south, 
  return_tmb_object = TRUE, area = 2 * 2)
index_south <- get_index(predictions_south)
head(index_south)
#>   year      est       lwr      upr  log_est        se max_gradient bad_eig
#> 1 2003 185740.6 109731.97 314398.4 12.13211 0.2685305  0.005347066   FALSE
#> 2 2004 436311.8 291882.29 652208.2 12.98611 0.2051092  0.005347066   FALSE
#> 3 2005 289067.0 184739.70 452310.5 12.57441 0.2284279  0.005347066   FALSE
#> 4 2007 130547.5  83008.93 205311.1 11.77949 0.2310190  0.005347066   FALSE
#> 5 2009 226420.5 145660.83 351956.3 12.33015 0.2250618  0.005347066   FALSE
#> 6 2011 328543.1 222636.61 484828.6 12.70242 0.1985380  0.005347066   FALSE

We can visually compare the two indexes:

mutate(index, region = "all") %>% 
  bind_rows(mutate(index_south, region = "south")) %>% 
  ggplot(aes(year, est)) + 
  geom_line(aes(colour = region)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr, fill = region), alpha = 0.4)