Make predictions from an sdmTMB model. The function can predict on the original or new data.

# S3 method for sdmTMB
predict(
  object,
  newdata = object$data,
  se_fit = FALSE,
  return_tmb_object = FALSE,
  area = 1,
  re_form = NULL,
  re_form_iid = NULL,
  sims = 0,
  tmbstan_model = NULL,
  ...
)

Arguments

object

An object from sdmTMB().

newdata

A data frame to make predictions on. This should be a data frame with the same predictor columns as in the fitted data and a time column (if this is a spatiotemporal model) with the same name as in the fitted data. There should be predictor data for each year in the original data set.

se_fit

Should standard errors on predictions at the new locations given by newdata be calculated? Warning: the current implementation can be very slow for large data sets or high-resolution projections. A much faster option is to use the sims argument below and calculate uncertainty on the simulations from the joint precision matrix.

return_tmb_object

Logical. If TRUE, will include the TMB object in a list format output. Necessary for the get_index() or get_cog() functions.

area

A vector of areas for survey grid cells. Only necessary if the output will be passed to get_index() or get_cog() and not all grid cells are of area 1. Should be the same length as the number of rows of newdata. If length 1, will be repeated to match the rows of data.

re_form

NULL to specify including all spatial/spatiotemporal random effects in predictions. ~0 or NA for population-level predictions. Note that unlike lme4 or glmmTMB, this only affects what the standard errors are calculated on if se_fit = TRUE. This does not affect get_index() calculations.

re_form_iid

NULL to specify including all random intercepts in the predictions. ~0 or NA for population-level predictions. No other options (e.g., some but not all random intercepts) are implemented yet. Only affects predictions with newdata. This also affects get_index().

sims

If > 0, simulate from the joint precision matrix with sims draws Returns a matrix of nrow(data) by sim representing the estimates of the linear predictor (i.e., in link space). Can be useful for deriving uncertainty on predictions (e.g., apply(x, 1, sd)) or propagating uncertainty. This is currently the fastest way to generate estimates of uncertainty on predictions in space with sdmTMB.

tmbstan_model

A model fit with tmbstan::tmbstan(). See extract_mcmc() for more details and an example. If specificed, the predict function will return a matrix of a similar form as if sims > 0 but representing Bayesian posterior samples from the Stan model.

...

Not implemented.

Value

If return_tmb_object = FALSE (and sims = 0 and tmbstan_model = NULL): A data frame:

  • est: Estimate in link space (everything is in link space)

  • est_non_rf: Estimate from everything that isn't a random field

  • est_rf: Estimate from all random fields combined

  • omega_s: Spatial (intercept) random field that is constant through time

  • zeta_s: Spatial slope random field

  • epsilon_st: Spatiotemporal (intercept) random fields (could be independent draws each year or AR1)

If return_tmb_object = TRUE (and sims = 0 and tmbstan_model = NULL): A list:

  • data: The data frame described above

  • report: The TMB report on parameter values

  • obj: The TMB object returned from the prediction run

  • fit_obj: The original TMB model object

You likely only need the data element as an end user. The other elements are included for other functions.

If sims > 0 or tmbstan_model is not NULL: A matrix:

  • Columns represent samples

  • Rows represent predictions with one row per row of newdata

Examples

if (require("ggplot2", quietly = TRUE) && inla_installed()) {

# We'll only use a small number of knots so this example runs quickly
# but you will likely want to use more in applied situations.

d <- pcod_2011

pcod_spde <- make_mesh(d, c("X", "Y"), cutoff = 30) # a coarse mesh for example speed
m <- sdmTMB(
 data = d, formula = density ~ 0 + as.factor(year) + depth_scaled + depth_scaled2,
 time = "year", spde = pcod_spde, family = tweedie(link = "log")
)

# Predictions at original data locations -------------------------------

predictions <- predict(m)
head(predictions)

predictions$resids <- residuals(m) # randomized quantile residuals


ggplot(predictions, aes(X, Y, col = resids)) + scale_colour_gradient2() +
  geom_point() + facet_wrap(~year)
hist(predictions$resids)
qqnorm(predictions$resids);abline(a = 0, b = 1)

# Predictions onto new data --------------------------------------------

qcs_grid_2011 <- subset(qcs_grid, year >= min(pcod_2011$year))
predictions <- predict(m, newdata = qcs_grid_2011)

# A short function for plotting our predictions:
plot_map <- function(dat, column = "est") {
  ggplot(dat, aes_string("X", "Y", fill = column)) +
    geom_raster() +
    facet_wrap(~year) +
    coord_fixed()
}

plot_map(predictions, "exp(est)") +
  scale_fill_viridis_c(trans = "sqrt") +
  ggtitle("Prediction (fixed effects + all random effects)")

plot_map(predictions, "exp(est_non_rf)") +
  ggtitle("Prediction (fixed effects and any time-varying effects)") +
  scale_fill_viridis_c(trans = "sqrt")

plot_map(predictions, "est_rf") +
  ggtitle("All random field estimates") +
  scale_fill_gradient2()

plot_map(predictions, "omega_s") +
  ggtitle("Spatial random effects only") +
  scale_fill_gradient2()

plot_map(predictions, "epsilon_st") +
  ggtitle("Spatiotemporal random effects only") +
  scale_fill_gradient2()

# Visualizing a marginal effect ----------------------------------------

nd <- data.frame(depth_scaled =
  seq(min(d$depth_scaled), max(d$depth_scaled), length.out = 100))
nd$depth_scaled2 <- nd$depth_scaled^2

# You'll need at least one time element. If time isn't also a fixed effect
# then it doesn't matter what you pick:
nd$year <- 2011L
p <- predict(m, newdata = nd, se_fit = TRUE, re_form = NA)
ggplot(p, aes(depth_scaled, exp(est),
  ymin = exp(est - 1.96 * est_se), ymax = exp(est + 1.96 * est_se))) +
  geom_line() + geom_ribbon(alpha = 0.4)

# Plotting marginal effect of a spline ---------------------------------

m_gam <- sdmTMB(
 data = d, formula = density ~ 0 + as.factor(year) + s(depth_scaled, k = 3),
 time = "year", spde = pcod_spde, family = tweedie(link = "log")
)
nd <- data.frame(depth_scaled =
  seq(min(d$depth_scaled), max(d$depth_scaled), length.out = 100))
nd$year <- 2011L
p <- predict(m_gam, newdata = nd, se_fit = TRUE, re_form = NA)
ggplot(p, aes(depth_scaled, exp(est),
  ymin = exp(est - 1.96 * est_se), ymax = exp(est + 1.96 * est_se))) +
  geom_line() + geom_ribbon(alpha = 0.4)

# Forecasting ----------------------------------------------------------
pcod_spde <- make_mesh(d, c("X", "Y"), cutoff = 15)

unique(d$year)
m <- sdmTMB(
  data = d, formula = density ~ 1,
  fields = "AR1", # using an AR1 to have something to forecast with
  extra_time = 2019L, # `L` for integer to match our data
  include_spatial = FALSE,
  time = "year", spde = pcod_spde, family = tweedie(link = "log")
)

# Add a year to our grid:
grid2019 <- qcs_grid_2011[qcs_grid_2011$year == max(qcs_grid_2011$year), ]
grid2019$year <- 2019L # `L` because `year` is an integer in the data
qcsgrid_forecast <- rbind(qcs_grid_2011, grid2019)

predictions <- predict(m, newdata = qcsgrid_forecast)
plot_map(predictions, "exp(est)") +
  scale_fill_viridis_c(trans = "log10")
plot_map(predictions, "epsilon_st") +
  scale_fill_gradient2()

# Estimating local trends ----------------------------------------------

pcod_spde <- make_mesh(pcod, c("X", "Y"), cutoff = 25)
m <- sdmTMB(data = pcod, formula = density ~ depth_scaled + depth_scaled2,
  spde = pcod_spde, family = tweedie(link = "log"),
  spatial_trend = TRUE, time = "year", spatial_only = TRUE)
p <- predict(m, newdata = qcs_grid)

plot_map(p, "zeta_s") +
  ggtitle("Spatial slopes") +
  scale_fill_gradient2()

plot_map(p, "est_rf") +
  ggtitle("Random field estimates") +
  scale_fill_gradient2()

plot_map(p, "exp(est_non_rf)") +
  ggtitle("Prediction (fixed effects only)") +
  scale_fill_viridis_c(trans = "sqrt")

plot_map(p, "exp(est)") +
  ggtitle("Prediction (fixed effects + all random effects)") +
  scale_fill_viridis_c(trans = "sqrt")
}

#> Warning: Detected a `s()` smoother. Smoothers are penalized in sdmTMB as
#> of version 0.0.19, but used to be unpenalized.
#> You no longer need to specify `k` since the degree of wiggliness
#> is determined by the data.