Skip to contents

Make predictions from an sdmTMB model; can predict on the original or new data.

Usage

# S3 method for sdmTMB
predict(
  object,
  newdata = object$data,
  type = c("link", "response"),
  se_fit = FALSE,
  re_form = NULL,
  re_form_iid = NULL,
  nsim = 0,
  sims_var = "est",
  model = c(NA, 1, 2),
  offset = NULL,
  tmbstan_model = NULL,
  return_tmb_object = FALSE,
  return_tmb_report = FALSE,
  return_tmb_data = FALSE,
  sims = deprecated(),
  area = deprecated(),
  ...
)

Arguments

object

A model fitted with sdmTMB().

newdata

A data frame to make predictions on. This should be a data frame with the same predictor columns as in the fitted data and a time column (if this is a spatiotemporal model) with the same name as in the fitted data. There should be predictor data for each year in the original data set.

type

Should the est column be in link (default) or response space?

se_fit

Should standard errors on predictions at the new locations given by newdata be calculated? Warning: the current implementation can be slow for large data sets or high-resolution projections unless re_form = NA (omitting random fields). A faster option to approximate point-wise uncertainty is often to use the nsim argument.

re_form

NULL to specify including all spatial/spatiotemporal random effects in predictions. ~0 or NA for population-level predictions. Likely to be used in conjunction with se_fit = TRUE. This does not affect get_index() calculations.

re_form_iid

NULL to specify including all random intercepts in the predictions. ~0 or NA for population-level predictions. No other options (e.g., some but not all random intercepts) are implemented yet. Only affects predictions with newdata. This does affects get_index().

nsim

Experimental: If > 0, simulate from the joint precision matrix with nsim draws. Returns a matrix of nrow(data) by nsim representing the estimates of the linear predictor (i.e., in link space). Can be useful for deriving uncertainty on predictions (e.g., apply(x, 1, sd)) or propagating uncertainty. This is currently the fastest way to generate estimates of uncertainty on predictions in space with sdmTMB.

sims_var

Experimental: Which TMB reported variable from the model should be extracted from the joint precision matrix simulation draws? Defaults to the link-space predictions. Options include: "omega_s", "zeta_s", "epsilon_st", and "est_rf" (as described below). Other options will be passed verbatim.

model

Type of prediction if a delta/hurdle model: NA returns the combined prediction from both components on the link scale for the positive component; 1 or 2 return the first or second model component only on the link or response scale depending on the argument type.

offset

A numeric vector of optional offset values. If left at default NULL, the offset is implicitly left at 0.

tmbstan_model

A model fit with tmbstan::tmbstan(). See extract_mcmc() for more details and an example. If specified, the predict function will return a matrix of a similar form as if nsim > 0 but representing Bayesian posterior samples from the Stan model.

return_tmb_object

Logical. If TRUE, will include the TMB object in a list format output. Necessary for the get_index() or get_cog() functions.

return_tmb_report

Logical: return the output from the TMB report? For regular prediction this is all the reported variables at the MLE parameter values. For nsim > 0 or when tmbstan_model is supplied, this is a list where each element is a sample and the contents of each element is the output of the report for that sample.

return_tmb_data

Logical: return formatted data for TMB? Used internally.

sims

Deprecated. Please use nsim instead.

area

Deprecated. Please use area in get_index().

...

Not implemented.

Value

If return_tmb_object = FALSE (and nsim = 0 and tmbstan_model = NULL):

A data frame:

  • est: Estimate in link space (everything is in link space)

  • est_non_rf: Estimate from everything that isn't a random field

  • est_rf: Estimate from all random fields combined

  • omega_s: Spatial (intercept) random field that is constant through time

  • zeta_s: Spatial slope random field

  • epsilon_st: Spatiotemporal (intercept) random fields, could be off (zero), IID, AR1, or random walk

If return_tmb_object = TRUE (and nsim = 0 and tmbstan_model = NULL):

A list:

  • data: The data frame described above

  • report: The TMB report on parameter values

  • obj: The TMB object returned from the prediction run

  • fit_obj: The original TMB model object

In this case, you likely only need the data element as an end user. The other elements are included for other functions.

If nsim > 0 or tmbstan_model is not NULL:

A matrix:

  • Columns represent samples

  • Rows represent predictions with one row per row of newdata

Examples


d <- pcod_2011
mesh <- make_mesh(d, c("X", "Y"), cutoff = 30) # a coarse mesh for example speed
m <- sdmTMB(
 data = d, formula = density ~ 0 + as.factor(year) + depth_scaled + depth_scaled2,
 time = "year", mesh = mesh, family = tweedie(link = "log")
)

# Predictions at original data locations -------------------------------

predictions <- predict(m)
head(predictions)
#> # A tibble: 6 × 17
#>    year     X     Y depth density present   lat   lon depth_mean depth…¹ depth…²
#>   <int> <dbl> <dbl> <dbl>   <dbl>   <dbl> <dbl> <dbl>      <dbl>   <dbl>   <dbl>
#> 1  2011  435. 5718.   241    245.       1  51.6 -130.       5.16   0.445   0.741
#> 2  2011  487. 5719.    52      0        0  51.6 -129.       5.16   0.445  -2.71 
#> 3  2011  490. 5717.    47      0        0  51.6 -129.       5.16   0.445  -2.93 
#> 4  2011  545. 5717.   157      0        0  51.6 -128.       5.16   0.445  -0.222
#> 5  2011  404. 5720.   398      0        0  51.6 -130.       5.16   0.445   1.87 
#> 6  2011  420. 5721.   486      0        0  51.6 -130.       5.16   0.445   2.32 
#> # … with 6 more variables: depth_scaled2 <dbl>, est <dbl>, est_non_rf <dbl>,
#> #   est_rf <dbl>, omega_s <dbl>, epsilon_st <dbl>, and abbreviated variable
#> #   names ¹​depth_sd, ²​depth_scaled

predictions$resids <- residuals(m) # randomized quantile residuals

library(ggplot2)
ggplot(predictions, aes(X, Y, col = resids)) + scale_colour_gradient2() +
  geom_point() + facet_wrap(~year)

hist(predictions$resids)

qqnorm(predictions$resids);abline(a = 0, b = 1)


# Predictions onto new data --------------------------------------------

qcs_grid_2011 <- subset(qcs_grid, year >= min(pcod_2011$year))
predictions <- predict(m, newdata = qcs_grid_2011)

# A short function for plotting our predictions:
plot_map <- function(dat, column = est) {
  ggplot(dat, aes(X, Y, fill = {{ column }})) +
    geom_raster() +
    facet_wrap(~year) +
    coord_fixed()
}

plot_map(predictions, exp(est)) +
  scale_fill_viridis_c(trans = "sqrt") +
  ggtitle("Prediction (fixed effects + all random effects)")


plot_map(predictions, exp(est_non_rf)) +
  ggtitle("Prediction (fixed effects and any time-varying effects)") +
  scale_fill_viridis_c(trans = "sqrt")


plot_map(predictions, est_rf) +
  ggtitle("All random field estimates") +
  scale_fill_gradient2()


plot_map(predictions, omega_s) +
  ggtitle("Spatial random effects only") +
  scale_fill_gradient2()


plot_map(predictions, epsilon_st) +
  ggtitle("Spatiotemporal random effects only") +
  scale_fill_gradient2()


# Visualizing a marginal effect ----------------------------------------

# See the visreg package or the ggeffects::ggeffect() function
# To do this manually:

nd <- data.frame(depth_scaled =
  seq(min(d$depth_scaled), max(d$depth_scaled), length.out = 100))
nd$depth_scaled2 <- nd$depth_scaled^2

# Because this is a spatiotemporal model, you'll need at least one time
# element. If time isn't also a fixed effect then it doesn't matter what you pick:
nd$year <- 2011L # L: integer to match original data
p <- predict(m, newdata = nd, se_fit = TRUE, re_form = NA)
ggplot(p, aes(depth_scaled, exp(est),
  ymin = exp(est - 1.96 * est_se), ymax = exp(est + 1.96 * est_se))) +
  geom_line() + geom_ribbon(alpha = 0.4)


# Plotting marginal effect of a spline ---------------------------------

m_gam <- sdmTMB(
 data = d, formula = density ~ 0 + as.factor(year) + s(depth_scaled, k = 5),
 time = "year", mesh = mesh, family = tweedie(link = "log")
)
if (require("visreg", quietly = TRUE)) { # just for help docs
  visreg::visreg(m_gam, "depth_scaled")
}


# or manually:
nd <- data.frame(depth_scaled =
  seq(min(d$depth_scaled), max(d$depth_scaled), length.out = 100))
nd$year <- 2011L
p <- predict(m_gam, newdata = nd, se_fit = TRUE, re_form = NA)
ggplot(p, aes(depth_scaled, exp(est),
  ymin = exp(est - 1.96 * est_se), ymax = exp(est + 1.96 * est_se))) +
  geom_line() + geom_ribbon(alpha = 0.4)


# Forecasting ----------------------------------------------------------
mesh <- make_mesh(d, c("X", "Y"), cutoff = 15)

unique(d$year)
#> [1] 2011 2013 2015 2017
m <- sdmTMB(
  data = d, formula = density ~ 1,
  spatiotemporal = "AR1", # using an AR1 to have something to forecast with
  extra_time = 2019L, # `L` for integer to match our data
  spatial = "off",
  time = "year", mesh = mesh, family = tweedie(link = "log")
)

# Add a year to our grid:
grid2019 <- qcs_grid_2011[qcs_grid_2011$year == max(qcs_grid_2011$year), ]
grid2019$year <- 2019L # `L` because `year` is an integer in the data
qcsgrid_forecast <- rbind(qcs_grid_2011, grid2019)

predictions <- predict(m, newdata = qcsgrid_forecast)
plot_map(predictions, exp(est)) +
  scale_fill_viridis_c(trans = "log10")

plot_map(predictions, epsilon_st) +
  scale_fill_gradient2()


# Estimating local trends ----------------------------------------------

d <- pcod
d$year_scaled <- as.numeric(scale(d$year))
mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 25)
m <- sdmTMB(data = d, formula = density ~ depth_scaled + depth_scaled2,
  mesh = mesh, family = tweedie(link = "log"),
  spatial_varying = ~ 0 + year_scaled, time = "year", spatiotemporal = "off")
nd <- qcs_grid
nd$year_scaled <- (nd$year - mean(d$year)) / sd(d$year)
p <- predict(m, newdata = nd)

plot_map(subset(p, year == 2003), zeta_s_year_scaled) + # pick any year
  ggtitle("Spatial slopes") +
  scale_fill_gradient2()


plot_map(p, est_rf) +
  ggtitle("Random field estimates") +
  scale_fill_gradient2()


plot_map(p, exp(est_non_rf)) +
  ggtitle("Prediction (fixed effects only)") +
  scale_fill_viridis_c(trans = "sqrt")


plot_map(p, exp(est)) +
  ggtitle("Prediction (fixed effects + all random effects)") +
  scale_fill_viridis_c(trans = "sqrt")