Make predictions from an sdmTMB model; can predict on the original or new data.
Usage
# S3 method for class 'sdmTMB'
predict(
object,
newdata = NULL,
type = c("link", "response"),
se_fit = FALSE,
re_form = NULL,
re_form_iid = NULL,
nsim = 0,
sims_var = "est",
model = c(NA, 1, 2),
offset = NULL,
mcmc_samples = NULL,
return_tmb_object = FALSE,
return_tmb_report = FALSE,
return_tmb_data = FALSE,
tmbstan_model = deprecated(),
sims = deprecated(),
area = deprecated(),
...
)
Arguments
- object
A model fitted with
sdmTMB()
.- newdata
A data frame to make predictions on. This should be a data frame with the same predictor columns as in the fitted data and a time column (if this is a spatiotemporal model) with the same name as in the fitted data.
- type
Should the
est
column be in link (default) or response space?- se_fit
Should standard errors on predictions at the new locations given by
newdata
be calculated? Warning: the current implementation can be slow for large data sets or high-resolution projections unlessre_form = NA
(omitting random fields). A faster option to approximate point-wise uncertainty is to use thensim
argument.- re_form
NULL
to specify including all spatial/spatiotemporal random effects in predictions.~0
orNA
for population-level predictions. Likely to be used in conjunction withse_fit = TRUE
. This does not affectget_index()
calculations.- re_form_iid
NULL
to specify including all random intercepts in the predictions.~0
orNA
for population-level predictions. No other options (e.g., some but not all random intercepts) are implemented yet. Only affects predictions withnewdata
. This does affectsget_index()
.- nsim
If
> 0
, simulate from the joint precision matrix withnsim
draws. Returns a matrix ofnrow(data)
bynsim
representing the estimates of the linear predictor (i.e., in link space). Can be useful for deriving uncertainty on predictions (e.g.,apply(x, 1, sd)
) or propagating uncertainty. This is currently the fastest way to characterize uncertainty on predictions in space with sdmTMB.- sims_var
Experimental: Which TMB reported variable from the model should be extracted from the joint precision matrix simulation draws? Defaults to link-space predictions. Options include:
"omega_s"
,"zeta_s"
,"epsilon_st"
, and"est_rf"
(as described below). Other options will be passed verbatim.- model
Type of prediction if a delta/hurdle model and
nsim > 0
ormcmc_samples
is supplied:NA
returns the combined prediction from both components on the link scale for the positive component;1
or2
return the first or second model component only on the link or response scale depending on the argumenttype
. For regular prediction from delta models, both sets of predictions are returned.- offset
A numeric vector of optional offset values. If left at default
NULL
, the offset is implicitly left at 0.- mcmc_samples
See
extract_mcmc()
in the sdmTMBextra package for more details and the Bayesian vignette. If specified, the predict function will return a matrix of a similar form as ifnsim > 0
but representing Bayesian posterior samples from the Stan model.- return_tmb_object
Logical. If
TRUE
, will include the TMB object in a list format output. Necessary for theget_index()
orget_cog()
functions.- return_tmb_report
Logical: return the output from the TMB report? For regular prediction, this is all the reported variables at the MLE parameter values. For
nsim > 0
or whenmcmc_samples
is supplied, this is a list where each element is a sample and the contents of each element is the output of the report for that sample.- return_tmb_data
Logical: return formatted data for TMB? Used internally.
- tmbstan_model
Deprecated. See
mcmc_samples
.- sims
Deprecated. Please use
nsim
instead.- area
Deprecated. Please use
area
inget_index()
.- ...
Not implemented.
Value
If return_tmb_object = FALSE
(and nsim = 0
and mcmc_samples = NULL
):
A data frame:
est
: Estimate in link space (everything is in link space)est_non_rf
: Estimate from everything that isn't a random fieldest_rf
: Estimate from all random fields combinedomega_s
: Spatial (intercept) random field that is constant through timezeta_s
: Spatial slope random fieldepsilon_st
: Spatiotemporal (intercept) random fields, could be off (zero), IID, AR1, or random walk
If return_tmb_object = TRUE
(and nsim = 0
and mcmc_samples = NULL
):
A list:
data
: The data frame described abovereport
: The TMB report on parameter valuesobj
: The TMB object returned from the prediction runfit_obj
: The original TMB model object
In this case, you likely only need the data
element as an end user.
The other elements are included for other functions.
If nsim > 0
or mcmc_samples
is not NULL
:
A matrix:
Columns represent samples
Rows represent predictions with one row per row of
newdata
Examples
d <- pcod_2011
mesh <- make_mesh(d, c("X", "Y"), cutoff = 30) # a coarse mesh for example speed
m <- sdmTMB(
data = d, formula = density ~ 0 + as.factor(year) + depth_scaled + depth_scaled2,
time = "year", mesh = mesh, family = tweedie(link = "log")
)
# Predictions at original data locations -------------------------------
predictions <- predict(m)
head(predictions)
#> # A tibble: 6 × 17
#> year X Y depth density present lat lon depth_mean depth_sd
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2011 435. 5718. 241 245. 1 51.6 -130. 5.16 0.445
#> 2 2011 487. 5719. 52 0 0 51.6 -129. 5.16 0.445
#> 3 2011 490. 5717. 47 0 0 51.6 -129. 5.16 0.445
#> 4 2011 545. 5717. 157 0 0 51.6 -128. 5.16 0.445
#> 5 2011 404. 5720. 398 0 0 51.6 -130. 5.16 0.445
#> 6 2011 420. 5721. 486 0 0 51.6 -130. 5.16 0.445
#> # ℹ 7 more variables: depth_scaled <dbl>, depth_scaled2 <dbl>, est <dbl>,
#> # est_non_rf <dbl>, est_rf <dbl>, omega_s <dbl>, epsilon_st <dbl>
predictions$resids <- residuals(m) # randomized quantile residuals
#> Note what used to be the default sdmTMB residuals (before version 0.4.3.9005)
#> are now `type = 'mle-eb'`. We recommend using the current default `'mle-mvn'`,
#> which takes one sample from the approximate posterior of the random effects or
#> `dharma_residuals()` using a similar approach.
library(ggplot2)
ggplot(predictions, aes(X, Y, col = resids)) + scale_colour_gradient2() +
geom_point() + facet_wrap(~year)
hist(predictions$resids)
qqnorm(predictions$resids);abline(a = 0, b = 1)
# Predictions onto new data --------------------------------------------
qcs_grid_2011 <- replicate_df(qcs_grid, "year", unique(pcod_2011$year))
predictions <- predict(m, newdata = qcs_grid_2011)
# \donttest{
# A short function for plotting our predictions:
plot_map <- function(dat, column = est) {
ggplot(dat, aes(X, Y, fill = {{ column }})) +
geom_raster() +
facet_wrap(~year) +
coord_fixed()
}
plot_map(predictions, exp(est)) +
scale_fill_viridis_c(trans = "sqrt") +
ggtitle("Prediction (fixed effects + all random effects)")
plot_map(predictions, exp(est_non_rf)) +
ggtitle("Prediction (fixed effects and any time-varying effects)") +
scale_fill_viridis_c(trans = "sqrt")
plot_map(predictions, est_rf) +
ggtitle("All random field estimates") +
scale_fill_gradient2()
plot_map(predictions, omega_s) +
ggtitle("Spatial random effects only") +
scale_fill_gradient2()
plot_map(predictions, epsilon_st) +
ggtitle("Spatiotemporal random effects only") +
scale_fill_gradient2()
# Visualizing a marginal effect ----------------------------------------
# See the visreg package or the ggeffects::ggeffect() or
# ggeffects::ggpredict() functions
# To do this manually:
nd <- data.frame(depth_scaled =
seq(min(d$depth_scaled), max(d$depth_scaled), length.out = 100))
nd$depth_scaled2 <- nd$depth_scaled^2
# Because this is a spatiotemporal model, you'll need at least one time
# element. If time isn't also a fixed effect then it doesn't matter what you pick:
nd$year <- 2011L # L: integer to match original data
p <- predict(m, newdata = nd, se_fit = TRUE, re_form = NA)
ggplot(p, aes(depth_scaled, exp(est),
ymin = exp(est - 1.96 * est_se), ymax = exp(est + 1.96 * est_se))) +
geom_line() + geom_ribbon(alpha = 0.4)
# Plotting marginal effect of a spline ---------------------------------
m_gam <- sdmTMB(
data = d, formula = density ~ 0 + as.factor(year) + s(depth_scaled, k = 5),
time = "year", mesh = mesh, family = tweedie(link = "log")
)
if (require("visreg", quietly = TRUE)) {
visreg::visreg(m_gam, "depth_scaled")
}
# or manually:
nd <- data.frame(depth_scaled =
seq(min(d$depth_scaled), max(d$depth_scaled), length.out = 100))
nd$year <- 2011L
p <- predict(m_gam, newdata = nd, se_fit = TRUE, re_form = NA)
ggplot(p, aes(depth_scaled, exp(est),
ymin = exp(est - 1.96 * est_se), ymax = exp(est + 1.96 * est_se))) +
geom_line() + geom_ribbon(alpha = 0.4)
# Forecasting ----------------------------------------------------------
mesh <- make_mesh(d, c("X", "Y"), cutoff = 15)
unique(d$year)
#> [1] 2011 2013 2015 2017
m <- sdmTMB(
data = d, formula = density ~ 1,
spatiotemporal = "AR1", # using an AR1 to have something to forecast with
extra_time = 2019L, # `L` for integer to match our data
spatial = "off",
time = "year", mesh = mesh, family = tweedie(link = "log")
)
# Add a year to our grid:
grid2019 <- qcs_grid_2011[qcs_grid_2011$year == max(qcs_grid_2011$year), ]
grid2019$year <- 2019L # `L` because `year` is an integer in the data
qcsgrid_forecast <- rbind(qcs_grid_2011, grid2019)
predictions <- predict(m, newdata = qcsgrid_forecast)
plot_map(predictions, exp(est)) +
scale_fill_viridis_c(trans = "log10")
plot_map(predictions, epsilon_st) +
scale_fill_gradient2()
# Estimating local trends ----------------------------------------------
d <- pcod
d$year_scaled <- as.numeric(scale(d$year))
mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 25)
m <- sdmTMB(data = d, formula = density ~ depth_scaled + depth_scaled2,
mesh = mesh, family = tweedie(link = "log"),
spatial_varying = ~ 0 + year_scaled, time = "year", spatiotemporal = "off")
nd <- replicate_df(qcs_grid, "year", unique(pcod$year))
nd$year_scaled <- (nd$year - mean(d$year)) / sd(d$year)
p <- predict(m, newdata = nd)
plot_map(subset(p, year == 2003), zeta_s_year_scaled) + # pick any year
ggtitle("Spatial slopes") +
scale_fill_gradient2()
plot_map(p, est_rf) +
ggtitle("Random field estimates") +
scale_fill_gradient2()
plot_map(p, exp(est_non_rf)) +
ggtitle("Prediction (fixed effects only)") +
scale_fill_viridis_c(trans = "sqrt")
plot_map(p, exp(est)) +
ggtitle("Prediction (fixed effects + all random effects)") +
scale_fill_viridis_c(trans = "sqrt")
# }