Fit a spatial or spatiotemporal GLMM with TMB. Particularly useful for species distribution models and relative abundance index standardization.

sdmTMB(
  formula,
  data,
  spde,
  time = NULL,
  family = gaussian(link = "identity"),
  time_varying = NULL,
  weights = NULL,
  extra_time = NULL,
  reml = FALSE,
  silent = TRUE,
  multiphase = TRUE,
  anisotropy = FALSE,
  control = sdmTMBcontrol(),
  enable_priors = FALSE,
  ar1_fields = FALSE,
  include_spatial = TRUE,
  spatial_trend = FALSE,
  spatial_only = identical(length(unique(data[[time]])), 1L),
  nlminb_loops = 1,
  newton_steps = 0,
  mgcv = TRUE,
  previous_fit = NULL,
  quadratic_roots = FALSE
)

Arguments

formula

Model formula. See the Details section below for how to specify offsets and threshold parameters. For index standardization, include 0 + as.factor(year) (or whatever the time column is called) in the formula.

data

A data frame.

spde

An object from make_mesh().

time

The time column (as character). Leave as NULL for a spatial-only model.

family

The family and link. Supports gaussian(), Gamma(), binomial(), poisson(), Beta(), nbinom2(), and tweedie()].

time_varying

An optional formula describing covariates that should be modelled as a random walk through time.

weights

Optional likelihood weights for the conditional model. Implemented as in glmmTMB. In other words, weights do not have to sum to one and are not internally modified.

extra_time

Optional extra time slices (e.g., years) to include for interpolation or forecasting with the predict function. See details section.

reml

Logical: use REML estimation rather than maximum likelihood?

silent

Silent or include optimization details?

multiphase

Logical: estimate the fixed and random effects in phases? Phases are usually faster and more stable.

anisotropy

Logical: allow for anisotropy? See plot_anisotropy().

control

Optimization control options. See sdmTMBcontrol().

enable_priors

Should weakly informative priors be enabled? Experimental and likely for use with the tmbstan package. Note that the priors are not yet sensible and Jacobian adjustments are not made. If you are interested in this functionality, please contact the developers.

ar1_fields

Estimate the spatiotemporal random fields as a stationary AR1 process?

include_spatial

Should a separate spatial random field be estimated? If enabled then there will be separate spatial and spatiotemporal fields.

spatial_trend

Should a separate spatial field be included in the trend? Requires spatiotemporal data.

spatial_only

Logical: should only a spatial model be fit (i.e. do not include spatiotemporal random effects)? By default a spatial-only model will be fit if there is only one unique value in the time column or the time argument is left at its default value of NULL.

nlminb_loops

How many times to run stats::nlminb() optimization. Sometimes restarting the optimizer at the previous best values aids convergence. If the maximum gradient is still too large, try increasing this to 2.

newton_steps

How many Newton optimization steps to try with stats::optimHess() after running stats::nlminb(). Sometimes aids convergence.

mgcv

Parse the formula with mgcv::gam()?

previous_fit

A previously fitted sdmTMB model to initialize the optimization with. Can greatly speed up fitting. Note that the data and model must be set up exactly the same way! However, the weights argument can change, which can be useful for cross-validation.

quadratic_roots

Experimental feature for internal use right now; may be moved to a branch. Logical: should quadratic roots be calculated? Note: on the sdmTMB side, the first two coefficients are used to generate the quadratic parameters. This means that if you want to generate a quadratic profile for depth, and depth and depth^2 are part of your formula, you need to make sure these are listed first and that an intercept isn't included. For example, formula = cpue ~ 0 + depth + depth2 + as.factor(year).

Details

Offsets

In the model formula, an offset can be included by including + offset in the model formula (a reserved word). The offset will be included in any prediction. offset must be a column in data.

Threshold models

A linear break-point relationship for a covariate can be included via + breakpt(variable) in the formula, where variable is a single covariate corresponding to a column in data. In this case the relationship is linear up to a point and then constant.

Similarly, a logistic-function threshold model can be included via + logistic(variable). This option models the relationship as a logistic function of the 50% and 95% values. This is similar to length- or size-based selectivity in fisheries, and is parameterized by the points at which f(x) = 0.5 or 0.95. See the vignette.

Note that only a single threshold covariate can be included.

Forecasting or interpolating

Extra time slices (e.g., years) can be included for interpolation or forecasting with the predict function via the extra_time argument. The predict function requires all time slices to be defined when fitting the model to ensure the various time indices are set up correctly. Be careful if including extra time slices that the model remains identifiable. For example, including + as.factor(year) in formula will render a model with no data to inform the expected value in the missing year. sdmTMB() makes no attempt to determine if the model makes sense for forecasting or interpolation. The options time_varying, include_spatial, ar1_fields, time = NULL provide mechanisms to predict over missing time slices.

Examples

d <- subset(pcod, year >= 2011) # subset for example speed pcod_spde <- make_mesh(d, c("X", "Y"), cutoff = 30) # a coarse mesh for example speed plot(pcod_spde)
# Tweedie: m <- sdmTMB(density ~ 0 + depth_scaled + depth_scaled2 + as.factor(year), data = d, time = "year", spde = pcod_spde, family = tweedie(link = "log")) print(m)
#> Spatial model fit by ML ['sdmTMB'] #> Formula: density ~ 0 + depth_scaled + depth_scaled2 + as.factor(year) #> Time column: "year" #> SPDE: pcod_spde #> Data: d #> Family: tweedie(link = 'log') #> coef.est coef.se #> depth_scaled -1.58 0.20 #> depth_scaled2 -1.11 0.10 #> as.factor(year)2011 3.67 0.34 #> as.factor(year)2013 3.87 0.34 #> as.factor(year)2015 3.78 0.34 #> as.factor(year)2017 3.05 0.35 #> #> Matern range parameter: 14.29 #> Dispersion parameter: 14.16 #> Spatial SD (sigma_O): 2.65 #> Spatiotemporal SD (sigma_E): 1.84 #> ML criterion at convergence: 2955.205
tidy(m, conf.int = TRUE)
#> term estimate std.error conf.low conf.high #> 1 depth_scaled -1.576332 0.19996487 -1.968256 -1.1844078 #> 2 depth_scaled2 -1.108996 0.09901708 -1.303066 -0.9149263 #> 3 as.factor(year)2011 3.671943 0.34094724 3.003698 4.3401871 #> 4 as.factor(year)2013 3.873057 0.33816705 3.210262 4.5358527 #> 5 as.factor(year)2015 3.781371 0.33823581 3.118441 4.4443008 #> 6 as.factor(year)2017 3.051184 0.35024463 2.364717 3.7376510
tidy(m, effects = "ran_par", conf.int = TRUE)
#> term estimate std.error conf.low conf.high #> 1 sigma_O 2.649715 3.5506223 0.19168473 36.62779 #> 2 sigma_E 1.835313 2.3634025 0.14709085 22.89996 #> 3 sigma_Z 1.424891 2.0977456 0.07954827 25.52307 #> 4 range 14.286693 21.0330734 0.79759176 255.90736 #> 5 phi 14.155988 0.6919039 12.86281152 15.57918
# Run extra optimization steps to help convergence: m1 <- run_extra_optimization(m, nlminb_loops = 0, newton_steps = 1) max(m$gradients)
#> [1] 0.000457957
max(m1$gradients)
#> [1] 1.466514e-09
# Binomial: pcod_binom <- d pcod_binom$present <- ifelse(pcod_binom$density > 0, 1L, 0L) m_bin <- sdmTMB(present ~ 0 + as.factor(year) + depth_scaled + depth_scaled2, data = pcod_binom, time = "year", spde = pcod_spde, family = binomial(link = "logit")) print(m_bin)
#> Spatial model fit by ML ['sdmTMB'] #> Formula: present ~ 0 + as.factor(year) + depth_scaled + depth_scaled2 #> Time column: "year" #> SPDE: pcod_spde #> Data: pcod_binom #> Family: binomial(link = 'logit') #> coef.est coef.se #> as.factor(year)2011 0.33 0.47 #> as.factor(year)2013 1.49 0.48 #> as.factor(year)2015 1.16 0.47 #> as.factor(year)2017 0.27 0.47 #> depth_scaled -2.15 0.28 #> depth_scaled2 -1.63 0.17 #> #> Matern range parameter: 27.49 #> Spatial SD (sigma_O): 2.51 #> Spatiotemporal SD (sigma_E): 0.22 #> ML criterion at convergence: 470.433
# Gaussian: pcod_gaus <- subset(d, density > 0 & year >= 2013) pcod_spde_gaus <- make_mesh(pcod_gaus, c("X", "Y"), cutoff = 30) m_pos <- sdmTMB(log(density) ~ 0 + as.factor(year) + depth_scaled + depth_scaled2, data = pcod_gaus, time = "year", spde = pcod_spde_gaus) print(m_pos)
#> Spatial model fit by ML ['sdmTMB'] #> Formula: log(density) ~ 0 + as.factor(year) + depth_scaled + depth_scaled2 #> Time column: "year" #> SPDE: pcod_spde_gaus #> Data: pcod_gaus #> Family: gaussian(link = 'identity') #> coef.est coef.se #> as.factor(year)2013 3.63 0.12 #> as.factor(year)2015 3.68 0.13 #> as.factor(year)2017 3.36 0.15 #> depth_scaled -0.21 0.14 #> depth_scaled2 -0.39 0.10 #> #> Matern range parameter: 0.22 #> Dispersion parameter: 1.35 #> Spatial SD (sigma_O): 0.01 #> Spatiotemporal SD (sigma_E): 0.01 #> ML criterion at convergence: 599.014
# With splines via mgcv. # Make sure to pre-specify an appropriate basis dimension (`k`) since # the smoothers are not penalized in the current implementation. # See ?mgcv::choose.k m_gam <- sdmTMB(log(density) ~ 0 + as.factor(year) + s(depth_scaled, k = 4), data = pcod_gaus, time = "year", spde = pcod_spde_gaus) print(m_gam)
#> Spatial model fit by ML ['sdmTMB'] #> Formula: log(density) ~ 0 + as.factor(year) + s(depth_scaled, k = 4) #> Time column: "year" #> SPDE: pcod_spde_gaus #> Data: pcod_gaus #> Family: gaussian(link = 'identity') #> coef.est coef.se #> as.factor(year)2013 3.45 0.11 #> as.factor(year)2015 3.60 0.12 #> as.factor(year)2017 3.23 0.14 #> s(depth_scaled).1 1.04 0.26 #> s(depth_scaled).2 -0.61 0.61 #> s(depth_scaled).3 -0.90 0.30 #> #> Matern range parameter: 0.21 #> Dispersion parameter: 1.32 #> Spatial SD (sigma_O): 0.01 #> Spatiotemporal SD (sigma_E): 0.01 #> ML criterion at convergence: 589.528
# \donttest{ # Fit a spatial only model: m <- sdmTMB( density ~ depth_scaled + depth_scaled2, data = d, spde = pcod_spde, family = tweedie(link = "log")) print(m)
#> Spatial model fit by ML ['sdmTMB'] #> Formula: density ~ depth_scaled + depth_scaled2 #> Time column: NULL #> SPDE: pcod_spde #> Data: d #> Family: tweedie(link = 'log') #> coef.est coef.se #> (Intercept) 3.67 0.30 #> depth_scaled -1.49 0.19 #> depth_scaled2 -1.00 0.09 #> #> Matern range parameter: 0.03 #> Dispersion parameter: 14.78 #> Spatial SD (sigma_O): 1676.98 #> Spatiotemporal SD (sigma_E): not estimated #> ML criterion at convergence: 2974.181
# Spatial-trend example: m <- sdmTMB(density ~ depth_scaled, data = d, spde = pcod_spde, family = tweedie(link = "log"), spatial_trend = TRUE, time = "year") tidy(m, effects = "ran_par")
#> term estimate std.error #> 1 sigma_O 2.5786216 1.3024488 #> 2 sigma_E 0.8739983 0.5275146 #> 3 sigma_Z 0.3300048 0.1877423 #> 4 range 24.7819984 17.9633699 #> 5 phi 15.5051233 0.7261014
# Time-varying effects of depth and depth squared: m <- sdmTMB(density ~ 0 + as.factor(year), time_varying = ~ 0 + depth_scaled + depth_scaled2, data = d, time = "year", spde = pcod_spde, family = tweedie(link = "log")) print(m)
#> Spatial model fit by ML ['sdmTMB'] #> Formula: density ~ 0 + as.factor(year) #> Time column: "year" #> SPDE: pcod_spde #> Data: d #> Family: tweedie(link = 'log') #> coef.est coef.se #> as.factor(year)2011 4.01 0.32 #> as.factor(year)2013 3.68 0.30 #> as.factor(year)2015 4.03 0.31 #> as.factor(year)2017 3.42 0.34 #> #> Time-varying parameters: #> coef.est coef.se #> depth_scaled-2011 -0.78 0.14 #> depth_scaled-2013 -0.75 0.13 #> depth_scaled-2015 -0.73 0.14 #> depth_scaled-2017 -1.02 0.24 #> depth_scaled2-2011 -1.76 0.25 #> depth_scaled2-2013 -0.79 0.13 #> depth_scaled2-2015 -1.53 0.21 #> depth_scaled2-2017 -2.07 0.34 #> #> Matern range parameter: 6.23 #> Dispersion parameter: 13.78 #> Spatial SD (sigma_O): 5.23 #> Spatiotemporal SD (sigma_E): 3.65 #> ML criterion at convergence: 2943.572
# See the b_rw_t estimates; these are the time-varying (random walk) effects. # These could be added to tidy.sdmTMB() eventually. summary(m$sd_report)[1:19,]
#> Estimate Std. Error #> b_j 4.0090872 0.31751960 #> b_j 3.6751811 0.30409948 #> b_j 4.0278776 0.31422339 #> b_j 3.4204687 0.33726406 #> ln_tau_O -2.1297412 13.60884306 #> ln_tau_E -1.7704078 13.59323378 #> ln_kappa -0.7894472 6.89097094 #> thetaf 0.3172205 0.06427312 #> ln_phi 2.6232966 0.04895034 #> ln_tau_V -1.5247832 0.84952336 #> ln_tau_V -0.2008949 0.45430295 #> b_rw_t -0.7777383 0.14304717 #> b_rw_t -0.7505321 0.12626466 #> b_rw_t -0.7338111 0.13539467 #> b_rw_t -1.0153736 0.23736181 #> b_rw_t -1.7649953 0.25180630 #> b_rw_t -0.7903728 0.12822862 #> b_rw_t -1.5261893 0.21146329 #> b_rw_t -2.0694461 0.34416970
# Linear breakpoint model on depth: m_pos <- sdmTMB(log(density) ~ 0 + as.factor(year) + breakpt(depth_scaled) + depth_scaled2, data = pcod_gaus, time = "year", spde = pcod_spde_gaus)
#> Warning: The model may not have converged: non-positive-definite Hessian matrix.
print(m_pos)
#> Spatial model fit by ML ['sdmTMB'] #> Formula: log(density) ~ 0 + as.factor(year) + breakpt(depth_scaled) + #> Formula: depth_scaled2 #> Time column: "year" #> SPDE: pcod_spde_gaus #> Data: pcod_gaus #> Family: gaussian(link = 'identity') #> coef.est coef.se #> as.factor(year)2013 4.24 NaN #> as.factor(year)2015 4.30 NaN #> as.factor(year)2017 4.03 NaN #> depth_scaled2 0.21 NaN #> depth_scaled-slope 1.35 NaN #> depth_scaled-breakpt -0.72 NaN #> #> Matern range parameter: 0.21 #> Dispersion parameter: 1.33 #> Spatial SD (sigma_O): 0.01 #> Spatiotemporal SD (sigma_E): 0.01 #> ML criterion at convergence: 592.503
# }