class: center, middle, inverse, title-slide .title[ # Model selection with sdmTMB ] .subtitle[ ## IMR sdmTMB workshop ] .author[ ### ] .date[ ### May 23–25 2023 ] --- <!-- Build with: xaringan::inf_mr() --> # What is the goal of modeling? * Straight prediction? * in-sample, out-of-sample, both? * Parsimony? * balance of bias-variance tradeoff * Not all metrics appropriate for all applications --- # Too many metrics to discuss * Root mean squared error * AUC (see pseudo-absence example) * true and false positive rates * AIC: widely used tool in ecology * `\(\mathrm{AIC} = -2 \log L + 2K\)`, where `\(K\)` is the number of parameters * designed for fixed effects models (Burnham and Anderson 2002) --- # AIC and likelihood with sdmTMB * `AIC()` and `logLik()` methods work, just like `glm()` ```r mesh <- make_mesh(pcod, xy_cols = c("X", "Y"), cutoff = 10 ) fit <- sdmTMB( present ~ 1, data = pcod, mesh = mesh, family = binomial(link = "logit"), spatial = "on" ) logLik(fit) # log likelihood AIC(fit) # AIC ``` --- # When to use restricted maximum likelihood (REML) * Integrates over random effects *and* fixed effects; sometimes helps convergence too * *Can* use REML when comparing different random effect structures * *Don't* use REML when comparing alternative fixed effects * *Don't* use REML for index standardization .small[ ```r fit <- sdmTMB(..., reml = FALSE) ``` ] --- # Reminder: fixed and random effects in sdmTMB Random effects include * random intercepts: `(1 | group)` * smooth terms: `s(year)` * time-varying coefficients * all random fields * spatially varying coefficients (also random fields) --- # Limitations of AIC * Originally conceived for fixed effects * Burnham and Anderson (2002) * Approximate & problematic for mixed effects * Vaida and Blanchard (2005) * Liang et al. (2008) * Great [FAQ on `glmmTMB` by Ben Bolker](https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#can-i-use-aic-for-mixed-models-how-do-i-count-the-number-of-degrees-of-freedom-for-a-random-effect) --- # Predictive model selection * Ideal world: use cross validation to evaluate predictive accuracy * Split data into train / test sets * Objective function: * maximize log-likelihood of test data --- # Cross validation in sdmTMB .small[ ```r *fit <- sdmTMB_cv( present ~ 1, data = pcod, mesh = mesh, family = binomial(link = "logit"), * k_folds = 8, * fold_ids = NULL, * parallel = TRUE, * use_initial_fit = FALSE ) ``` ] * More folds = more computation time * `use_initial_fit = TRUE` * fits first fold, and initializes subsequent model fits from that --- # Cross validation in sdmTMB `sdmTMB_cv()` returns: - A list of models (each `sdmTMB()` object) - `fold_loglik`: sum of held out likelihoods for each fold - `sum_loglik`: sum across `fold_loglik`, or all data --- # How to choose folds? How many? .small[ * Words of wisdom: * Can be highly variable based on data, folds, sampling scheme * Spatial sampling or random? * [blockCV R package](https://cran.r-project.org/web/packages/blockCV/index.html), [Valavi et al. (2019)](https://doi.org/10.1111/2041-210X.13107) * How to sample with time / years? * LOOCV (leave-one-out...) vs. LFOCV (leave-future-out...) * `sdmTMB_cv()` does random fold assignment * Custom folds can be specified with `fold_ids` ] --- # Basic cross validation example Set a `future::plan()`; the folds will be fit in parallel ```r *library(future) *plan(multisession) mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 10) *m_cv <- sdmTMB_cv( density ~ s(depth, k = 5), data = pcod, mesh = mesh, family = tweedie(link = "log"), * k_folds = 2 ) # Sum of log likelihoods of left-out data: m_cv$sum_loglik #> [1] -22289.95 # Expected log pointwise predictive density # of left-out data: m_cv$elpd #> NULL ``` --- # Leave Future Out Cross Validation * Only forward looking .small[ ```r *fit <- sdmTMB_cv( present ~ 1, data = pcod, mesh = mesh, family = binomial(link = "logit"), * lfo = TRUE, * lfo_forecast = 1, * lfo_validations = 5 ) ``` --- # Model stacking (weighting) * It might be of use for some applications to do model averaging * This is implemented in sdmTMB with `stacking` * [Yao et al. (2018)](doi:10.1214/17-BA1091) --- # Model stacking (weighting) * If `m_cv_1`, `m_cv_2`, `m_cv_3` are all outputs from `sdmTMB_cv()` then ```r models <- list(m_cv_1, m_cv_2, m_cv_3) weights <- sdmTMB_stacking(models) weights ```