class: center, middle, inverse, title-slide # Introduction to random fields ## NOAA PSAW Seminar Series ### March 2, 2022 --- <!-- Build with: xaringan::inf_mr() --> # Who we are <img src="images/intro_slide.png" width="650px" class="center" /> <!-- * Eric Ward, Northwest Fisheries Science Center (Seattle) --> <!-- * Sean Anderson, Fisheries and Oceans Canada (Nanaimo) --> <!-- * Lewis Barnett, Alaska Fisheries Science Center (Seattle) --> <!-- * Philina English, Fisheries and Oceans Canada (Nanaimo) --> --- # Plan for the 2-session course * Session 1 (March 2): describe models and walk through code with us * Session 2 (March 9): review optional exercise; introduce advanced features * Each day, we have ~1.5–2 hours of content, and will plan to take a 10–15 minute break halfway through. * Have questions? Use the [Google Doc](https://docs.google.com/document/d/1MxjLJ4Kz5ASIT15RQWApuf5xI1k-IHFTfd__rHAyuOM/edit?usp=sharing), ask during our break, or at the end. Thanks! --- # Motivating questions * Data often has spatial attributes * Ideal world: * Plug spatial covariates into a GLM / GLMM * Residuals are uncorrelated <img src="01-intro-random-fields_files/figure-html/sim-rf-intro-1.png" width="700px" style="display: block; margin: auto;" /> --- # Reality * Residual spatial autocorrelation <img src="01-intro-random-fields_files/figure-html/sim-rf-intro-cor-1.png" width="700px" style="display: block; margin: auto;" /> --- # Modeling spatial autocorrelation * Need 'wiggly'/smooth surface for approximating all spatial variables missing from model ('latent' variables) * Several equivalent approaches exist * Smooths in `mgcv()` * Random fields and the Stochastic Partial Differential Equation (SPDE) * SPDE differs in that it explicitly estimates parameters for spatial covariance function .xsmall[ Miller, D.L., Glennie, R. & Seaton, A.E. Understanding the Stochastic Partial Differential Equation Approach to Smoothing. JABES 25, 1–16 (2020) ] --- # Matérn covariance Flexible, can be exponential or Gaussian <img src="01-intro-random-fields_files/figure-html/matern-plot-1.png" width="700px" style="display: block; margin: auto;" /> --- # Predictive process models * Estimate spatial field as random effects * High dimensional datasets computationally challenging * Gaussian process predictive process models: * Estimate values at a subset of locations in the time series * 'knots', 'vertices', or 'control points' * Use covariance function to interpolate from knots to locations of observations --- # Predictive process models * More knots (vertical dashed lines) = more wiggliness & parameters to estimate <img src="01-intro-random-fields_files/figure-html/show-gp-1.png" width="700px" style="display: block; margin: auto;" /> --- # Spatial data types * Lattice: gridded data, e.g. interpolated SST from satellite observations -- * Areal: data collected in neighboring spatial areas, e.g. commercial catch records by state / county -- * Georeferenced: data where observations are associated with latitude and longitude * Locations may be unique or repeated (stations) --- # Why is space important? * Data covary spatially (data that are closer are more similar) -- * Relationship between distance and covariance can be described with a spatial covariance function -- * Covariance function in 2D may be * isotropic (same covariance in each direction) * anisotropic (different in each direction) <!-- * Assumed stationary --> --- # What is a random field? <img src="01-intro-random-fields_files/figure-html/random-field-demo-1.png" width="700px" style="display: block; margin: auto;" /> --- background-image: url("images/eagle.png") background-position: bottom right background-size: 35% # Random field <img src="images/rf-wikipedia.png" width="550px" /> --- background-image: url("images/beaker.png") background-position: bottom right background-size: 35% # Random field * A 2 dimensional "Gaussian Process" -- * A realization from a multivariate normal distribution with some covariance function --- background-image: url("images/elmo.png") background-position: bottom right background-size: 30% # Random field * A way of estimating a wiggly surface to account for spatial and/or spatiotemporal correlation in data. -- * Alternatively, a way of estimating a wiggly surface to account for "latent" or unobserved variables. -- * As a bonus, it provides useful covariance parameter estimates: spatial variance and the distance at data points are effectively uncorrelated ("range") <!-- TODO: include nugget / sill? Show slide with semivariogram image? --> --- # Many ways to simulate random fields * `RandomFields::RFsimulate()` simulates univariate / multivariate fields * `fields::sim.rf()` simulates random fields on a grid * `geoR::grf()` simulates random fields with irregular observations * `glmmfields::sim_glmmfields()` simulates random fields with/without extreme values * `sdmTMB::sdmTMB_simulate()` simulates univariate fields with `sdmTMB` ??? Homework: try to work through some of these yourself. Make some plots, and see how changing the covariance affects the smoothness of these fields. --- # Effects of changing variance and range <img src="01-intro-random-fields_files/figure-html/sim-rf-grid-1.png" width="700px" style="display: block; margin: auto;" /> --- # Effects of adding noise * Large observation error looks like noise * `\(\sigma_{obs}\)` >> `\(\sigma_{O}\)`, `\(\sigma_{E}\)` <img src="01-intro-random-fields_files/figure-html/sim-rf-large_phi-1.png" width="700px" style="display: block; margin: auto;" /> --- # Moderate observation errors * `\(\sigma_{obs}\)` = `\(\sigma_{O}\)` = `\(\sigma_{E}\)` <img src="01-intro-random-fields_files/figure-html/sim-rf-med_phi-1.png" width="700px" style="display: block; margin: auto;" /> --- # Small observation errors * `\(\sigma_{obs}\)` << `\(\sigma_{O}\)`, `\(\sigma_{E}\)` <img src="01-intro-random-fields_files/figure-html/sim-rf-small_phi-1.png" width="700px" style="display: block; margin: auto;" /> --- # Estimating random fields .small[ * Georeferenced data often involve 1000s or more points * Like in the 1-D setting, we need to approximate the spatial field * Options include nearest neighbor methods, covariance tapering, etc. * sdmTMB uses an approach from INLA * for VAST users, this is the same * INLA books: <https://www.r-inla.org/learnmore/books> ] --- # INLA and the SPDE approach .xsmall[ * SPDE: stochastic partial differential equation * The solution to a specific SPDE is a Gaussian random field (GRF) with Matérn covariance * This, and sparse precision matrices, let us efficiently fit approximations to GRFs to large spatial datasets * INLA is software that performs data wrangling for SPDE estimation * INLA also performs approximate Bayesian estimation * sdmTMB uses INLA to wrangle matrices, but uses TMB for maximum likelihood estimation ] .tiny[ Lindgren, F., Rue, H., and Lindström, J. 2011. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B. 73(4): 423–498. ] --- # Introducing meshes Implementing the SPDE with INLA requires constructing a 'mesh' <img src="01-intro-random-fields_files/figure-html/mesh-example-1.png" width="700px" style="display: block; margin: auto;" /> --- # Mesh construction .small[ * A unique mesh is generally made for each dataset * Rules of thumb: * More triangles = more computation time * More triangles = more fine-scale spatial predictions * Borders with coarser resolution reduce number of triangles * Use minimum edge size to avoid meshes becoming too fine * Want fewer vertices than data points * Triangle edge size needs to be smaller than spatial range * "How to make a bad mesh?" [Haakon Bakka's book](https://haakonbakkagit.github.io/btopic114.html) ] --- # Building your own mesh * `INLA::inla.mesh.2d()`: lets many arguments be customized * `INLA::meshbuilder()`: Shiny app for constructing a mesh, provides R code * Meshes can include barriers / islands / coastlines with shapefiles * INLA books <https://www.r-inla.org/learnmore/books> --- # Simplifying mesh construction in sdmTMB sdmTMB has a function `make_mesh()` to quickly construct a basic mesh Details in next set of slides --- # Example: cutoff = 50km <img src="01-intro-random-fields_files/figure-html/mesh-example4-1.png" width="700px" style="display: block; margin: auto;" /> --- # Example: cutoff = 25km <img src="01-intro-random-fields_files/figure-html/mesh-example3-1.png" width="700px" style="display: block; margin: auto;" /> --- # Example: cutoff = 10km <img src="01-intro-random-fields_files/figure-html/mesh-example2-1.png" width="700px" style="display: block; margin: auto;" />