class: center, middle, inverse, title-slide # Example: Spatial modeling of trees in a rainforest ## NOAA PSAW Seminar Series ### March 2, 2022 --- <!-- Build with: xaringan::inf_mr() --> # The bei dataset .medium[ ```r dat <- data.frame( x = spatstat.data::bei$x, y = spatstat.data::bei$y ) head(dat) #> x y #> 1 11.7 151.1 #> 2 998.9 430.5 #> 3 980.1 433.5 #> 4 986.5 425.8 #> 5 944.1 415.1 #> 6 940.5 410.4 ``` ] --- # Spatial patterning of trees * Barro Colorado Island = classic in ecology * Hubbell's (2001) unified neutral theory of biodiversity <img src="03-tree-example_files/figure-html/plot-trees-1.png" width="700px" style="display: block; margin: auto;" /> --- # Lattice representation * Used in INLA book and other R packages ('spatstat') <img src="03-tree-example_files/figure-html/plot-trees-grid-1.png" width="700px" style="display: block; margin: auto;" /> --- # Should a spatial model be used? * Diagnostics: Moran's I .small[ ```r library(ape) inv_dists <- as.matrix(dist(dat[,c("x","y")])) diag(inv_dists) <- 0 Moran.I(dat$n, inv_dists) #> $observed #> [1] -0.02739762 #> #> $expected #> [1] -0.005649718 #> #> $sd #> [1] 0.003478684 #> #> $p.value #> [1] 4.058491e-10 ``` ] ??? Could also fit model using `gls()` to estimate spatial range, etc. --- # What type of distribution? * Counts * Skewed * Zero-truncated (locations of measured trees) <img src="03-tree-example_files/figure-html/tree-hist-1.png" width="700px" style="display: block; margin: auto;" /> --- # What type of distribution? * Likely best fit as a point process model for presence-only analysis (e.g., [Renner et al. 2015](https://doi.org/10.1111/2041-210X.12352)) * could be done with a Poisson family and weights in sdmTMB * For simplicity, we could ignore truncation and use `nbinom2()` * sdmTMB has several additional families for censored data: * `truncated_nbinom1()` * `truncated_nbinom2()` * `censored_poisson()` --- # Creating a mesh .small[ ```r mesh <- make_mesh( dat, xy_cols = c("x", "y"), cutoff = 80 # min. distance between knots in X-Y units ) mesh$mesh$n # extract number of vertices/knots #> [1] 78 # quick visualization of mesh plot(mesh) ``` <img src="03-tree-example_files/figure-html/make-mesh-1.png" width="340px" style="display: block; margin: auto;" /> ] --- # Fancier mesh visualizations .small[ ```r ggplot() + * inlabru::gg(mesh$mesh) + geom_point(data = dat, aes(x = x, y = y, col = n)) + coord_equal() ``` <img src="03-tree-example_files/figure-html/unnamed-chunk-3-1.png" width="700px" style="display: block; margin: auto;" /> ] --- # A first model ```r fit <- sdmTMB( n ~ 1, data = dat, mesh = mesh, family = truncated_nbinom2(link = "log"), ) ``` --- # Inspecting the model output .small[ ```r summary(fit) *#> Spatial model fit by ML ['sdmTMB'] #> Formula: n ~ 1 #> Mesh: mesh #> Data: dat #> Family: truncated_nbinom2(link = 'log') #> coef.est coef.se #> (Intercept) 2.68 0.37 #> #> Dispersion parameter: 3.61 #> Matern range: 233.47 #> Spatial SD: 1.02 #> ML criterion at convergence: 658.755 #> #> See ?tidy.sdmTMB to extract these values as a data frame. ``` ] --- # Inspecting the model output .small[ ```r summary(fit) #> Spatial model fit by ML ['sdmTMB'] *#> Formula: n ~ 1 *#> Mesh: mesh *#> Data: dat *#> Family: truncated_nbinom2(link = 'log') #> coef.est coef.se #> (Intercept) 2.68 0.37 #> #> Dispersion parameter: 3.61 #> Matern range: 233.47 #> Spatial SD: 1.02 #> ML criterion at convergence: 658.755 #> #> See ?tidy.sdmTMB to extract these values as a data frame. ``` ] --- # Inspecting the model output .small[ ```r summary(fit) #> Spatial model fit by ML ['sdmTMB'] #> Formula: n ~ 1 #> Mesh: mesh #> Data: dat #> Family: truncated_nbinom2(link = 'log') *#> coef.est coef.se *#> (Intercept) 2.68 0.37 #> #> Dispersion parameter: 3.61 #> Matern range: 233.47 #> Spatial SD: 1.02 #> ML criterion at convergence: 658.755 #> #> See ?tidy.sdmTMB to extract these values as a data frame. ``` ] --- # Inspecting the model output .small[ ```r summary(fit) #> Spatial model fit by ML ['sdmTMB'] #> Formula: n ~ 1 #> Mesh: mesh #> Data: dat #> Family: truncated_nbinom2(link = 'log') #> coef.est coef.se #> (Intercept) 2.68 0.37 #> *#> Dispersion parameter: 3.61 *#> Matern range: 233.47 *#> Spatial SD: 1.02 #> ML criterion at convergence: 658.755 #> #> See ?tidy.sdmTMB to extract these values as a data frame. ``` ] --- # Extracting parameters in a data frame ```r # Main effects: tidy(fit) #> term estimate std.error #> 1 (Intercept) 2.676762 0.3710162 # Variance-related terms: tidy(fit, effects = "ran_pars") #> term estimate std.error #> 1 range 233.469544 NA #> 3 phi 3.606949 NA #> 4 sigma_O 1.020086 NA ``` --- # Extracting parameters in a data frame .small[ ```r tidy(fit, "ran_pars", conf.int = TRUE) %>% as_tibble() #> # A tibble: 3 × 5 #> term estimate std.error conf.low conf.high #> <chr> <dbl> <lgl> <dbl> <dbl> #> 1 range 233. NA 132. 413. #> 2 phi 3.61 NA 2.58 5.04 #> 3 sigma_O 1.02 NA 0.780 1.33 ``` ] --- # Making predictions on original data * sdmTMB has a `predict()` function just like other packages * For help, see `?predict.sdmTMB` .small[ ```r p <- predict(fit) select(p, x, y, est:omega_s) %>% as_tibble() #> # A tibble: 178 × 6 #> x y est est_non_rf est_rf omega_s #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> #> 1 0 0 3.25 2.68 0.572 0.572 #> 2 0 50 3.23 2.68 0.556 0.556 #> 3 0 100 3.22 2.68 0.540 0.540 #> 4 0 150 3.64 2.68 0.964 0.964 #> 5 0 200 4.07 2.68 1.39 1.39 #> 6 0 250 3.62 2.68 0.948 0.948 #> 7 0 300 3.18 2.68 0.507 0.507 #> 8 0 350 3.24 2.68 0.564 0.564 #> 9 0 400 3.30 2.68 0.621 0.621 #> 10 0 450 3.38 2.68 0.708 0.708 #> # … with 168 more rows ``` ] --- # What are the columns in the prediction output? * `est`: Overall estimate **in link space** (log here) * `est_non_rf`: Estimate of non-random-field components * `est_rf`: Estimate of random-field components * `omega_s`: Spatial random field --- # Making predictions on new data * Let's project to the whole region (on a grid) * Create a new data frame for predictions .small[ ```r # makes all combinations of x and y: newdf <- expand.grid( x = unique(dat$x), y = unique(dat$y) ) head(newdf) #> x y #> 1 0 0 #> 2 50 0 #> 3 100 0 #> 4 150 0 #> 5 200 0 #> 6 250 0 ``` ] --- # Making predictions on new data * `predict.sdmTMB()` can also take `newdata`: ```r p <- predict(fit, newdata = newdf) ``` See `?predict.sdmTMB` --- # Plotting fixed effects (the intercept here) .small[ ```r *ggplot(p, aes(x, y, fill = est_non_rf)) + geom_raster() + coord_fixed(expand = FALSE) ``` <img src="03-tree-example_files/figure-html/unnamed-chunk-8-1.png" width="700px" style="display: block; margin: auto;" /> ] --- # Plotting spatial random effects .small[ ```r *ggplot(p, aes(x, y, fill = omega_s)) + geom_raster() + scale_fill_gradient2() + coord_fixed(expand = FALSE) ``` <img src="03-tree-example_files/figure-html/unnamed-chunk-9-1.png" width="700px" style="display: block; margin: auto;" /> ] --- # Predictions in link (log) space .xsmall[ ```r *ggplot(p, aes(x, y, fill = est)) + geom_raster() + scale_fill_viridis_c() + coord_fixed(expand = FALSE) + geom_point(aes(x, y, size = n), data = dat, pch = 21, colour = "grey60", inherit.aes = FALSE) ``` <img src="03-tree-example_files/figure-html/unnamed-chunk-10-1.png" width="700px" style="display: block; margin: auto;" /> ] --- # Predictions in natural space .xsmall[ ```r # Hint: inverse link function also available in `fit$family$linkinv()` *ggplot(p, aes(x, y, fill = exp(est))) + geom_raster() + scale_fill_viridis_c() + coord_fixed(expand = FALSE) + geom_point(aes(x, y, size = n), data = dat, pch = 21, colour = "grey60", inherit.aes = FALSE) ``` <img src="03-tree-example_files/figure-html/unnamed-chunk-11-1.png" width="700px" style="display: block; margin: auto;" /> ]