Example: Spatial modeling of trees in a rainforest

class: center, middle, inverse, title-slide

# Example: Spatial modeling of trees in a rainforest
## NOAA PSAW Seminar Series
### March 2, 2022

---

# The bei dataset

.medium[

```r
dat <- data.frame(
  x = spatstat.data::bei$x, 
  y = spatstat.data::bei$y
)
head(dat)
#>       x     y
#> 1  11.7 151.1
#> 2 998.9 430.5
#> 3 980.1 433.5
#> 4 986.5 425.8
#> 5 944.1 415.1
#> 6 940.5 410.4
```
]

---

# Spatial patterning of trees

* Barro Colorado Island = classic in ecology
  * Hubbell's (2001) unified neutral theory of biodiversity
  
<img src="03-tree-example_files/figure-html/plot-trees-1.png" width="700px" style="display: block; margin: auto;" />

---

# Lattice representation

* Used in INLA book and other R packages ('spatstat')

---

# Should a spatial model be used?

* Diagnostics: Moran's I

.small[

```r
library(ape)
inv_dists <- as.matrix(dist(dat[,c("x","y")]))
diag(inv_dists) <- 0
Moran.I(dat$n, inv_dists)
#> $observed
#> [1] -0.02739762
#> 
#> $expected
#> [1] -0.005649718
#> 
#> $sd
#> [1] 0.003478684
#> 
#> $p.value
#> [1] 4.058491e-10
```
]

???
Could also fit model using `gls()` to estimate spatial range, etc.
 
---

# What type of distribution?

* Counts
* Skewed
* Zero-truncated (locations of measured trees)

---

# What type of distribution?

* Likely best fit as a point process model for presence-only analysis (e.g., [Renner et al. 2015](https://doi.org/10.1111/2041-210X.12352))
  * could be done with a Poisson family and weights in sdmTMB
* For simplicity, we could ignore truncation and use `nbinom2()`
* sdmTMB has several additional families for censored data:
  * `truncated_nbinom1()`
  * `truncated_nbinom2()`
  * `censored_poisson()`
  
---

# Creating a mesh

.small[

```r
mesh <- make_mesh(
  dat, 
  xy_cols = c("x", "y"), 
  cutoff = 80 # min. distance between knots in X-Y units
)
mesh$mesh$n # extract number of vertices/knots
#> [1] 78

# quick visualization of mesh
plot(mesh)
```

<img src="03-tree-example_files/figure-html/make-mesh-1.png" width="340px" style="display: block; margin: auto;" />
]

---

# Fancier mesh visualizations

.small[

```r
ggplot() +
* inlabru::gg(mesh$mesh) +
  geom_point(data = dat, aes(x = x, y = y, col = n)) +
  coord_equal()
```

<img src="03-tree-example_files/figure-html/unnamed-chunk-3-1.png" width="700px" style="display: block; margin: auto;" />
]

---

# A first model

```r
fit <- sdmTMB(
  n ~ 1,
  data = dat,
  mesh = mesh,
  family = truncated_nbinom2(link = "log"),
)
```

---

# Inspecting the model output

.small[

```r
summary(fit)
*#> Spatial model fit by ML ['sdmTMB']
#> Formula: n ~ 1
#> Mesh: mesh
#> Data: dat
#> Family: truncated_nbinom2(link = 'log')
#>             coef.est coef.se
#> (Intercept)     2.68    0.37
#> 
#> Dispersion parameter: 3.61
#> Matern range: 233.47
#> Spatial SD: 1.02
#> ML criterion at convergence: 658.755
#> 
#> See ?tidy.sdmTMB to extract these values as a data frame.
```
]

---

# Inspecting the model output

.small[

```r
summary(fit)
#> Spatial model fit by ML ['sdmTMB']
*#> Formula: n ~ 1
*#> Mesh: mesh
*#> Data: dat
*#> Family: truncated_nbinom2(link = 'log')
#>             coef.est coef.se
#> (Intercept)     2.68    0.37
#> 
#> Dispersion parameter: 3.61
#> Matern range: 233.47
#> Spatial SD: 1.02
#> ML criterion at convergence: 658.755
#> 
#> See ?tidy.sdmTMB to extract these values as a data frame.
```
]

---

# Inspecting the model output

.small[

```r
summary(fit)
#> Spatial model fit by ML ['sdmTMB']
#> Formula: n ~ 1
#> Mesh: mesh
#> Data: dat
#> Family: truncated_nbinom2(link = 'log')
*#>             coef.est coef.se
*#> (Intercept)     2.68    0.37
#> 
#> Dispersion parameter: 3.61
#> Matern range: 233.47
#> Spatial SD: 1.02
#> ML criterion at convergence: 658.755
#> 
#> See ?tidy.sdmTMB to extract these values as a data frame.
```
]

---

# Inspecting the model output

.small[

```r
summary(fit)
#> Spatial model fit by ML ['sdmTMB']
#> Formula: n ~ 1
#> Mesh: mesh
#> Data: dat
#> Family: truncated_nbinom2(link = 'log')
#>             coef.est coef.se
#> (Intercept)     2.68    0.37
#> 
*#> Dispersion parameter: 3.61
*#> Matern range: 233.47
*#> Spatial SD: 1.02
#> ML criterion at convergence: 658.755
#> 
#> See ?tidy.sdmTMB to extract these values as a data frame.
```
]

---

# Extracting parameters in a data frame

```r
# Main effects:
tidy(fit)
#>          term estimate std.error
#> 1 (Intercept) 2.676762 0.3710162

# Variance-related terms:
tidy(fit, effects = "ran_pars")
#>      term   estimate std.error
#> 1   range 233.469544        NA
#> 3     phi   3.606949        NA
#> 4 sigma_O   1.020086        NA
```

---

# Extracting parameters in a data frame

.small[

```r
tidy(fit, "ran_pars", conf.int = TRUE) %>%
  as_tibble()
#> # A tibble: 3 × 5
#>   term    estimate std.error conf.low conf.high
#>   <chr>      <dbl> <lgl>        <dbl>     <dbl>
#> 1 range     233.   NA         132.       413.  
#> 2 phi         3.61 NA           2.58       5.04
#> 3 sigma_O     1.02 NA           0.780      1.33
```
]

---

# Making predictions on original data

* sdmTMB has a `predict()` function just like other packages
* For help, see `?predict.sdmTMB`

.small[

```r
p <- predict(fit)
select(p, x, y, est:omega_s) %>%
  as_tibble()
#> # A tibble: 178 × 6
#>        x     y   est est_non_rf est_rf omega_s
#>    <dbl> <dbl> <dbl>      <dbl>  <dbl>   <dbl>
#>  1     0     0  3.25       2.68  0.572   0.572
#>  2     0    50  3.23       2.68  0.556   0.556
#>  3     0   100  3.22       2.68  0.540   0.540
#>  4     0   150  3.64       2.68  0.964   0.964
#>  5     0   200  4.07       2.68  1.39    1.39 
#>  6     0   250  3.62       2.68  0.948   0.948
#>  7     0   300  3.18       2.68  0.507   0.507
#>  8     0   350  3.24       2.68  0.564   0.564
#>  9     0   400  3.30       2.68  0.621   0.621
#> 10     0   450  3.38       2.68  0.708   0.708
#> # … with 168 more rows
```
]

---

# What are the columns in the prediction output?

* `est`: Overall estimate **in link space** (log here)
* `est_non_rf`: Estimate of non-random-field components
* `est_rf`: Estimate of random-field components
* `omega_s`: Spatial random field

---

# Making predictions on new data

* Let's project to the whole region (on a grid)
* Create a new data frame for predictions

.small[

```r
# makes all combinations of x and y:
newdf <- expand.grid( 
  x = unique(dat$x),
  y = unique(dat$y)
)
head(newdf)
#>     x y
#> 1   0 0
#> 2  50 0
#> 3 100 0
#> 4 150 0
#> 5 200 0
#> 6 250 0
```
]

---

# Making predictions on new data

* `predict.sdmTMB()` can also take `newdata`:

```r
p <- predict(fit, newdata = newdf)
```

See `?predict.sdmTMB`

---

# Plotting fixed effects (the intercept here)

.small[

```r
*ggplot(p, aes(x, y, fill = est_non_rf)) +
  geom_raster() +
  coord_fixed(expand = FALSE)
```

<img src="03-tree-example_files/figure-html/unnamed-chunk-8-1.png" width="700px" style="display: block; margin: auto;" />
]

---

# Plotting spatial random effects

.small[

```r
*ggplot(p, aes(x, y, fill = omega_s)) +
  geom_raster() +
  scale_fill_gradient2() +
  coord_fixed(expand = FALSE)
```

<img src="03-tree-example_files/figure-html/unnamed-chunk-9-1.png" width="700px" style="display: block; margin: auto;" />
]

---

# Predictions in link (log) space

.xsmall[

```r
*ggplot(p, aes(x, y, fill = est)) +
  geom_raster() + scale_fill_viridis_c() +
  coord_fixed(expand = FALSE) +
  geom_point(aes(x, y, size = n), 
    data = dat, pch = 21, colour = "grey60", inherit.aes = FALSE)
```

<img src="03-tree-example_files/figure-html/unnamed-chunk-10-1.png" width="700px" style="display: block; margin: auto;" />
]

---

# Predictions in natural space

.xsmall[

```r
# Hint: inverse link function also available in `fit$family$linkinv()`
*ggplot(p, aes(x, y, fill = exp(est))) +
  geom_raster() + scale_fill_viridis_c() +
  coord_fixed(expand = FALSE) +
  geom_point(aes(x, y, size = n), 
    data = dat, pch = 21, colour = "grey60", inherit.aes = FALSE)
```

<img src="03-tree-example_files/figure-html/unnamed-chunk-11-1.png" width="700px" style="display: block; margin: auto;" />
]