BCCM Results

Travis Tai

Last rendered on 09 November, 2023

Introduction

The British Columbia Continental Margin (BCCM) model is a physical biogeochemical oceanographic model, implemented using the Regional Ocean Modeling System (ROMS). It has a horizontal resolution of 3km x 3km and a vertical discretization, based on bathymetry, of 42 depth levels that increase in resolution near the surface. The BCCM data in pacea are adapted from results provided by Angelica Peña at the Institute of Ocean Sciences (Fisheries and Oceans Canada). For model details see Peña et al. (2019), which should be cited if you use the results.

The results are too big to be data objects in pacea, but are easily downloadable from https://github.com/pbs-assess/pacea-data and cached locally on your computer, as demonstrated below. The user does not need to deal with that website, pacea functions take care of accessing it.

This vignette describes the BCCM results that are available in pacea, and how to use and plot them for your own analyses.

library(pacea)
library(dplyr)
library(sf)
library(ggplot2)

Spatial resolution of the bccm results in pacea

First, pacea contains the objects bc_coast and bc_eez, to show the coastline and Canadian Exclusive Economic Zone (EEZ) off British Columbia. These two objects are data frames that are also sf (simple features) objects, which means that they have spatial and non-spatial attributes. We can use them to plot the coastline and the EEZ (which is the area we are interested in for Canadian stock assessments):

ggplot() +
  geom_sf(data = bc_eez, col = "black", fill = "purple", lty = 2) +
  geom_sf(data = bc_coast)

Results from the BCCM have been processed for pacea and saved on an inshore 2km x 2km grid and an offshore 6km x 6km grid, within the EEZ. The areas for the two grids are the sf objects inshore_poly and offshore_poly, and the combined area is the object bccm_eez_poly. We can plot these (plus the coastline and EEZ) as

ggplot() +
  geom_sf(data = bccm_eez_poly, col = "black", lwd = 2) +
  geom_sf(data = inshore_poly, col = NA, fill = "green") +
  geom_sf(data = offshore_poly, col = NA, fill = "blue") +
  geom_sf(data = bc_eez, col = "black", fill = NA, lty = 2) +
  geom_sf(data = bc_coast)       # Generally best to plot this last

The dashed line is the EEZ, and the combined area bccm_eez_poly (thick solid line) is slightly larger because it has a 10 km buffer added to it, and it does not include a portion in the northwest because this was not in the BCCM model.

We can print these various objects, two of them are

bc_coast
# Simple feature collection with 1 feature and 0 fields
# Geometry type: MULTIPOLYGON
# Dimension:     XY
# Bounding box:  xmin: -134.3472 ymin: 46 xmax: -120 ymax: 56
# Geodetic CRS:  WGS 84
#                         geometry
# 1 MULTIPOLYGON (((-122.6533 4...
bccm_eez_poly
# Simple feature collection with 1 feature and 0 fields
# Geometry type: POLYGON
# Dimension:     XY
# Bounding box:  xmin: 164671.6 ymin: 163875 xmax: 1269171 ymax: 1103817
# Projected CRS: NAD83 / BC Albers
#                         geometry
# 1 POLYGON ((390852.7 1057149,...

Note that the ‘Geodetic CRS’ (co-ordinate reference system) is ‘WGS 84’ for the first one and ‘NAD83 / BC Albers’ for the second. Fortunately, ggplot() recognizes these differences and will reproject the layers to match the first layer. It also provides the axes units in degrees, regardless of projection. Adding layers with different projections using the base plot() function would not work.

The actual grid (2km x 2km and 6km x 6km) of cells on which the BCCM results are saved can also be viewed, though the cells may not be clearly visible if the resolution of the plot is too low:

ggplot() + geom_sf(data = grid26)

All data objects in pacea have help files associated with them, viewed with, for example, ?bccm_eez_poly, and we recommend reading them for more details.

Downloading BCCM data

The BCCM results were wrangled to simplify data format and structure. Because the results are spatially and temporally resolved, the compressed file size of each variable is fairly large (between 30-40 Mb). To keep the pacea package a reasonable downloadable size, the BCCM results are stored in a separate GitHub repository (https://github.com/pbs-assess/pacea-data) and can be downloaded locally using functions within pacea (obviously requiring internet access). The downloading is only needed once as the files get cached locally on your computer.

The list of variables of the BCCM results available can be viewed from the list bccm_data. BCCM results can be accessed using placeholder functions, named as in bccm_data, that download the data to a local cache folder - this directory can be identified using pacea_cache(). For example, if one wanted the BCCM surface temperature data, they would use the bccm_surface_temperature() function to download it.

View data function help files for more information on bccm results (e.g. ?bccm_bottom_oxygen)

# list available BCCM data
bccm_data
#                        data_name
# 1            bccm_surface_oxygen
# 2                bccm_surface_ph
# 3          bccm_surface_salinity
# 4       bccm_surface_temperature
# 5          bccm_avg0to40m_oxygen
# 6              bccm_avg0to40m_ph
# 7        bccm_avg0to40m_salinity
# 8     bccm_avg0to40m_temperature
# 9        bccm_avg40to100m_oxygen
# 10           bccm_avg40to100m_ph
# 11     bccm_avg40to100m_salinity
# 12  bccm_avg40to100m_temperature
# 13      bccm_avg100mtoBot_oxygen
# 14          bccm_avg100mtoBot_ph
# 15    bccm_avg100mtoBot_salinity
# 16 bccm_avg100mtoBot_temperature
# 17            bccm_bottom_oxygen
# 18                bccm_bottom_ph
# 19          bccm_bottom_salinity
# 20       bccm_bottom_temperature
# 21            bccm_phytoplankton
# 22        bccm_primaryproduction
# local cache folder
pacea_cache()
"C:\\Users\\<your-windows-username>\\AppData\\Local/pacea/Cache"
# On linux and Macs this will look different.
# download data to cache folder and name object; `force = TRUE` to skip user prompt
pdata <- bccm_surface_temperature(force = TRUE)

If force = FALSE, the user will be prompted on whether to download the data to the cache folder.

Once the data have been downloaded into the cache folder, the placeholder function will simply load the data into the current workspace (and not redownload it from the pacea-data repository).

dim(pdata)
# [1] 40580   325

# the data have 325 columns so we'll just preview the first 5 columns
head(pdata[, 1:5])
# Simple feature collection with 6 features and 5 fields
# Geometry type: POLYGON
# Dimension:     XY
# Bounding box:  xmin: 575612.5 ymin: 1101653 xmax: 587612.5 ymax: 1103653
# Projected CRS: NAD83 / BC Albers
# # A tibble: 6 × 6
#   `1993_1` `1993_2` `1993_3` `1993_4` `1993_5`                          geometry
#      <dbl>    <dbl>    <dbl>    <dbl>    <dbl>                     <POLYGON [m]>
# 1     5.28     5.35     5.56     6.78     9.34 ((575612.5 1103653, 577612.5 110…
# 2     5.26     5.32     5.54     6.76     9.34 ((577612.5 1103653, 579612.5 110…
# 3     5.31     5.32     5.54     6.74     9.29 ((579612.5 1103653, 581612.5 110…
# 4     5.40     5.35     5.56     6.75     9.28 ((581612.5 1103653, 583612.5 110…
# 5     5.40     5.32     5.55     6.78     9.33 ((583612.5 1103653, 585612.5 110…
# 6     5.44     5.35     5.59     6.81     9.34 ((585612.5 1103653, 587612.5 110…

NOTE: The geometry column is still included when selecting columns from an sf object.

The data are in wide format, with each column representing a unique year-month combination. The data also have various attributes, such as the units for the data values.

# names of the various attributes
names(attributes(pdata))
# [1] "names"     "row.names" "class"     "sf_column" "agr"       "units"

# units
attributes(pdata)$units
#         vars_units 
# "Temperature (°C)"

The class of BCCM data is a pacea_st object, in addition to sf, tbl, and dataframe objects.

class(pdata)
# [1] "pacea_st"   "sf"         "tbl_df"     "tbl"        "data.frame"

Downloading all BCCM data

Users can also choose to download all variables at once using bccm_all_variables(). Then users can use the placeholder functions to call data into the environment.

NOTE: this will take several minutes to download all data layers.

# download all variables
bccm_all_variables()

# call variable data into environment
bccm_surface_temperature()

Updating BCCM data

When you download BCCM data, the most recent version of the data will be downloaded to your cache folder.

Occasionally, there will be updates to the BCCM data (e.g. adding data from more recent years). If you want to check for an update for a data file you have already downloaded to the pacea_cache directory:

pdata <- bccm_surface_temperature(update = TRUE)
# Warning: Most recent version of data already downloaded in cache folder!

If there is an update available, you will be prompted on whether to download the data and replace the older version you have in the pacea cache directory. If declined, the current version of the data will be loaded to the workspace. Also keep an eye on the NEWS file on GitHub.

BCCM visualization

Here, we plot the BCCM data using some built in package functions, as well as ggplot2. We introduce some of the available polygon simple feature layers that can be plotted with the BCCM data.

Plotting with plot()

BCCM data can be plotted using the plot() function for a quick view of the data. We used the generic plot function for BCCM class data and integrated ggplot functionality. The default settings are to plot the April, 2018. However, users can specify any time period(s) available in the data (see e.g. ?bccm_surface_temperature for details on BCCM data).

# data should already be downloaded to cache folder
pdata <- bccm_surface_temperature()

plot(pdata)

Other months and years can be selected, and month selections can be done with abbreviations, full names, or numbers.

plot(pdata, months = c("June", "September"), years = c(1995, 2010, 2019))


plot(pdata, months = c("Jan", "10"), years = c(1995, 2010, 2019))

The BC coastline and Exclusive Economic Zone (EEZ) can also be excluded.

plot(pdata, months = c("September"), years = c(2019), bc = FALSE, eez = FALSE)

The different BCCM variables available are assigned different colour palettes

# not run

# pdata_suroxy <- bccm_surface_oxygen()
# plot(pdata_suroxy)

More customizations and using ggplot

Using ggplot2 is recommended if you would like to customize the plots further especially for multipanel plots.

Since the data are in wide format, it needs to be wrangled into long format for ggplot(). There is a built in ‘pacea’ function for wrangling BCCM data to long format.

# months and years of interest
yrs <- c(1999, 2009, 2019)
mths <- c(8) # august
ym <- paste(yrs, mths, sep = "_")

# subset data based on year_month column name - reduces processing time
tdat <- pdata %>%
  dplyr::select(all_of(ym))

# convert to long format
sub_dat <- pacea_long(tdat)

# plot selected years and month
p1 <- ggplot(data = sub_dat) +
  geom_sf(aes(fill = value), col = NA) +
  facet_grid(month~year)

p1

Customize colour scale for plot and add legend label

p1 <- p1 +
  scale_fill_gradientn(colours = c("blue", "grey", "red"), name = attributes(pdata)$units)

p1

Adding BC coast layer and BC EEZ layer

p2 <- p1 +
  geom_sf(data = bc_eez, fill = NA, lty = 2) +
  geom_sf(data = bc_coast)

p2

NOTE: Again, ggplot() recognizes that the projection of bc_coast and bc_eez shapefile (WGS 84) are different than the BCCM data (NAD 83), and will reproject the layers to match the first layer.

You can customize the plot using typical ggplot syntax (see ?ggplot for more information)

p2 +
  theme_classic() +
  theme(strip.background = element_blank()) +
  ggtitle("Sea surface temperature")

Masking data with shapefile

We can select an area of interest by masking the data with another sf shapefile, such as for a fishing region using indexing syntax. Below is an example using coordinates from DFO fish area 126.

# coordinates for polygon (i.e. fishing area 126)
crds <- list(matrix(c(-127.1506, -128.2331, -129.3492, -127.9167, -127.1847, -126.8200, -127.1506,
                      49.85766, 49.00000, 48.99991, 50.11915, 50.40183, 50.24466, 49.85766),
                    ncol = 2))

# create polygon object with lat-lon WGS 84 projection (4326) and convert to sf object with BC Albers (3005) projection
a126 <- st_sfc(st_polygon(crds), crs = 4326) %>%
  st_as_sf() %>%
  st_transform(crs = 3005)

# mask data by indexing sf object with area 126 shapefile
reg.dat <- sub_dat[a126, ]

# reduction in number of data points after subsetting the region
dim(sub_dat)
# [1] 121740      4
dim(reg.dat)
# [1] 4821    4

The data went from 123,864 rows to 4,821 rows, which is a much smaller region.

# plot panel of data
reg.p1 <- ggplot(data = reg.dat) +
  geom_sf(aes(fill = value), col = NA) +
  facet_grid(month~year) +
  scale_fill_gradientn(colours = c("blue", "grey", "red"), name = attributes(pdata)$units)

reg.p1

Now to plot it with the bc_coast shapefile to see where this region is in relation to the coastline.

# add bc_coast layer and polygon boundary in red
reg.p2 <- reg.p1 +
  geom_sf(data = a126, fill = NA, col= "red") +
  geom_sf(data = bc_coast)

reg.p2

Let’s zoom to the area of interest, in this case our polygon that represents Area 126.

# identify extent of polygon with a 50 km buffer
tbbox <- st_bbox(st_buffer(a126, dist = 50000))
tbbox
#     xmin     ymin     xmax     ymax 
# 704709.7 395218.7 991522.4 649193.8

# use the bounding box to set the plotting limits
reg.p2 +
  coord_sf(xlim = c(tbbox[c(1,3)]),
           ylim = c(tbbox[c(2,4)]))

Wrangling BCCM data

Here are some examples of how to easy wrangle BCCM data that may be of interest in your analyses. We will continue to use pdata which was the bccm_surface_temperature() data we loaded to the workspace.

Obtain centroids of each cell

Currently, the BCCM data are simple feature (sf) objects of polygons (gridded square cells). The polygon boundaries are identified by coordinates for the perimeter of the cell. We can also obtain the centroids of each cell.

# First let's subset the data to reduce process time
# months and years of interest
yrs <- c(1999, 2009, 2019)
mths <- c(8) # august
ym <- paste(yrs, mths, sep = "_")

# subset data based on year_month column name - reduces processing time
tdat <- pdata %>%
  dplyr::select(all_of(ym))

# this will replace convert the geometry column from a polygon to a point, with the centroid of the gridded cell
centroid.dat <- tdat %>% st_centroid()
# Warning: st_centroid assumes attributes are constant over geometries

head(centroid.dat)
# Simple feature collection with 6 features and 3 fields
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: 576612.5 ymin: 1102653 xmax: 586612.5 ymax: 1102653
# Projected CRS: NAD83 / BC Albers
# # A tibble: 6 × 4
#   `1999_8` `2009_8` `2019_8`           geometry
#      <dbl>    <dbl>    <dbl>        <POINT [m]>
# 1     11.9     11.8     13.5 (576612.5 1102653)
# 2     12.0     11.8     13.6 (578612.5 1102653)
# 3     12.0     11.8     13.6 (580612.5 1102653)
# 4     12.0     11.8     13.5 (582612.5 1102653)
# 5     12.1     11.9     13.6 (584612.5 1102653)
# 6     12.1     11.9     13.5 (586612.5 1102653)

Notice that the geometry column are now POINT, instead of POLYGON. This is now the centre of each polygon grid cell.

Extract specific time periods

Wrangle data to select months of interest and convert into long format for easy wrangling.

# months and years of interest
yrs <- c(1999, 2009, 2019)
mths <- c(8) # august
ym <- paste(yrs, mths, sep = "_")

# subset data based on year_month column name
tdat <- pdata %>%
  dplyr::select(all_of(ym))

# Find mean value for each time period of interest - mean of columns
sub_dat <- pacea_long(tdat)

Extract data using coordinates

We can extract the spatial data using coordinates of interest. We can use the same indexing to extract gridded cells where the coordinate data fall into. This could be used to compare the modelled BCCM data with field observations.

We’ll use the location of data buoys. There is a list of buoys and locations in this pacakge, named as object: buoy_metadata.

buoy_metadata
# # A tibble: 19 × 9
#    wmo_id name    type  latitude longitude water_depth_m col_key stn_id name_key
#    <fct>  <fct>   <fct>    <dbl>     <dbl>         <dbl> <fct>   <fct>  <fct>   
#  1 46004  Middle… NOMAD     51.0     -136.          3600 #0000FF C46004 C46004 …
#  2 46036  South … NOMAD     48.4     -134.          3500 #FF0000 C46036 C46036 …
#  3 46131  Sentry… 3 me…     49.9     -125.            18 #00FF00 C46131 C46131 …
#  4 46132  South … 3 me…     49.7     -128.          2040 #000033 C46132 C46132 …
#  5 46134  ECOBUO… 3 me…     48.7     -123.            65 #000033 C46134 C46134 …
#  6 46145  Centra… 3 me…     54.4     -132.           257 #FF00B6 C46145 C46145 …
#  7 46146  Halibu… 3 me…     49.3     -124.            43 #005300 C46146 C46146 …
#  8 46147  South … 3 me…     51.8     -131.          2000 #FFD300 C46147 C46147 …
#  9 46181  Nanakw… 3 me…     53.8     -129.            22 #009FFF C46181 C46181 …
# 10 46183  North … 3 me…     53.6     -131.            60 #9A4D42 C46183 C46183 …
# 11 46184  North … NOMAD     53.9     -139.          3200 #00FFBE C46184 C46184 …
# 12 46185  South … 3 me…     52.4     -130.           228 #783FC1 C46185 C46185 …
# 13 46204  West S… 3 me…     51.4     -129.           222 #1F9698 C46204 C46204 …
# 14 46205  West D… 3 me…     54.2     -134.          2675 #FFACFD C46205 C46205 …
# 15 46206  La Per… 3 me…     48.8     -126.            73 #FE8F42 C46206 C46206 …
# 16 46207  East D… 3 me…     50.9     -130.          2215 #DD00FF C46207 C46207 …
# 17 46208  West M… 3 me…     52.5     -133.          2950 #02AD24 C46208 C46208 …
# 18 46303  S. Geo… <NA>      49.0     -123.            NA #C8FF00 C46303 C46303 …
# 19 46304  Entran… <NA>      49.3     -123.            NA #886C00 C46304 C46304 …

# Let's use the La Perouse Bank location
lat <- buoy_metadata$latitude[which(buoy_metadata$name == "La Perouse Bank")]
lon <- buoy_metadata$longitude[which(buoy_metadata$name == "La Perouse Bank")]

# create a dataframe and convert to sf object; transform to correct projection
coords_LP <- data.frame(x = lon, y = lat)
sf_LP <- st_as_sf(coords_LP, coords = c("x", "y"), crs = "EPSG: 4326") %>% st_transform(crs = "EPSG: 3005")
sf_LP
# Simple feature collection with 1 feature and 0 fields
# Geometry type: POINT
# Dimension:     XY
# Bounding box:  xmin: 1000125 ymin: 424310.3 xmax: 1000125 ymax: 424310.3
# Projected CRS: NAD83 / BC Albers
#                   geometry
# 1 POINT (1000125 424310.3)

# extract cells for which coordinates fall into and wrangle to long format for plotting
point.dat <- pdata[sf_LP,] %>%
  pacea_long()

We can plot the time series to look at seasonal trends at this location.

point.dat %>%
  mutate(year = as.factor(year)) %>%  # set year to a factor so each line is plotted separately
  ggplot() +
  geom_line(aes(x = month, y = value, col = year)) +
  scale_y_continuous(name = attributes(pdata)$units)

Any trends are difficult to see, so let’s revise the colour scheme

point.dat  %>%
  mutate(year = as.factor(year)) %>%
  ggplot() +
  geom_line(aes(x = month, y = value, col = as.factor(year))) +
  scale_color_manual(values = colorRampPalette(c("blue", "green", "yellow"))(27), name = "year") +
  scale_y_continuous(name = attributes(pdata)$units)

Still no clear visual pattern. Let’s look at the past 10 years compared to the time series median

# calculate summary statistics (median, sd, 0.05 and 0.95 probabilities)
sum.dat <- point.dat %>%
  group_by(month) %>%
  summarise(median_val = median(value, na.rm = TRUE),
            sd_val = sd(value, na.rm = TRUE),
            q05 = quantile(value, probs = 0.05, na.rm = TRUE),
            q95 = quantile(value, probs = 0.95, na.rm = TRUE))

point.dat %>%
  filter(year >= 2010) %>%
  mutate(year = as.factor(year)) %>%
  ggplot() +
  geom_line(aes(x = month, y = value, col = as.factor(year))) +
  geom_line(data = sum.dat, aes(x = month, y = median_val), col = "black", linewidth = 2) +
  scale_color_manual(values = colorRampPalette(c("blue", "green", "yellow"))(10), name = "year") +
  scale_y_continuous(name = attributes(pdata)$units)

There is a general pattern of more recent years being slightly warmer, especially in the summer.

We can also plot the historical distribution of values and select a specific year to plot. Let’s plot 2014, when “The Blob” was around in the Pacific

# plot to view seasonal trends across years
point.dat %>%
  filter(year == 2014) %>%
  ggplot() +
  geom_line(aes(x = month, y = value), col = "red") +
  geom_line(data = sum.dat, aes(x = month, y = median_val), col = "black") +
  geom_line(data = sum.dat, aes(x = month, y = q05), col = "grey", linetype = 2) +
  geom_line(data = sum.dat, aes(x = month, y = q95), col = "grey", linetype = 2)

While not the warmest during the 1993-2019 period in the summer months, it was above the median for most of the year.

Calculating climatogy and anomaly of BCCM data

The pacea package has some built in functions to calculate climatologies and anomalies of the geospatial BCCM data. These functions can be customized to return specific data. See help files for details of functions.

First, let’s calculate a climatology, which returns data in long format.

# help file for function
# ?calc_clim

# let's use surface temperature again
pdata <- bccm_surface_temperature()

# the default years for climtatology is 1991 - 2020
# Note: there may be a warning as the BCCM data only span 1993 - 2019
clim_sst <- calc_clim(pdata)
# Warning in calc_clim(pdata): Number of years for climatology only span 27
# years:1993 to 2019

# we can also select the climatology months to be returned
clim_sst_sub <- calc_clim(pdata, time_period_return = c("Apr", "9"))
# Warning in calc_clim(pdata, time_period_return = c("Apr", "9")): Number of
# years for climatology only span 27 years:1993 to 2019

head(clim_sst)
# Simple feature collection with 6 features and 4 fields
# Geometry type: POLYGON
# Dimension:     XY
# Bounding box:  xmin: 575612.5 ymin: 1101653 xmax: 577612.5 ymax: 1103653
# Projected CRS: NAD83 / BC Albers
# # A tibble: 6 × 5
#   month clim_value clim_sd clim_n                                       geometry
#   <dbl>      <dbl>   <dbl>  <int>                                  <POLYGON [m]>
# 1     1       5.80   0.721     27 ((575612.5 1103653, 577612.5 1103653, 577612.…
# 2     2       5.47   0.610     27 ((575612.5 1103653, 577612.5 1103653, 577612.…
# 3     3       5.64   0.627     27 ((575612.5 1103653, 577612.5 1103653, 577612.…
# 4     4       6.61   0.623     27 ((575612.5 1103653, 577612.5 1103653, 577612.…
# 5     5       8.49   0.732     27 ((575612.5 1103653, 577612.5 1103653, 577612.…
# 6     6      10.5    0.662     27 ((575612.5 1103653, 577612.5 1103653, 577612.…
head(clim_sst_sub)
# Simple feature collection with 6 features and 4 fields
# Geometry type: POLYGON
# Dimension:     XY
# Bounding box:  xmin: 575612.5 ymin: 1101653 xmax: 581612.5 ymax: 1103653
# Projected CRS: NAD83 / BC Albers
# # A tibble: 6 × 5
#   month clim_value clim_sd clim_n                                       geometry
#   <dbl>      <dbl>   <dbl>  <int>                                  <POLYGON [m]>
# 1     4       6.61   0.623     27 ((575612.5 1103653, 577612.5 1103653, 577612.…
# 2     9      11.6    0.596     27 ((575612.5 1103653, 577612.5 1103653, 577612.…
# 3     4       6.60   0.622     27 ((577612.5 1103653, 579612.5 1103653, 579612.…
# 4     9      11.6    0.592     27 ((577612.5 1103653, 579612.5 1103653, 579612.…
# 5     4       6.60   0.621     27 ((579612.5 1103653, 581612.5 1103653, 581612.…
# 6     9      11.6    0.590     27 ((579612.5 1103653, 581612.5 1103653, 581612.…

Next, we can calculate the anomalies of the data, relative to the climatological mean. This functions returns the data in wide format, which is the same format the input data (i.e. pdata).

# help file for function
# ?calc_anom

anom_sst <- calc_anom(pdata)
# Warning in calc_anom(pdata): Number of years for climatology only span 27
# years:1993 to 2019

head(anom_sst)
# Simple feature collection with 6 features and 324 fields
# Geometry type: POLYGON
# Dimension:     XY
# Bounding box:  xmin: 575612.5 ymin: 1101653 xmax: 587612.5 ymax: 1103653
# Projected CRS: NAD83 / BC Albers
# # A tibble: 6 × 325
#   `1993_1` `1993_2` `1993_3` `1993_4` `1993_5` `1993_6` `1993_7` `1993_8`
#      <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
# 1   -0.521   -0.122  -0.0780    0.167    0.845     1.32    0.437    0.449
# 2   -0.516   -0.128  -0.0804    0.156    0.834     1.32    0.437    0.435
# 3   -0.492   -0.147  -0.0896    0.141    0.805     1.32    0.429    0.383
# 4   -0.474   -0.171  -0.102     0.124    0.796     1.34    0.413    0.345
# 5   -0.462   -0.184  -0.0919    0.138    0.811     1.36    0.402    0.336
# 6   -0.472   -0.209  -0.0859    0.139    0.819     1.36    0.393    0.339
# # ℹ 317 more variables: `1993_9` <dbl>, `1993_10` <dbl>, `1993_11` <dbl>,
# #   `1993_12` <dbl>, `1994_1` <dbl>, `1994_2` <dbl>, `1994_3` <dbl>,
# #   `1994_4` <dbl>, `1994_5` <dbl>, `1994_6` <dbl>, `1994_7` <dbl>,
# #   `1994_8` <dbl>, `1994_9` <dbl>, `1994_10` <dbl>, `1994_11` <dbl>,
# #   `1994_12` <dbl>, `1995_1` <dbl>, `1995_2` <dbl>, `1995_3` <dbl>,
# #   `1995_4` <dbl>, `1995_5` <dbl>, `1995_6` <dbl>, `1995_7` <dbl>,
# #   `1995_8` <dbl>, `1995_9` <dbl>, `1995_10` <dbl>, `1995_11` <dbl>, …

Arguments for the anomaly function allow users to specify the climatological reference period and the months and/or years to return as anomaly data

anom_sst_sub <- calc_anom(pdata, clim_years = c(1993:2019),
                          time_period_return = c("Apr", "9"),
                          years_return = c(2010, 2019))

head(anom_sst_sub)
# Simple feature collection with 6 features and 4 fields
# Geometry type: POLYGON
# Dimension:     XY
# Bounding box:  xmin: 575612.5 ymin: 1101653 xmax: 587612.5 ymax: 1103653
# Projected CRS: NAD83 / BC Albers
# # A tibble: 6 × 5
#   `2010_4` `2010_9` `2019_4` `2019_9`                                   geometry
#      <dbl>    <dbl>    <dbl>    <dbl>                              <POLYGON [m]>
# 1    0.114   -0.821    0.726     1.27 ((575612.5 1103653, 577612.5 1103653, 577…
# 2    0.124   -0.808    0.734     1.28 ((577612.5 1103653, 579612.5 1103653, 579…
# 3    0.150   -0.792    0.727     1.30 ((579612.5 1103653, 581612.5 1103653, 581…
# 4    0.204   -0.795    0.699     1.28 ((581612.5 1103653, 583612.5 1103653, 583…
# 5    0.204   -0.785    0.699     1.28 ((583612.5 1103653, 585612.5 1103653, 585…
# 6    0.244   -0.807    0.676     1.26 ((585612.5 1103653, 587612.5 1103653, 587…

Anomaly data can also be plotted using built in generic functions in pacea.

# plotting the subsetted data will be quicker
plot(anom_sst_sub)

The plotting default is for the most recent time period in the data available. We can also specify the months and years to plot as a facet plot.

plot(anom_sst_sub,
     months.plot = c("Apr", "Sep"),
     years.plot = c(2010, 2019))

We have also incorporated the ability to add climatological data to the plot, which creates contour lines showing the anomaly areas that are above (or below for other variables) the 90th and 99th percentile of data.

# there may be warnings that arise from the interpolation of the contour lines
plot(anom_sst_sub,
     months.plot = c("Apr", "Sep"),
     years.plot = c(2010, 2019),
     clim.dat = clim_sst_sub)

Estimate mean (median, sd, etc.) across entire spatial area

Mean of an entire area can also be estimated using the area_mean() function from ‘pacea’. The data entered must be an ‘sf’ object and contain only one column of values - this column should be a single spatial layer (i.e. one time period) with the spatial cells of interest (e.g. area 126).

The function will provide the summary statistics (mean, median, sd, n) across the entire area, as well as for inshore areas to the 200m isobath and offshore areas from the 200m isobath (i.e. continental shelf).

# use pdata from above and subset only one column
pdata_areamean <- pdata[,1]

# area mean
area_mean(pdata_areamean)
# $`All area`
#       mean   median        sd     n
# 1 7.161521 7.170706 0.8219214 40580
# 
# $`Inshore area`
#       mean  median        sd     n
# 1 6.705519 6.82011 0.7631445 20064
# 
# $`Offshore area`
#       mean   median        sd     n
# 1 7.607477 7.583088 0.6036946 20516

To see where the 200m isobath line, it can be plotted along with the BC coast and EEZ shapefile.

ggplot() +
  geom_sf(data = bc_eez, linetype = 2, fill = NA) +
  geom_sf(data = bc_coast) +
  geom_sf(data = isobath_200m, col = "red")

References

Peña, M.A., Fine, I. and Callendar, W. 2019. Interannual variability in primary production and shelf-offshore transport of nutrients along the northeast Pacific Ocean margin. Deep-Sea Research II, doi:10.1016/j.dsr2.2019.104637.