Optimal Interoplation Sea Surface Temperature (OISST)

The optimal interpolation sea surface temperature were obtained from NOAA’s (National Oceanic and Atmospheric Administration) ERDDAP data server website: ‘https://coastwatch.pfeg.noaa.gov/erddap/files/’ (Huang et al., 2021). The quality-checked data ranges from 1981/09/01 to about 2 weeks before present - more recent data (only 2-day lag) is available but is only preliminary and has yet to be checked for quality.

The details of the data as described by NOAA: “The NOAA 1/4° Daily Optimum Interpolation Sea Surface Temperature (OISST) is a long term Climate Data Record that incorporates observations from different platforms (satellites, ships, buoys and Argo floats) into a regular global grid. The dataset is interpolated to fill gaps on the grid and create a spatially complete map of sea surface temperature. Satellite and ship observations are referenced to buoys to compensate for platform differences and sensor biases.”

More details on the OISST data can be found here: ‘https://www.ncei.noaa.gov/products/optimum-interpolation-sst’.

OISST data in pacea

The OISST data were downloaded using the rerddap package, then wrangled similarly to what is found on this GitHub repository: ‘https://github.com/IOS-OSD-DPG/Pacific_SST_Monitoring#oisst’.

This GitHub repository, by Andrea Hilborn, Charles Hannah and Lu Guan, is automatically updated approximately every week, which includes up-to-date 7-day SST means, anomalies, and recent marine heatwave status - among other data visualization products. Visit their GitHub page for a glance at recent conditions. We have adapted the code by Andrea Hilborn (found in their GitHub repository) to include the data in pacea.

We wrangled the SST data to produce 2 simple feature sf products: 7-day mean, and monthly mean. The data are kept in the original spatial 1/4° longitude x 1/4° latitude grid and WGS 84 projection, and masked to show only the BC Exclusive Economic Zone area. The coordinates in the geometry column of the sf objects are the centroid of each grid cell.

See ?oisst_7day for the name of OISST data products and more information in the R help file.

There are plans to update the data monthly, therefore, users will have access to data up to the previous month from the current month. Let’s explore the OISST data in ‘pacea’.

library(pacea)
library(dplyr)
library(sf)
library(ggplot2)

The OISST data are saved in pacea. In addition to the mean temperature value (weekly or monthly mean) for each grid cell, there are also other statistical measures, such as the standard deviation and number of observations. There is also a start- and end-date column that indicates the temporal period for which the mean is calculated from.

oisst_7day
## Simple feature collection with 2063675 features and 7 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -138.625 ymin: 46.625 xmax: -123.125 ymax: 54.625
## Geodetic CRS:  WGS 84
## # A tibble: 2,063,675 × 8
##     year  week   sst sst_sd sst_n start_date end_date  
##  * <dbl> <dbl> <dbl>  <dbl> <int> <date>     <date>    
##  1  1981    35  15.8 0.0141     2 1981-09-01 1981-09-02
##  2  1981    36  17.3 0.571      7 1981-09-03 1981-09-09
##  3  1981    37  16.9 0.434      7 1981-09-10 1981-09-16
##  4  1981    38  16.2 0.305      7 1981-09-17 1981-09-23
##  5  1981    39  15.4 0.201      7 1981-09-24 1981-09-30
##  6  1981    40  15.1 0.286      7 1981-10-01 1981-10-07
##  7  1981    41  14.5 0.409      7 1981-10-08 1981-10-14
##  8  1981    42  14.4 0.755      7 1981-10-15 1981-10-21
##  9  1981    43  13.2 0.758      7 1981-10-22 1981-10-28
## 10  1981    44  14.0 0.221      7 1981-10-29 1981-11-04
## # ℹ 2,063,665 more rows
## # ℹ 1 more variable: geometry <POINT [°]>
oisst_month
## Simple feature collection with 467125 features and 7 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -138.625 ymin: 46.625 xmax: -123.125 ymax: 54.625
## Geodetic CRS:  WGS 84
## # A tibble: 467,125 × 8
##     year month   sst sst_sd sst_n start_date end_date  
##  * <dbl> <dbl> <dbl>  <dbl> <int> <date>     <date>    
##  1  1981     9  16.4  0.830    30 1981-09-01 1981-09-30
##  2  1981    10  14.3  0.838    31 1981-10-01 1981-10-31
##  3  1981    11  12.7  1.38     30 1981-11-01 1981-11-30
##  4  1981    12  10.2  0.549    31 1981-12-01 1981-12-31
##  5  1981     9  16.4  1.05     30 1981-09-01 1981-09-30
##  6  1981    10  14.2  0.697    31 1981-10-01 1981-10-31
##  7  1981    11  12.4  0.851    30 1981-11-01 1981-11-30
##  8  1981    12  10.0  0.837    31 1981-12-01 1981-12-31
##  9  1981     9  16.4  0.941    30 1981-09-01 1981-09-30
## 10  1981    10  14.3  0.694    31 1981-10-01 1981-10-31
## # ℹ 467,115 more rows
## # ℹ 1 more variable: geometry <POINT [°]>

Plotting OISST data

We have built a generic plot function for OISST data, which uses ggplot2 functions to create multipanel geospatial figures. Let’s use the monthly OISST data.

# the default is to provide the current or most recent data available
plot(oisst_month)

We can select multiple months and/or years to plot.

plot(oisst_month, months.plot = c("Apr", "Sep"), years.plot = c(1982, 2002, 2022))

This can also be done for the weekly data, where you can specify the specific week(s) to plot.

plot(oisst_7day, weeks.plot = c(20, 35), years.plot = c(1983, 2003, 2020))

Knowing which week of interest may be difficult. Therefore, you can also convert a Date object into the week interested.

# a random year with the date we are interested in to convert to the week of the year
my.date <- as.Date("2015-09-04")
lubridate::week(my.date)
## [1] 36
plot(oisst_7day, weeks.plot = lubridate::week(my.date), years.plot = c(1983, 2003, 2020))

Wrangling OISST data

Masking data with shapefile

(adated from example vignette with BCCM data)

We can select an area of interest by masking the data with another sf shapefile, such as for a fishing region using indexing syntax. Below is an example using coordinates from DFO fish area 126.

# coordinates for polygon (i.e. fishing area 126)
crds <- list(matrix(c(-127.1506, -128.2331, -129.3492, -127.9167, -127.1847, -126.8200, -127.1506,
                      49.85766, 49.00000, 48.99991, 50.11915, 50.40183, 50.24466, 49.85766),
                    ncol = 2))

# create polygon object with lat-lon WGS 84 projection (4326)
a126 <- st_sfc(st_polygon(crds), crs = 4326) %>%
  st_as_sf()

# mask OISST data by indexing sf object with area 126 shapefile
reg.dat <- oisst_month[a126, ]

# reduction in number of data points after subsetting the region
dim(oisst_month)
## [1] 467125      8
dim(reg.dat)
## [1] 10100     8

To plot these subsetted data, we must reassign the class to the object for the pacea generic plot function to operate.

class(reg.dat) <- class(oisst_month)
plot(reg.dat, months.plot = c("Apr", "Sep"), years.plot = c(1982, 2002, 2022))

Otherwise we can build our own ggplot.

reg.dat %>%
  bind_cols(st_coordinates(reg.dat)) %>% # get coordinates out for plotting with geom_tile
  filter(year %in% c(1982, 2002, 2022),  # filter out months and years
         month %in% c(4, 9)) %>%
  ggplot() +
  geom_tile(aes(x = X, y= Y, fill = sst)) +
  scale_fill_gradientn(colours = c("blue", "grey", "red"), name = attributes(oisst_month)$units) +
  facet_grid(month~year) +
  geom_sf(data = bc_coast)

Extract data using coordinates

(adated from example vignette with BCCM data)

We can extract the spatial data using coordinates of interest. However, because the OISST are coordinates, we must use the st_nearest_point() function to find the data which are closest to our coordinates of interest.

We’ll use the location of data buoys. There is a list of buoys and locations in this pacakge, named as object: buoy_metadata.

buoy_metadata
## # A tibble: 19 × 9
##    wmo_id name    type  latitude longitude water_depth_m col_key stn_id name_key
##    <fct>  <fct>   <fct>    <dbl>     <dbl>         <dbl> <fct>   <fct>  <fct>   
##  1 46004  Middle… NOMAD     51.0     -136.          3600 #0000FF C46004 C46004 …
##  2 46036  South … NOMAD     48.4     -134.          3500 #FF0000 C46036 C46036 …
##  3 46131  Sentry… 3 me…     49.9     -125.            18 #00FF00 C46131 C46131 …
##  4 46132  South … 3 me…     49.7     -128.          2040 #000033 C46132 C46132 …
##  5 46134  ECOBUO… 3 me…     48.7     -123.            65 #000033 C46134 C46134 …
##  6 46145  Centra… 3 me…     54.4     -132.           257 #FF00B6 C46145 C46145 …
##  7 46146  Halibu… 3 me…     49.3     -124.            43 #005300 C46146 C46146 …
##  8 46147  South … 3 me…     51.8     -131.          2000 #FFD300 C46147 C46147 …
##  9 46181  Nanakw… 3 me…     53.8     -129.            22 #009FFF C46181 C46181 …
## 10 46183  North … 3 me…     53.6     -131.            60 #9A4D42 C46183 C46183 …
## 11 46184  North … NOMAD     53.9     -139.          3200 #00FFBE C46184 C46184 …
## 12 46185  South … 3 me…     52.4     -130.           228 #783FC1 C46185 C46185 …
## 13 46204  West S… 3 me…     51.4     -129.           222 #1F9698 C46204 C46204 …
## 14 46205  West D… 3 me…     54.2     -134.          2675 #FFACFD C46205 C46205 …
## 15 46206  La Per… 3 me…     48.8     -126.            73 #FE8F42 C46206 C46206 …
## 16 46207  East D… 3 me…     50.9     -130.          2215 #DD00FF C46207 C46207 …
## 17 46208  West M… 3 me…     52.5     -133.          2950 #02AD24 C46208 C46208 …
## 18 46303  S. Geo… <NA>      49.0     -123.            NA #C8FF00 C46303 C46303 …
## 19 46304  Entran… <NA>      49.3     -123.            NA #886C00 C46304 C46304 …
# Let's use the La Perouse Bank location
lat <- buoy_metadata$latitude[which(buoy_metadata$name == "La Perouse Bank")]
lon <- buoy_metadata$longitude[which(buoy_metadata$name == "La Perouse Bank")]

# create a dataframe and convert to sf object
coords_LP <- data.frame(x = lon, y = lat)
sf_LP <- st_as_sf(coords_LP, coords = c("x", "y"), crs = "EPSG: 4326")
sf_LP
## Simple feature collection with 1 feature and 0 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -125.9983 ymin: 48.835 xmax: -125.9983 ymax: 48.835
## Geodetic CRS:  WGS 84
##                   geometry
## 1 POINT (-125.9983 48.835)
st <- Sys.time()

# distance from buoy to each coordinate
dist <- st_distance(oisst_month, sf_LP)

# subset data for points that are closest to buoy
sub.dat <- oisst_month[which(dist == min(dist)),]

Now we can plot the time series for the data we’ve extracted

sub.dat %>%
  mutate(year = as.factor(year)) %>%  # set year to a factor so each line is plotted separately
  ggplot() +
  geom_line(aes(x = month, y = sst, col = year)) +
  scale_y_continuous(name = attributes(oisst_month)$units)

To see where we are this year, let’s plot just this year’s data, with the historical distribution.

# calculate summary statistics (median, sd, 0.05 and 0.95 probabilities)
sum.dat <- sub.dat %>%
  group_by(month) %>%
  summarise(median_val = median(sst, na.rm = TRUE),
            sd_val = sd(sst, na.rm = TRUE),
            q05 = quantile(sst, probs = 0.05, na.rm = TRUE),
            q95 = quantile(sst, probs = 0.95, na.rm = TRUE))

sub.dat %>%
  filter(year == 2023) %>%
  mutate(year = as.factor(year)) %>%
  ggplot() +
  geom_ribbon(data = sum.dat, aes(x = month, ymin = q05, ymax = q95), fill = "grey") +
  geom_line(aes(x = month, y = sst), col = "red") +
  geom_line(data = sum.dat, aes(x = month, y = median_val), col = "black", linewidth = 1) +
  scale_y_continuous(name = attributes(oisst_month)$units)

Climatology and anomalies

The pacea package has some built in functions to calculate climatologies and anomalies of the OISST data. These functions can be customized to return specific data. See help files for details of functions.

First, let’s calculate a climatology using the weekly ‘oisst_7day’ data.

# help file for function
# ?calc_clim

# the default years for climtatology is 1991 - 2020
clim_sst <- calc_clim(oisst_7day)

head(clim_sst)
## Simple feature collection with 6 features and 4 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -129.625 ymin: 46.625 xmax: -128.625 ymax: 46.875
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 5
##    week clim_value clim_sd clim_n          geometry
##   <dbl>      <dbl>   <dbl>  <int>       <POINT [°]>
## 1     1       9.88   0.877     30 (-129.125 46.625)
## 2     1       9.71   0.919     30 (-129.625 46.875)
## 3     1       9.73   0.912     30 (-129.375 46.875)
## 4     1       9.73   0.885     30 (-129.125 46.875)
## 5     1       9.71   0.843     30 (-128.875 46.875)
## 6     1       9.68   0.790     30 (-128.625 46.875)
# we can also select the climatology weeks to be returned: either a numeric value or convert a date to the week of the year
clim_sst_sub <- calc_clim(oisst_7day, time_period_return = c(15, lubridate::week(as.Date("2010-08-01"))))

head(clim_sst_sub)
## Simple feature collection with 6 features and 4 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -129.625 ymin: 46.625 xmax: -128.625 ymax: 46.875
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 5
##    week clim_value clim_sd clim_n          geometry
##   <dbl>      <dbl>   <dbl>  <int>       <POINT [°]>
## 1    15       9.08   0.914     30 (-129.125 46.625)
## 2    15       8.94   0.895     30 (-129.625 46.875)
## 3    15       8.96   0.913     30 (-129.375 46.875)
## 4    15       8.99   0.910     30 (-129.125 46.875)
## 5    15       9.03   0.919     30 (-128.875 46.875)
## 6    15       9.05   0.919     30 (-128.625 46.875)

Next, we can calculate the anomalies of the data, relative to the climatological mean.

# help file for function
# ?calc_anom

anom_sst <- calc_anom(oisst_7day)

head(anom_sst)
## Simple feature collection with 6 features and 7 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -129.125 ymin: 46.625 xmax: -129.125 ymax: 46.625
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 8
##    year  week   sst start_date end_date   clim_value    anom
##   <dbl> <dbl> <dbl> <date>     <date>          <dbl>   <dbl>
## 1  1981    35  15.8 1981-09-01 1981-09-02       16.9 -1.01  
## 2  1981    36  17.3 1981-09-03 1981-09-09       16.9  0.416 
## 3  1981    37  16.9 1981-09-10 1981-09-16       17.0 -0.0545
## 4  1981    38  16.2 1981-09-17 1981-09-23       16.7 -0.576 
## 5  1981    39  15.4 1981-09-24 1981-09-30       16.4 -1.06  
## 6  1981    40  15.1 1981-10-01 1981-10-07       16.1 -1.05  
## # ℹ 1 more variable: geometry <POINT [°]>

Arguments for the anomaly function allow users to specify the climatological reference period and the months/weeks and/or years to return as anomaly data

anom_sst_sub <- calc_anom(oisst_7day, time_period_return = c(15, lubridate::week(as.Date("2010-08-01"))), years_return = c(2010, 2019))

head(anom_sst_sub)
## Simple feature collection with 6 features and 7 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -129.625 ymin: 46.625 xmax: -129.125 ymax: 46.875
## Geodetic CRS:  WGS 84
## # A tibble: 6 × 8
##    year  week   sst start_date end_date   clim_value   anom
##   <dbl> <dbl> <dbl> <date>     <date>          <dbl>  <dbl>
## 1  2010    15  8.43 2010-04-09 2010-04-15       9.08 -0.651
## 2  2010    31 14.1  2010-07-30 2010-08-05      15.8  -1.69 
## 3  2010    15  8.20 2010-04-09 2010-04-15       8.94 -0.736
## 4  2010    31 14.0  2010-07-30 2010-08-05      15.6  -1.61 
## 5  2010    15  8.27 2010-04-09 2010-04-15       8.96 -0.690
## 6  2010    31 14.1  2010-07-30 2010-08-05      15.7  -1.58 
## # ℹ 1 more variable: geometry <POINT [°]>

Anomaly data can also be plotted using built in generic functions in pacea.

# plotting the subsetted data will be quicker
plot(anom_sst_sub)

The plotting default is for the most recent time period in the data available. We can also specify the weeks and years to plot as a facet plot. If using the ‘oisst_month’ data, use the months.plot argument.

plot(anom_sst_sub, weeks.plot = c(15, lubridate::week(as.Date("2010-08-01"))), years.plot = c(2010, 2019))

We have also incorporated the ability to add climatological data to the plot, which creates contour lines showing the anomaly areas that are above (or below for other variables) the 90th and 99th percentile of data (99th percentile contours may not appear if data values do not exceed that value).

# there may be warnings that arise from the interpolation of the contour lines
plot(anom_sst_sub, weeks.plot = c(15, lubridate::week(as.Date("2010-08-01"))), years.plot = c(2010, 2019), clim.dat = clim_sst_sub)

References

Huang, B., Liu, C., Banzon, V., Freeman, E., Graham, G., Hankins, B., Smith, T., Zhang, H. M., 2021. Improvements of the daily optimum interpolation Sea Surface temperature (DOISST) version 2.1. J. Clim. 34 (8), 2923–2939.