The optimal interpolation sea surface temperature were obtained from NOAA’s (National Oceanic and Atmospheric Administration) ERDDAP data server website: ‘https://coastwatch.pfeg.noaa.gov/erddap/files/’ (Huang et al., 2021). The quality-checked data ranges from 1981/09/01 to about 2 weeks before present - more recent data (only 2-day lag) is available but is only preliminary and has yet to be checked for quality.
The details of the data as described by NOAA: “The NOAA 1/4° Daily Optimum Interpolation Sea Surface Temperature (OISST) is a long term Climate Data Record that incorporates observations from different platforms (satellites, ships, buoys and Argo floats) into a regular global grid. The dataset is interpolated to fill gaps on the grid and create a spatially complete map of sea surface temperature. Satellite and ship observations are referenced to buoys to compensate for platform differences and sensor biases.”
More details on the OISST data can be found here: ‘https://www.ncei.noaa.gov/products/optimum-interpolation-sst’.
paceaThe OISST data were downloaded using the rerddap package, then wrangled similarly to what is found on this GitHub repository: ‘https://github.com/IOS-OSD-DPG/Pacific_SST_Monitoring#oisst’.
This GitHub repository, by Andrea Hilborn, Charles Hannah and Lu Guan, is automatically updated approximately every week, which includes up-to-date 7-day SST means, anomalies, and recent marine heatwave status - among other data visualization products. Visit their GitHub page for a glance at recent conditions. We have adapted the code by Andrea Hilborn (found in their GitHub repository) to include the data in pacea.
We wrangled the SST data to produce 2 simple feature sf products: 7-day mean, and monthly mean. The data are kept in the original spatial 1/4° longitude x 1/4° latitude grid and WGS 84 projection, and masked to show only the BC Exclusive Economic Zone area. The coordinates in the geometry column of the sf objects are the centroid of each grid cell.
See ?oisst_7day for the name of OISST data products and more information in the R help file.
There are plans to update the data monthly, therefore, users will have access to data up to the previous month from the current month. Let’s explore the OISST data in ‘pacea’.
library(pacea)
library(dplyr)
library(sf)
library(ggplot2)
The OISST data are saved in pacea. In addition to the mean temperature value (weekly or monthly mean) for each grid cell, there are also other statistical measures, such as the standard deviation and number of observations. There is also a start- and end-date column that indicates the temporal period for which the mean is calculated from.
oisst_7day
## Simple feature collection with 2063675 features and 7 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -138.625 ymin: 46.625 xmax: -123.125 ymax: 54.625
## Geodetic CRS: WGS 84
## # A tibble: 2,063,675 × 8
## year week sst sst_sd sst_n start_date end_date
## * <dbl> <dbl> <dbl> <dbl> <int> <date> <date>
## 1 1981 35 15.8 0.0141 2 1981-09-01 1981-09-02
## 2 1981 36 17.3 0.571 7 1981-09-03 1981-09-09
## 3 1981 37 16.9 0.434 7 1981-09-10 1981-09-16
## 4 1981 38 16.2 0.305 7 1981-09-17 1981-09-23
## 5 1981 39 15.4 0.201 7 1981-09-24 1981-09-30
## 6 1981 40 15.1 0.286 7 1981-10-01 1981-10-07
## 7 1981 41 14.5 0.409 7 1981-10-08 1981-10-14
## 8 1981 42 14.4 0.755 7 1981-10-15 1981-10-21
## 9 1981 43 13.2 0.758 7 1981-10-22 1981-10-28
## 10 1981 44 14.0 0.221 7 1981-10-29 1981-11-04
## # ℹ 2,063,665 more rows
## # ℹ 1 more variable: geometry <POINT [°]>
oisst_month
## Simple feature collection with 467125 features and 7 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -138.625 ymin: 46.625 xmax: -123.125 ymax: 54.625
## Geodetic CRS: WGS 84
## # A tibble: 467,125 × 8
## year month sst sst_sd sst_n start_date end_date
## * <dbl> <dbl> <dbl> <dbl> <int> <date> <date>
## 1 1981 9 16.4 0.830 30 1981-09-01 1981-09-30
## 2 1981 10 14.3 0.838 31 1981-10-01 1981-10-31
## 3 1981 11 12.7 1.38 30 1981-11-01 1981-11-30
## 4 1981 12 10.2 0.549 31 1981-12-01 1981-12-31
## 5 1981 9 16.4 1.05 30 1981-09-01 1981-09-30
## 6 1981 10 14.2 0.697 31 1981-10-01 1981-10-31
## 7 1981 11 12.4 0.851 30 1981-11-01 1981-11-30
## 8 1981 12 10.0 0.837 31 1981-12-01 1981-12-31
## 9 1981 9 16.4 0.941 30 1981-09-01 1981-09-30
## 10 1981 10 14.3 0.694 31 1981-10-01 1981-10-31
## # ℹ 467,115 more rows
## # ℹ 1 more variable: geometry <POINT [°]>
We have built a generic plot function for OISST data, which uses ggplot2 functions to create multipanel geospatial figures. Let’s use the monthly OISST data.
# the default is to provide the current or most recent data available
plot(oisst_month)
We can select multiple months and/or years to plot.
plot(oisst_month, months.plot = c("Apr", "Sep"), years.plot = c(1982, 2002, 2022))
This can also be done for the weekly data, where you can specify the specific week(s) to plot.
plot(oisst_7day, weeks.plot = c(20, 35), years.plot = c(1983, 2003, 2020))
Knowing which week of interest may be difficult. Therefore, you can also convert a Date object into the week interested.
# a random year with the date we are interested in to convert to the week of the year
my.date <- as.Date("2015-09-04")
lubridate::week(my.date)
## [1] 36
plot(oisst_7day, weeks.plot = lubridate::week(my.date), years.plot = c(1983, 2003, 2020))
(adated from example vignette with BCCM data)
We can select an area of interest by masking the data with another sf shapefile, such as for a fishing region using indexing syntax. Below is an example using coordinates from DFO fish area 126.
# coordinates for polygon (i.e. fishing area 126)
crds <- list(matrix(c(-127.1506, -128.2331, -129.3492, -127.9167, -127.1847, -126.8200, -127.1506,
49.85766, 49.00000, 48.99991, 50.11915, 50.40183, 50.24466, 49.85766),
ncol = 2))
# create polygon object with lat-lon WGS 84 projection (4326)
a126 <- st_sfc(st_polygon(crds), crs = 4326) %>%
st_as_sf()
# mask OISST data by indexing sf object with area 126 shapefile
reg.dat <- oisst_month[a126, ]
# reduction in number of data points after subsetting the region
dim(oisst_month)
## [1] 467125 8
dim(reg.dat)
## [1] 10100 8
To plot these subsetted data, we must reassign the class to the object for the pacea generic plot function to operate.
class(reg.dat) <- class(oisst_month)
plot(reg.dat, months.plot = c("Apr", "Sep"), years.plot = c(1982, 2002, 2022))
Otherwise we can build our own ggplot.
reg.dat %>%
bind_cols(st_coordinates(reg.dat)) %>% # get coordinates out for plotting with geom_tile
filter(year %in% c(1982, 2002, 2022), # filter out months and years
month %in% c(4, 9)) %>%
ggplot() +
geom_tile(aes(x = X, y= Y, fill = sst)) +
scale_fill_gradientn(colours = c("blue", "grey", "red"), name = attributes(oisst_month)$units) +
facet_grid(month~year) +
geom_sf(data = bc_coast)
(adated from example vignette with BCCM data)
We can extract the spatial data using coordinates of interest. However, because the OISST are coordinates, we must use the st_nearest_point() function to find the data which are closest to our coordinates of interest.
We’ll use the location of data buoys. There is a list of buoys and locations in this pacakge, named as object: buoy_metadata.
buoy_metadata
## # A tibble: 19 × 9
## wmo_id name type latitude longitude water_depth_m col_key stn_id name_key
## <fct> <fct> <fct> <dbl> <dbl> <dbl> <fct> <fct> <fct>
## 1 46004 Middle… NOMAD 51.0 -136. 3600 #0000FF C46004 C46004 …
## 2 46036 South … NOMAD 48.4 -134. 3500 #FF0000 C46036 C46036 …
## 3 46131 Sentry… 3 me… 49.9 -125. 18 #00FF00 C46131 C46131 …
## 4 46132 South … 3 me… 49.7 -128. 2040 #000033 C46132 C46132 …
## 5 46134 ECOBUO… 3 me… 48.7 -123. 65 #000033 C46134 C46134 …
## 6 46145 Centra… 3 me… 54.4 -132. 257 #FF00B6 C46145 C46145 …
## 7 46146 Halibu… 3 me… 49.3 -124. 43 #005300 C46146 C46146 …
## 8 46147 South … 3 me… 51.8 -131. 2000 #FFD300 C46147 C46147 …
## 9 46181 Nanakw… 3 me… 53.8 -129. 22 #009FFF C46181 C46181 …
## 10 46183 North … 3 me… 53.6 -131. 60 #9A4D42 C46183 C46183 …
## 11 46184 North … NOMAD 53.9 -139. 3200 #00FFBE C46184 C46184 …
## 12 46185 South … 3 me… 52.4 -130. 228 #783FC1 C46185 C46185 …
## 13 46204 West S… 3 me… 51.4 -129. 222 #1F9698 C46204 C46204 …
## 14 46205 West D… 3 me… 54.2 -134. 2675 #FFACFD C46205 C46205 …
## 15 46206 La Per… 3 me… 48.8 -126. 73 #FE8F42 C46206 C46206 …
## 16 46207 East D… 3 me… 50.9 -130. 2215 #DD00FF C46207 C46207 …
## 17 46208 West M… 3 me… 52.5 -133. 2950 #02AD24 C46208 C46208 …
## 18 46303 S. Geo… <NA> 49.0 -123. NA #C8FF00 C46303 C46303 …
## 19 46304 Entran… <NA> 49.3 -123. NA #886C00 C46304 C46304 …
# Let's use the La Perouse Bank location
lat <- buoy_metadata$latitude[which(buoy_metadata$name == "La Perouse Bank")]
lon <- buoy_metadata$longitude[which(buoy_metadata$name == "La Perouse Bank")]
# create a dataframe and convert to sf object
coords_LP <- data.frame(x = lon, y = lat)
sf_LP <- st_as_sf(coords_LP, coords = c("x", "y"), crs = "EPSG: 4326")
sf_LP
## Simple feature collection with 1 feature and 0 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -125.9983 ymin: 48.835 xmax: -125.9983 ymax: 48.835
## Geodetic CRS: WGS 84
## geometry
## 1 POINT (-125.9983 48.835)
st <- Sys.time()
# distance from buoy to each coordinate
dist <- st_distance(oisst_month, sf_LP)
# subset data for points that are closest to buoy
sub.dat <- oisst_month[which(dist == min(dist)),]
Now we can plot the time series for the data we’ve extracted
sub.dat %>%
mutate(year = as.factor(year)) %>% # set year to a factor so each line is plotted separately
ggplot() +
geom_line(aes(x = month, y = sst, col = year)) +
scale_y_continuous(name = attributes(oisst_month)$units)
To see where we are this year, let’s plot just this year’s data, with the historical distribution.
# calculate summary statistics (median, sd, 0.05 and 0.95 probabilities)
sum.dat <- sub.dat %>%
group_by(month) %>%
summarise(median_val = median(sst, na.rm = TRUE),
sd_val = sd(sst, na.rm = TRUE),
q05 = quantile(sst, probs = 0.05, na.rm = TRUE),
q95 = quantile(sst, probs = 0.95, na.rm = TRUE))
sub.dat %>%
filter(year == 2023) %>%
mutate(year = as.factor(year)) %>%
ggplot() +
geom_ribbon(data = sum.dat, aes(x = month, ymin = q05, ymax = q95), fill = "grey") +
geom_line(aes(x = month, y = sst), col = "red") +
geom_line(data = sum.dat, aes(x = month, y = median_val), col = "black", linewidth = 1) +
scale_y_continuous(name = attributes(oisst_month)$units)
The pacea package has some built in functions to calculate climatologies and anomalies of the OISST data. These functions can be customized to return specific data. See help files for details of functions.
First, let’s calculate a climatology using the weekly ‘oisst_7day’ data.
# help file for function
# ?calc_clim
# the default years for climtatology is 1991 - 2020
clim_sst <- calc_clim(oisst_7day)
head(clim_sst)
## Simple feature collection with 6 features and 4 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -129.625 ymin: 46.625 xmax: -128.625 ymax: 46.875
## Geodetic CRS: WGS 84
## # A tibble: 6 × 5
## week clim_value clim_sd clim_n geometry
## <dbl> <dbl> <dbl> <int> <POINT [°]>
## 1 1 9.88 0.877 30 (-129.125 46.625)
## 2 1 9.71 0.919 30 (-129.625 46.875)
## 3 1 9.73 0.912 30 (-129.375 46.875)
## 4 1 9.73 0.885 30 (-129.125 46.875)
## 5 1 9.71 0.843 30 (-128.875 46.875)
## 6 1 9.68 0.790 30 (-128.625 46.875)
# we can also select the climatology weeks to be returned: either a numeric value or convert a date to the week of the year
clim_sst_sub <- calc_clim(oisst_7day, time_period_return = c(15, lubridate::week(as.Date("2010-08-01"))))
head(clim_sst_sub)
## Simple feature collection with 6 features and 4 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -129.625 ymin: 46.625 xmax: -128.625 ymax: 46.875
## Geodetic CRS: WGS 84
## # A tibble: 6 × 5
## week clim_value clim_sd clim_n geometry
## <dbl> <dbl> <dbl> <int> <POINT [°]>
## 1 15 9.08 0.914 30 (-129.125 46.625)
## 2 15 8.94 0.895 30 (-129.625 46.875)
## 3 15 8.96 0.913 30 (-129.375 46.875)
## 4 15 8.99 0.910 30 (-129.125 46.875)
## 5 15 9.03 0.919 30 (-128.875 46.875)
## 6 15 9.05 0.919 30 (-128.625 46.875)
Next, we can calculate the anomalies of the data, relative to the climatological mean.
# help file for function
# ?calc_anom
anom_sst <- calc_anom(oisst_7day)
head(anom_sst)
## Simple feature collection with 6 features and 7 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -129.125 ymin: 46.625 xmax: -129.125 ymax: 46.625
## Geodetic CRS: WGS 84
## # A tibble: 6 × 8
## year week sst start_date end_date clim_value anom
## <dbl> <dbl> <dbl> <date> <date> <dbl> <dbl>
## 1 1981 35 15.8 1981-09-01 1981-09-02 16.9 -1.01
## 2 1981 36 17.3 1981-09-03 1981-09-09 16.9 0.416
## 3 1981 37 16.9 1981-09-10 1981-09-16 17.0 -0.0545
## 4 1981 38 16.2 1981-09-17 1981-09-23 16.7 -0.576
## 5 1981 39 15.4 1981-09-24 1981-09-30 16.4 -1.06
## 6 1981 40 15.1 1981-10-01 1981-10-07 16.1 -1.05
## # ℹ 1 more variable: geometry <POINT [°]>
Arguments for the anomaly function allow users to specify the climatological reference period and the months/weeks and/or years to return as anomaly data
anom_sst_sub <- calc_anom(oisst_7day, time_period_return = c(15, lubridate::week(as.Date("2010-08-01"))), years_return = c(2010, 2019))
head(anom_sst_sub)
## Simple feature collection with 6 features and 7 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -129.625 ymin: 46.625 xmax: -129.125 ymax: 46.875
## Geodetic CRS: WGS 84
## # A tibble: 6 × 8
## year week sst start_date end_date clim_value anom
## <dbl> <dbl> <dbl> <date> <date> <dbl> <dbl>
## 1 2010 15 8.43 2010-04-09 2010-04-15 9.08 -0.651
## 2 2010 31 14.1 2010-07-30 2010-08-05 15.8 -1.69
## 3 2010 15 8.20 2010-04-09 2010-04-15 8.94 -0.736
## 4 2010 31 14.0 2010-07-30 2010-08-05 15.6 -1.61
## 5 2010 15 8.27 2010-04-09 2010-04-15 8.96 -0.690
## 6 2010 31 14.1 2010-07-30 2010-08-05 15.7 -1.58
## # ℹ 1 more variable: geometry <POINT [°]>
Anomaly data can also be plotted using built in generic functions in pacea.
# plotting the subsetted data will be quicker
plot(anom_sst_sub)
The plotting default is for the most recent time period in the data available. We can also specify the weeks and years to plot as a facet plot. If using the ‘oisst_month’ data, use the months.plot argument.
plot(anom_sst_sub, weeks.plot = c(15, lubridate::week(as.Date("2010-08-01"))), years.plot = c(2010, 2019))
We have also incorporated the ability to add climatological data to the plot, which creates contour lines showing the anomaly areas that are above (or below for other variables) the 90th and 99th percentile of data (99th percentile contours may not appear if data values do not exceed that value).
# there may be warnings that arise from the interpolation of the contour lines
plot(anom_sst_sub, weeks.plot = c(15, lubridate::week(as.Date("2010-08-01"))), years.plot = c(2010, 2019), clim.dat = clim_sst_sub)
Huang, B., Liu, C., Banzon, V., Freeman, E., Graham, G., Hankins, B., Smith, T., Zhang, H. M., 2021. Improvements of the daily optimum interpolation Sea Surface temperature (DOISST) version 2.1. J. Clim. 34 (8), 2923–2939.