These functions get all survey set or sample data for a set of species by major area, activity, or specific surveys. The main functions in this package focus on retrieving the more commonly used types of data and are often limited to sets and samples that conform to current design-based standards and survey grids. These functions, by contrast, retrieve everything, so they require careful consideration of which data types are reasonable to include for a given purpose. For the same reason, these functions return a lot of columns, although the exact number depends on which types of surveys are being returned.

Usage

get_all_survey_samples(
  species,
  ssid = NULL,
  major = NULL,
  usability = NULL,
  unsorted_only = FALSE,
  random_only = FALSE,
  grouping_only = FALSE,
  include_event_info = FALSE,
  include_activity_matches = FALSE,
  remove_bad_data = TRUE,
  remove_duplicates = TRUE,
  return_dna_info = FALSE,
  drop_na_columns = TRUE,
  quiet_option = "message"
)

get_all_survey_sets(
  species,
  ssid = NULL,
  major = NULL,
  years = NULL,
  join_sample_ids = FALSE,
  remove_false_zeros = TRUE,
  remove_bad_data = TRUE,
  remove_duplicates = TRUE,
  include_activity_matches = FALSE,
  usability = NULL,
  grouping_only = FALSE,
  drop_na_columns = TRUE,
  quiet_option = "message"
)

Arguments

species

One or more species common names (e.g. "pacific ocean perch") or one or more species codes (e.g. 396). Species codes can be specified as numeric vectors c(396, 442) or characters c("396", "442"). Numeric values shorter than 3 digits will be expanded to 3 digits and converted to character objects (1 turns into "001"). Species common names and species codes should not be mixed. If any element is missing a species code, then all elements will be assumed to be species common names. Does not work with non-numeric species codes, so in those cases the common name will be needed.
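The zero-padding rule described above can be illustrated with a minimal sketch. `pad_species_code()` is a hypothetical helper written for this example and is not part of the package; the package's own conversion may be implemented differently.

```r
# Hypothetical helper (not part of the package): convert numeric or
# character species codes to zero-padded, three-character strings.
pad_species_code <- function(codes) {
  sprintf("%03d", as.integer(codes))
}

padded_numeric <- pad_species_code(c(1, 44, 396))
padded_character <- pad_species_code(c("396", "442"))
print(padded_numeric)
print(padded_character)
```

A common name such as "pacific ocean perch" cannot be converted this way, which is consistent with the advice not to mix names and codes in one vector.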

ssid

A numeric vector of survey series IDs. Run get_ssids() for a look-up table of available survey series IDs with survey series descriptions. Default is to return all data from all surveys. Some of the most useful IDs include: contemporary trawl (1, 3, 4, 16), historic trawl (2), IPHC (14), sablefish (35), and HBLL (22, 36, 39, 40).

major

Character string (or character vector) of major stat area code(s) to include. Use get_major_areas() to look up area codes with descriptions. Default is NULL.

usability

A vector of usability codes to include. Defaults to NULL, but a typical set for a design-based trawl survey index is c(0, 1, 2, 6). IPHC codes may differ from those of other surveys, and the modern Sablefish survey doesn't seem to assign usabilities.
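As a rough sketch of what restricting to these codes amounts to, the toy example below keeps only rows whose code is in the typical trawl set. The data frame and the column names `fishing_event_id` and `usability_code` are assumptions made for illustration, not output from the package.

```r
# Toy data; column names are assumed for illustration only.
sets <- data.frame(
  fishing_event_id = 1:5,
  usability_code = c(0, 1, 5, 6, NA)
)

# Typical codes for a design-based trawl survey index (from the text above).
index_codes <- c(0, 1, 2, 6)

# %in% returns FALSE for NA, so rows with a missing code are dropped too.
index_sets <- sets[sets$usability_code %in% index_codes, ]
print(index_sets)
```

Because rows with a missing code are excluded by this kind of filter, it may not suit surveys that do not assign usabilities.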

unsorted_only

Defaults to FALSE, which will return all specimens collected on research trips. TRUE returns only unsorted (1) and NA specimens for both species_category_code and sample_source_code.

random_only

Defaults to FALSE, which will return all specimens collected on research trips. TRUE returns only randomly sampled specimens (sample_type_code = 1, 2, 6, 7, or 8).

grouping_only

Defaults to FALSE, which will return all specimens or sets collected on research trips. TRUE returns only sets or specimens from fishing events with grouping codes that match those expected for a survey. The same result can be achieved by filtering for specimens where !is.na(grouping_code).
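The equivalent post-hoc filter mentioned above might look like the following sketch. The data frame is made up for illustration; only `grouping_code` is named in this documentation.

```r
# Toy data standing in for returned specimens; values are invented.
specimens <- data.frame(
  specimen_id = 101:104,
  grouping_code = c(12, NA, 15, NA)
)

# Keep only specimens whose fishing event has a grouping code.
survey_specimens <- specimens[!is.na(specimens$grouping_code), ]
print(survey_specimens)
```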

include_event_info

Logical for whether to append all relevant fishing event info (location, timing, effort, catch, etc.). Defaults to FALSE.

include_activity_matches

Logical for whether to get all surveys with activity codes that match the chosen ssids. Defaults to FALSE.

remove_bad_data

Remove known bad data, such as unrealistic length or weight values and duplications due to trips that include multiple surveys. Default is TRUE.

remove_duplicates

Logical for whether to remove duplicated event records due to overlapping survey stratifications when original_ind = 'N'. Default is TRUE. Setting this to FALSE is only possible when ssids are supplied and activity matches aren't included; otherwise duplicate removal turns on automatically.

return_dna_info

Should DNA container ids and sample type be returned? This can create duplication of specimen ids for some species. Defaults to FALSE.

drop_na_columns

Logical for removing all columns that only contain NAs. Defaults to TRUE.

quiet_option

Default option, "message", suppresses messages from sections of code that produce lots of join_by messages. Any other string will allow these messages.

years

Default is NULL, which returns all years.

join_sample_ids

This option was problematic, so now reverts to FALSE.

remove_false_zeros

Default of TRUE will make sure weights > 0 don't have associated counts of 0 and vice versa. Mostly useful for trawl data where counts are only taken for small catches.
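One plausible reading of this rule is sketched below: a zero that contradicts a positive value in the other column is treated as unknown and replaced with NA. This is an assumption for illustration; the package's actual handling may differ, and the column names `catch_weight` and `catch_count` are also assumed.

```r
# Toy data; column names and values are assumed for illustration only.
sets <- data.frame(
  catch_weight = c(12.5, 0, 3.2, 0),
  catch_count = c(0, 4, 10, 0)
)

# Flag contradictory zeros before changing anything.
false_zero_count <- sets$catch_weight > 0 & sets$catch_count == 0
false_zero_weight <- sets$catch_count > 0 & sets$catch_weight == 0

# Assumed treatment: replace the contradictory zero with NA.
sets$catch_count[false_zero_count] <- NA
sets$catch_weight[false_zero_weight] <- NA
print(sets)
```

Rows where both values are zero are left alone, since a true zero catch is internally consistent.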

Examples

if (FALSE) { # \dontrun{
## Import survey catch density and location data by tow or set for plotting
## Specify single or multiple species by common name or species code and
## single or multiple survey series id(s).
## Notes:
## `area_km` is the stratum area used in design-based index calculation.
## `area_swept` is in m^2 and is used to calculate density for trawl surveys.
## It is based on `area_swept1` (`doorspread_m` x `tow_length_m`) except
## when `tow_length_m` is missing, and then we use `area_swept2`
## (`doorspread_m` x `duration_min` x `speed_mpm`).
## `duration_min` is derived in the SQL procedure "proc_catmat_2011" and
## differs slightly from the difference between `time_deployed` and
## `time_retrieved`.
} # }
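The swept-area rule described in the notes above can be reproduced on toy data as follows. This is a sketch under the stated definitions, not the package's internal code; the tow values are invented, and the doorspread column is assumed to be `doorspread_m` in both formulas.

```r
# Toy tows; all values are invented for illustration.
tows <- data.frame(
  doorspread_m = c(60, 60),
  tow_length_m = c(2000, NA),
  duration_min = c(20, 20),
  speed_mpm = c(90, 90)
)

# area_swept1: doorspread x tow length (m^2).
area_swept1 <- tows$doorspread_m * tows$tow_length_m
# area_swept2: doorspread x duration x speed in metres per minute (m^2).
area_swept2 <- tows$doorspread_m * tows$duration_min * tows$speed_mpm

# Use area_swept1 unless tow_length_m is missing.
tows$area_swept <- ifelse(is.na(tows$tow_length_m), area_swept2, area_swept1)
print(tows)
```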