These functions get all survey set or sample data for a set of species by major area, activity, or specific surveys. The main functions in this package focus on retrieving the more commonly used types of data and are often limited to sets and samples that conform to current design-based standards and survey grids. These functions instead retrieve everything, so users must consider carefully which data types are reasonable to include for their purpose. For this reason, these functions return many columns, although the exact number depends on which types of surveys are returned.
Usage
get_all_survey_samples(
  species,
  ssid = NULL,
  major = NULL,
  usability = NULL,
  unsorted_only = FALSE,
  random_only = FALSE,
  grouping_only = FALSE,
  include_event_info = FALSE,
  include_activity_matches = FALSE,
  remove_bad_data = TRUE,
  remove_duplicates = TRUE,
  return_dna_info = FALSE,
  drop_na_columns = TRUE,
  quiet_option = "message"
)

get_all_survey_sets(
  species,
  ssid = NULL,
  major = NULL,
  years = NULL,
  join_sample_ids = FALSE,
  remove_false_zeros = TRUE,
  remove_bad_data = TRUE,
  remove_duplicates = TRUE,
  include_activity_matches = FALSE,
  usability = NULL,
  grouping_only = FALSE,
  drop_na_columns = TRUE,
  quiet_option = "message"
)
Arguments
- species
One or more species common names (e.g. "pacific ocean perch") or one or more species codes (e.g. 396). Species codes can be specified as numeric vectors, c(396, 442), or characters, c("396", "442"). Numeric values shorter than 3 digits will be expanded to 3 digits and converted to character objects (1 turns into "001"). Species common names and species codes should not be mixed. If any element is missing a species code, then all elements will be assumed to be species common names. Does not work with non-numeric species codes, so in those cases the common name will be needed.
- ssid
A numeric vector of survey series IDs. Run get_ssids() for a look-up table of available survey series IDs with survey series descriptions. Default is to return all data from all surveys. Some of the most useful IDs include: contemporary trawl (1, 3, 4, 16), historic trawl (2), IPHC (14), sablefish (35), and HBLL (22, 36, 39, 40).
- major
Character vector of one or more major statistical area codes to include. Use get_major_areas() to look up area codes with descriptions. Default is NULL.
- usability
A vector of usability codes to include. Defaults to NULL, but a typical set for a design-based trawl survey index is c(0, 1, 2, 6). IPHC codes may differ from those of other surveys, and the modern sablefish survey does not appear to assign usabilities.
- unsorted_only
Defaults to FALSE, which will return all specimens collected on research trips. TRUE returns only unsorted (1) and NA specimens for both species_category_code and sample_source_code.
- random_only
Defaults to FALSE, which will return all specimens collected on research trips. TRUE returns only randomly sampled specimens (sample_type_code = 1, 2, 6, 7, or 8).
- grouping_only
Defaults to FALSE, which will return all specimens or sets collected on research trips. TRUE returns only sets or specimens from fishing events with grouping codes that match those expected for a survey. Can also be achieved by filtering for specimens where !is.na(grouping_code).
- include_event_info
Logical for whether to append all relevant fishing event info (location, timing, effort, catch, etc.). Defaults to FALSE.
- include_activity_matches
Logical for whether to get all surveys with activity codes that match the chosen ssids. Defaults to FALSE.
- remove_bad_data
Remove known bad data, such as unrealistic length or weight values and duplications due to trips that include multiple surveys. Default is TRUE.
- remove_duplicates
Logical for whether to remove duplicated event records due to overlapping survey stratifications when original_ind = 'N'. Default is TRUE. Keeping duplicates (FALSE) is only possible when ssids are supplied and activity matches aren't included; otherwise removal turns on automatically.
- return_dna_info
Should DNA container ids and sample type be returned? This can create duplication of specimen ids for some species. Defaults to FALSE.
- drop_na_columns
Logical for removing all columns that only contain NAs. Defaults to TRUE.
- quiet_option
The default option, "message", suppresses messages from sections of code with lots of join_by messages. Any other string will allow messages.
- years
Default is NULL, which returns all years.
- join_sample_ids
This option was problematic, so now reverts to FALSE.
- remove_false_zeros
Default of TRUE will make sure weights > 0 don't have associated counts of 0 and vice versa. Mostly useful for trawl data where counts are only taken for small catches.
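The handling of the species argument described above can be sketched in a few lines of base R. This is only an illustration of the documented behaviour, not the package's internal code; the helper name normalize_species is hypothetical.

```r
# Hypothetical helper (not part of the package): mimics the documented
# handling of the `species` argument.
normalize_species <- function(species) {
  # Try to read every element as a number; non-numeric entries become NA
  codes <- suppressWarnings(as.numeric(as.character(species)))
  if (anyNA(codes)) {
    # At least one element lacks a numeric code, so treat all elements
    # as species common names and return them unchanged
    return(as.character(species))
  }
  # Left-pad to three digits and return character codes
  sprintf("%03d", as.integer(codes))
}

normalize_species(1)                      # "001"
normalize_species(c(396, 442))            # "396" "442"
normalize_species(c("396", "442"))        # "396" "442"
normalize_species("pacific ocean perch")  # "pacific ocean perch"
```

Mixing codes and names, e.g. c("396", "pacific cod"), falls into the common-name branch in this sketch, which is why the two forms should not be combined in one call.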
Examples
if (FALSE) { # \dontrun{
## Import survey catch density and location data by tow or set for plotting
## Specify single or multiple species by common name or species code and
## single or multiple survey series id(s).
## Notes:
## `area_km` is the stratum area used in design-based index calculation.
## `area_swept` is in m^2 and is used to calculate density for trawl surveys.
## It is based on `area_swept1` (`doorspread_m` x `tow_length_m`) except
## when `tow_length_m` is missing, and then we use `area_swept2`
## (`doorspread_m` x `duration_min` x `speed_mpm`).
## `duration_min` is derived in the SQL procedure "proc_catmat_2011" and
## differs slightly from the difference between `time_deployed` and
## `time_retrieved`.
} # }
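The area-swept fallback described in the notes above can be worked through on a toy data frame. This is a minimal sketch: the tow values are invented, and the catch_weight and density_kgpm2 column names are hypothetical, chosen only to show how area_swept would feed a density calculation.

```r
# Toy tows; column names follow the notes above, values are invented
tows <- data.frame(
  doorspread_m = c(60, 60),
  tow_length_m = c(1500, NA),  # second tow is missing its length
  duration_min = c(20, 20),
  speed_mpm    = c(80, 80),
  catch_weight = c(120, 45)    # hypothetical catch column, in kg
)

# area_swept1: doorspread x tow length (NA when tow length is missing)
area_swept1 <- tows$doorspread_m * tows$tow_length_m
# area_swept2: doorspread x duration x speed, used as the fallback
area_swept2 <- tows$doorspread_m * tows$duration_min * tows$speed_mpm

# Prefer area_swept1; fall back to area_swept2 where tow length is missing
tows$area_swept <- ifelse(is.na(tows$tow_length_m), area_swept2, area_swept1)

# Density as catch per m^2 swept (hypothetical column name)
tows$density_kgpm2 <- tows$catch_weight / tows$area_swept

tows$area_swept  # 90000 96000
```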