unmarked
R/format-unmarked-occu.r
format_unmarked_occu.Rd
Prepare a data frame of species observations for ingestion into the package
unmarked
for hierarchical modeling of abundance and occurrence. The
function unmarked::formatWide()
takes a data frame and converts it to one
of several unmarked
objects, which can then be used for modeling. This
function converts data from a format in which each row is an observation
(e.g. as in the eBird Basic Dataset) to the esoteric format required by
unmarked::formatWide()
in which each row is a site.
format_unmarked_occu( x, site_id = "site", response = "species_observed", site_covs, obs_covs )
x |
|
---|---|
site_id | character; a unique idenitifer for each "site", typically
identifying observations from a unique location by the same observer
within a period of temporal closure. Data output from
|
response | character; the variable that will act as the response in modeling efforts, typically a binary variable indicating presence or absence or a count of individuals seen. |
site_covs | character; the variables that will act as site-level
covariates, i.e. covariates that vary at the site level, for example,
latitude/longitude or habitat predictors. If this parameter is missing, it
will be assumed that any variable that is not an observation-level
covariate ( |
obs_covs | character; the variables that will act as observation-level covariates, i.e. covariates that vary within sites, at the level of observations, for example, time or length of observation. |
A data frame that can be processed by unmarked::formatWide()
.
Each row will correspond to a unqiue site and, assuming there are a maximum
of N
observations per site, columns will be as follows:
The unique site identifier, named "site".
N
response columns, one for each observation, named "y.1", ..., "y.N".
Columns for each of the site-level covariates.
Groups of N
columns of observation-level covariates, one column per
covariate per observation, names "covariate_name.1", ...,
"covariate_name.N".
Hierarchical modeling requires repeat observations at each "site" to
estimate detectability. A "site" is typically defined as a geographic
location visited by the same observer within a period of temporal closure.
To define these sites and filter out observations that do not correspond to
repeat visits, users should use filter_repeat_visits()
, then pass the
output to this function.
format_unmarked_occu()
is designed to prepare data to be converted into
an unmarkedFrameOccu
object for occupancy modeling with
unmarked::occu()
; however, it can also be used to prepare data for
conversion to an unmarkedFramePCount
object for abundance modeling with
unmarked::pcount()
.
Other modeling:
filter_repeat_visits()
# read and zero-fill the ebd data f_ebd <- system.file("extdata/zerofill-ex_ebd.txt", package = "auk") f_smpl <- system.file("extdata/zerofill-ex_sampling.txt", package = "auk") # data must be for a single species ebd_zf <- auk_zerofill(x = f_ebd, sampling_events = f_smpl, species = "Collared Kingfisher", collapse = TRUE) occ <- filter_repeat_visits(ebd_zf, n_days = 30) # format for unmarked # typically one would join in habitat covariates prior to this step occ_wide <- format_unmarked_occu(occ, response = "species_observed", site_covs = c("latitude", "longitude"), obs_covs = c("effort_distance_km", "duration_minutes")) # create an unmarked object if (requireNamespace("unmarked", quietly = TRUE)) { occ_um <- unmarked::formatWide(occ_wide, type = "unmarkedFrameOccu") unmarked::summary(occ_um) }#> unmarkedFrame Object #> #> 47 sites #> Maximum number of observations per site: 10 #> Mean number of observations per site: 3.85 #> Sites with at least one detection: 25 #> #> Tabulation of y observations: #> FALSE TRUE <NA> #> 129 52 289 #> #> Site-level covariates: #> latitude longitude #> Min. :1.206 Min. :103.7 #> 1st Qu.:1.315 1st Qu.:103.7 #> Median :1.351 Median :103.8 #> Mean :1.348 Mean :103.8 #> 3rd Qu.:1.369 3rd Qu.:103.8 #> Max. :1.446 Max. :104.0 #> #> Observation-level covariates: #> effort_distance_km duration_minutes #> Min. : 0.100 Min. : 1.00 #> 1st Qu.: 0.200 1st Qu.: 15.00 #> Median : 1.000 Median : 30.00 #> Mean : 1.579 Mean : 66.49 #> 3rd Qu.: 2.000 3rd Qu.: 86.25 #> Max. :10.000 Max. :480.00 #> NA's :393 NA's :392# this function can also be used for abundance modeling abd <- ebd_zf %>% # convert count to integer, drop records with no count dplyr::mutate(observation_count = as.integer(observation_count)) %>% dplyr::filter(!is.na(observation_count)) %>% # filter to repeated visits filter_repeat_visits(n_days = 30)#> Warning: Problem with `mutate()` input `observation_count`. #> ℹ NAs introduced by coercion #> ℹ Input `observation_count` is `as.integer(observation_count)`.#> Warning: NAs introduced by coercion# prepare for conversion to unmarkedFramePCount object abd_wide <- format_unmarked_occu(abd, response = "observation_count", site_covs = c("latitude", "longitude"), obs_covs = c("effort_distance_km", "duration_minutes")) # create an unmarked object if (requireNamespace("unmarked", quietly = TRUE)) { abd_um <- unmarked::formatWide(abd_wide, type = "unmarkedFrameOccu") unmarked::summary(abd_um) }#> unmarkedFrame Object #> #> 44 sites #> Maximum number of observations per site: 10 #> Mean number of observations per site: 3.89 #> Sites with at least one detection: 21 #> #> Tabulation of y observations: #> 0 1 2 3 5 10 <NA> #> 125 27 8 2 7 2 269 #> #> Site-level covariates: #> latitude longitude #> Min. :1.206 Min. :103.7 #> 1st Qu.:1.315 1st Qu.:103.7 #> Median :1.349 Median :103.8 #> Mean :1.347 Mean :103.8 #> 3rd Qu.:1.364 3rd Qu.:103.8 #> Max. :1.446 Max. :104.0 #> #> Observation-level covariates: #> effort_distance_km duration_minutes #> Min. : 0.100 Min. : 1.0 #> 1st Qu.: 0.200 1st Qu.: 15.0 #> Median : 1.000 Median : 30.0 #> Mean : 1.587 Mean : 68.3 #> 3rd Qu.: 2.000 3rd Qu.: 90.0 #> Max. :10.000 Max. :480.0 #> NA's :362 NA's :361