Estimation of Larval Sea Lamprey Abundance from Deepwater Electrofisher Surveys

Jean V. Adams

2021-08-02

The R package GLFC includes functions to estimate larval sea lamprey abundance in the St. Marys River using the Deep-Water ElectroFisher Estimation System (DWEFES).

Install

Install the latest version of R from CRAN.

Install the latest version of the GLFC package:

install.packages("remotes")
remotes::install_github("JVAdams/GLFC")

Prepare the data

Three *.xlsx files with data on larval sea lamprey catches, lengths, and plot treatments in the most recent year are needed to generate larval sea lamprey estimates.

(1) Catch - An *.xl* file with the catch data. The catch file should have at least the following 19 columns, named in the header row:

The last three columns are typically added after sampling, in ArcGIS Pro. Missing values are entered as -9999.

FID_1 ID X Y SAMPID LATITUDE LONGITUDE STIME BOAT SAMPLE DEPTH SUB_MAJOR SUB_MINOR1 SUB_MINOR2 GPSDATE HAB_TYPE SL_TOTAL AB_TOTAL I_TOTAL COMMENT INBPLOT REGION NEW_NUMB
357 373 -84.29894 46.49385 201970373 46.49385 -84.29894 15:15 7 2 15 5 4 -9999 29/07/2019 2 0 0 0 No comment 1 2 44
358 374 -84.29892 46.49481 201970374 46.49481 -84.29892 15:29 7 2 12 4 3 -9999 29/07/2019 3 0 0 0 No comment 1 2 44
359 375 -84.30016 46.49483 201970375 46.49483 -84.30016 15:37 7 2 19 3 2 -9999 29/07/2019 3 0 0 0 No comment 1 2 44
360 376 -84.30018 46.49578 201970376 46.49578 -84.30018 15:46 7 2 19 6 8 5 29/07/2019 1 0 0 0 No comment 1 2 44
349 365 -84.29099 46.49377 201970365 46.49377 -84.29099 13:45 7 2 22 6 5 8 29/07/2019 2 0 0 0 No comment 1 2 8
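Because missing values are entered as -9999, it can help to see how such a code might be recoded to NA before any summaries are computed. The sketch below is illustrative only and is not taken from the GLFC package: recode_missing() is a hypothetical helper, and the toy data frame borrows three column names from the catch file.

```r
# Hypothetical helper (not part of GLFC): replace the -9999 missing-value
# code with NA in every numeric column of a data frame.
recode_missing <- function(df, code = -9999) {
  for (nm in names(df)) {
    if (is.numeric(df[[nm]])) {
      df[[nm]][which(df[[nm]] == code)] <- NA
    }
  }
  df
}

# Toy rows using three of the substrate columns from the catch file
catch <- data.frame(
  SUB_MAJOR  = c(5, 4, 6),
  SUB_MINOR1 = c(4, 3, 8),
  SUB_MINOR2 = c(-9999, -9999, 5)
)
catch2 <- recode_missing(catch)
print(catch2)
```

Recoding before analysis keeps a value such as -9999 from being treated as a real measurement in means or plots.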

(2) Lengths - One or more *.xl* files with lengths data. Each file should have at least the following 2 columns, named in the header row:

SAMPID ID LENGTH CLASS
201940001 1 138 A
201940001 2 81 A
201940001 3 46 A
201940003 1 68 A
201940004 1 56 A
201940004 2 55 A
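When lengths are spread over more than one file, the rows end up stacked into a single table. The sketch below shows one way that stacking could be done in base R. It is an assumption-laden illustration, not the package's own code: small CSV files in a temporary directory stand in for the spreadsheet files so that the example runs without extra packages, and only the SAMPID and LENGTH columns from the example above are used.

```r
# Write two small stand-in lengths files to a temporary directory.
# CSV is used here only so the sketch runs without extra packages.
tmp <- tempfile("lengths")
dir.create(tmp)
write.csv(data.frame(SAMPID = c(201940001, 201940001), LENGTH = c(138, 81)),
  file.path(tmp, "lengths_a.csv"), row.names = FALSE)
write.csv(data.frame(SAMPID = 201940003, LENGTH = 68),
  file.path(tmp, "lengths_b.csv"), row.names = FALSE)

# Read every file and stack the rows into one data frame
files <- file.path(tmp, c("lengths_a.csv", "lengths_b.csv"))
lengths_all <- do.call(rbind, lapply(files, read.csv))
print(lengths_all)
```

Stacking with rbind() requires every file to use the same header names, which is one reason to keep the header row consistent across files.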

(3) Plots - An *.xl* file with information on each plot. The file should have at least the following 3 columns, named in the header row:

AREA Plot_09 Treat_2019
15.780394 50 0
11.301904 51 0
4.329009 27 0
3.536266 3 0
15.040621 22 1
17.081183 112 1
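A quick header check before running the estimation can catch a misnamed or missing column early. The following is a minimal sketch, assuming the data have already been read into a data frame; missing_columns() is a hypothetical helper and is not a GLFC function.

```r
# Hypothetical helper (not part of GLFC): return the required column
# names that are absent from a data frame.
missing_columns <- function(df, required) {
  setdiff(required, names(df))
}

# Toy plots table with the header names shown above
plots <- data.frame(
  AREA       = c(15.780394, 11.301904),
  Plot_09    = c(50, 51),
  Treat_2019 = c(0, 0)
)
required <- c("AREA", "Plot_09", "Treat_2019")

ok  <- missing_columns(plots, required)
bad <- missing_columns(plots[, c("AREA", "Plot_09")], required)
print(ok)   # character(0): nothing is missing
print(bad)  # the treatment column is missing
```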

All 3 files should be stored in a single directory, directory.

Historic data

Four *.csv files with data on historic catches, lengths, plot treatments, and population estimates will be updated with new information.

(1) Catch - A *.csv file with historic catch data. The required fields are the same as for the catch *.xlsx file, with the following exceptions:

(2) Lengths - A *.csv file with historic lengths data. The required fields are the same as for the lengths *.xlsx file, with the following exceptions:

(3) Plots - A *.csv file with historic plot-specific abundance estimates with the following 15 columns, named in the header row:

(4) PEs - A *.csv file with historic population estimates with the following 14 columns, named in the header row:

All 4 files should be stored in the same directory as the other files, directory.

Read in the data

Load the GLFC package to have access to the DWEFES functions.

library(GLFC)

Assign the name of the directory (using forward slashes, /, in the file path) and the names of the three new and four historic data files. For example:

directory <- "C:/temp"
catch.file.name <- "2019_Base_Data.xlsx"
length.file.name <- c("2019_LENGTHS_4.xlsx", "2019_LENGTHS_7.xlsx")
plots.file.name <- "2019_Treat_Plots.xlsx"
catch.all <- "2018-CatchALL.csv"
length.all <- "2018-LengthALL.csv"
plot.all <- "2018-BPlotEstsALL.csv"
whole.river.pe <- "2018-WholeRiverEsts.csv"
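Before reading anything in, it may be worth confirming that each named file is actually present in the directory. The sketch below uses a temporary directory and empty stand-in files (demo.dir and demo.files are made-up names for illustration) so that it runs anywhere; in practice the objects assigned above would take their place.

```r
# Stand-in directory and file names so the sketch runs anywhere;
# in practice these would be the objects assigned above.
demo.dir <- tempfile("dwef")
dir.create(demo.dir)
demo.files <- c("2019_Base_Data.xlsx", "2019_Treat_Plots.xlsx")

# Create only the first file, to show both outcomes of the check
file.create(file.path(demo.dir, demo.files[1]))

# Build full paths and report which files are present
found <- file.exists(file.path(demo.dir, demo.files))
names(found) <- demo.files
print(found)
```

Using file.path() builds the path with forward slashes, which matches the convention recommended above.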

Use the DWEFprep() function to read in the deep-water electrofishing data (including information on the lamprey catch, the lamprey lengths, and the identification of plots that were treated) and prepare them for estimation. In addition to providing the directory and file names as arguments to this function, you also need to provide information on treatment and survey timing.

Argument TRTtiming is a character scalar identifying the timing of the assessment survey relative to treatment. It should take on one of four values:

Argument b4plots is a numeric vector identifying the plots that were surveyed before they were treated. This is rarely needed (default NULL). A value for this should only be provided if TRTtiming is set to "MIXED". For example, if there were three plots surveyed before treatment, b4plots = c(18, 39, 112).
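The rule that b4plots goes together with TRTtiming = "MIXED" can be expressed as a small pre-flight check. This is a sketch of that rule only: check_b4plots() is a hypothetical helper, not a GLFC function, and it does not reproduce whatever validation DWEFprep() itself performs.

```r
# Hypothetical check (not part of GLFC) of the rule described above:
# b4plots should be supplied only when TRTtiming is "MIXED".
check_b4plots <- function(TRTtiming, b4plots = NULL) {
  if (!is.null(b4plots) && TRTtiming != "MIXED") {
    stop("b4plots should only be provided when TRTtiming is \"MIXED\"")
  }
  if (!is.null(b4plots) && !is.numeric(b4plots)) {
    stop("b4plots should be a numeric vector of plot numbers")
  }
  invisible(TRUE)
}

check_b4plots("AFTER")                            # passes silently
check_b4plots("MIXED", b4plots = c(18, 39, 112))  # passes silently

# Supplying b4plots with any other timing raises an error
res <- tryCatch(check_b4plots("AFTER", b4plots = 18),
  error = function(e) conditionMessage(e))
print(res)
```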

The output from the DWEFprep() function is a list with recent catch, length, and plot data in three data frames (CAT, LEN, PLT), historic catch, length, plot, and population estimate data in four data frames (CAThist, LENhist, Plothist, PEhist), and a character vector of the directory and file names (SOURCE). The recent plot data are reorganized to have only one row per plot, with the trtd variable indicating the number of treatments each plot received that year.

mydat <- DWEFprep(Dir=directory, CatchFile=catch.file.name,
  LengthsFile=length.file.name, PlotsFile=plots.file.name,
  TRTtiming="AFTER", CatchHist=catch.all, LengthHist=length.all,
  PlotHist=plot.all, PEHist=whole.river.pe, b4plots=NULL)
lapply(mydat, head, 2)
## $CAT
##    year mm dd stime period    sampid transamp transect site boat latitude
## 83 2019  6 11  1358      1 201940082       NA       NA   82    4 46.45540
## 65 2019  6 14  1049      1 201940064       NA       NA   64    4 46.44406
##    longitude region label inbplot plot.num new.numb sample cluster depth
## 83 -84.26958      2    NA       0       NA        0      2      NA     5
## 65 -84.24664      2    NA       0       NA        0      2      NA    23
##    hab.type sub.major sub.minor1 sub.minor2 ab.total i.total sl.total
## 83        1         6          8      -9999        0       0        0
## 65        1         6          8      -9999        0       0        0
##       comment commentwrap hr mn dec.time       date
## 83 No comment  No comment 13 58 13.96667 2019-06-11
## 65 No comment  No comment 10 49 10.81667 2019-06-14
## 
## $LEN
## # A tibble: 2 x 7
##      sampid class length sl.larv.n sl.larv.adj sl.meta.n sl.meta.adj
##       <dbl> <chr>  <dbl>     <dbl>       <dbl>     <dbl>       <dbl>
## 1 201940001 A        138         1        5.17         0           0
## 2 201940001 A         81         1        2.13         0           0
## 
## $PLT
##    area.ha new.numb trtd
## 1 2.291365        1    0
## 2 2.816693        2    1
## 
## $CAThist
##   year mm dd stime period sampid transamp transect site boat latitude longitude
## 1 1993 NA NA    NA      0      1       NA       18   78    1  46.4512 -84.26781
## 2 1993 NA NA    NA      0      2       NA       18  103    1  46.4576 -84.24868
##   region label inbplot plot.num new.numb sample cluster depth hab.type
## 1      2             1        4     4001      2      NA     6        1
## 2      2             0       NA       NA      2      NA     7        1
##   sub.major sub.minor1 sub.minor2 ab.total i.total sl.larv.n sl.larv.adj
## 1         7          5          6        0       0         0           0
## 2         5          7          6        0       0         0           0
##   comment
## 1        
## 2        
## 
## $LENhist
##   year sampid length sl.larv.adj
## 1 1993     30     61    1.715267
## 2 1993     30     77    2.031795
## 
## $Plothist
##   year period new.numb  meanlat  meanlong  area.ha n.samp catch meannperha
## 1 2011      1       44 46.49475 -84.29866 17.92037      8     1   3739.271
## 2 2011      1       22 46.50343 -84.32667 15.04062      9     2   2911.472
##     sd.dens   larvpe     ptran    tranpe pbig    bigpe
## 1 10576.256 67009.11 0.9776448 65511.110    1 67009.11
## 2  8734.416 43790.34 0.1289114  5645.076    1 43790.34
## 
## $PEhist
##   Year Design  Type Period  Trt      Dates Samples Catch  Area_ha      PE
## 1 1999  strat whole     -1  pre 6/10 - 7/8     867   309 7836.505 5081628
## 2 1999  strat whole      1 post 8/5 - 9/28     863   107 7836.505 1381080
##          SD        LO      HI        CV
## 1 1480251.0 3116023.8 9144324 0.2912946
## 2  347872.8  831621.7 2147787 0.2518846
## 
## $SOURCE
##                   Dir             CatchFile 
##             "C:/temp" "2019_Base_Data.xlsx"
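The hr, mn, and dec.time columns in $CAT are consistent with splitting the four-digit stime into hours and minutes and converting to decimal hours. The sketch below reproduces the two rows shown. It is inferred from the printed output rather than from the package source, so treat it as an illustration of the arithmetic only.

```r
# Sample times as four-digit integers, as in the stime column above
stime <- c(1358, 1049)

hr <- stime %/% 100       # hours: integer part of stime / 100
mn <- stime %% 100        # minutes: remainder after removing the hours
dec.time <- hr + mn / 60  # time of day in decimal hours

print(data.frame(stime, hr, mn, dec.time))
```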

Error check the data

Use the DWEFerror() function to error check the deep-water electrofishing data (including information on the lamprey catch, the lamprey lengths, and the identification of plots that were treated) prior to estimation.

In addition to providing output from the DWEFprep() function as arguments to this function, you also need to indicate whether you want to create a standalone error report (Continue=FALSE) or to start a report that will be left open so that the estimates can be added in the next step (Continue=TRUE).

The output from the DWEFerror() function is a list with cleaned (errors removed) DWEF catch and lengths in two data frames (CAT2, LEN2), a character vector of the table references for any remaining errors (ERR), a character vector of the directory and file names (SOURCE), and a character vector of the output file names (OUT).

If Continue=FALSE, a rich text file will be saved to directory with error checking text, tables, and figures. If Continue=TRUE, the same rich text file will be started, but left open, typically to add in more text, tables, and figures generated by the DWEFreport() function.

The generated report is an *.rtf (rich text format) file but it has a *.doc extension so that it will be automatically opened by Microsoft Word. The report is named YYYY DWEFES Report dd-Mon-YYYY.doc, where YYYY is the latest year represented in the input data, and placed in directory.

The following conditions and graphs are used to highlight potential errors in the data:

myclean <- DWEFerror(Dir=mydat$SOURCE["Dir"], Catch=mydat$CAT,
  Lengths=mydat$LEN, Source=mydat$SOURCE, Continue=TRUE)
## New RTF document created, C:/temp/2019 DWEFES Report 2021-08-02.doc
lapply(myclean, head, 2)
## $CAT2
##    year mm dd stime period    sampid transamp transect site boat latitude
## 83 2019  6 11  1358      1 201940082       NA       NA   82    4 46.45540
## 65 2019  6 14  1049      1 201940064       NA       NA   64    4 46.44406
##    longitude region label inbplot plot.num new.numb sample cluster depth
## 83 -84.26958      2    NA       0       NA        0      2      NA     5
## 65 -84.24664      2    NA       0       NA        0      2      NA    23
##    hab.type sub.major sub.minor1 sub.minor2 ab.total i.total sl.total
## 83        1         6          8         NA        0       0        0
## 65        1         6          8         NA        0       0        0
##       comment commentwrap hr mn dec.time       date sl.larv.n sl.larv.adj
## 83 No comment  No comment 13 58 13.96667 2019-06-11         0           0
## 65 No comment  No comment 10 49 10.81667 2019-06-14         0           0
##    sl.meta.n sl.meta.adj sl.adjctch
## 83         0           0          0
## 65         0           0          0
## 
## $LEN2
## # A tibble: 2 x 8
##      sampid class length sl.larv.n sl.larv.adj sl.meta.n sl.meta.adj  len5
##       <dbl> <chr>  <dbl>     <dbl>       <dbl>     <dbl>       <dbl> <dbl>
## 1 201940001 A        138         1        5.17         0           0   140
## 2 201940001 A         81         1        2.13         0           0    85
## 
## $ERR
## NULL
## 
## $SOURCE
##                   Dir             CatchFile 
##             "C:/temp" "2019_Base_Data.xlsx" 
## 
## $OUT
## [1] "2019-CatchesSMR.csv" "2019-LengthsSMR.csv"
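The len5 column added to $LEN2 appears to bin each length upward to the next multiple of 5 (138 becomes 140, 81 becomes 85). The sketch below reproduces those two values. The rule is inferred from only the two rows shown, so it is an assumption rather than a statement of how DWEFerror() computes the column.

```r
# The two lengths shown in $LEN2
len <- c(138, 81)

# Assumed rule: round up to the next multiple of 5
len5 <- 5 * ceiling(len / 5)
print(len5)
```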

Estimate larval abundance

Use the DWEFreport() function to generate estimates of larval sea lamprey abundance from the deep-water electrofishing data.

In addition to providing output from the DWEFprep() and DWEFerror() functions as arguments to this function, you also need to (1) indicate whether the downstream portion of the St. Marys River was surveyed (Downstream=TRUE) or only the upstream portion of the river was surveyed (Downstream=FALSE) and (2) provide a data frame of stratum areas, StratArea, with three variables (inbplot, region, and haStrat), as in SMRStratArea:

SMRStratArea
##   inbplot region haStrat
## 1       0      1  846.86
## 2       0      2 1000.57
## 3       0      3 3366.97
## 4       0      4 1807.39
## 5       0      5  203.78
## 6       1      1  140.99
## 7       1      2  475.95
## 8       1      3  248.08
## 9       1      5   56.65
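If a different set of stratum areas is ever needed, a data frame with the same three columns can be built by hand. The sketch below copies the values printed above into a new object (strat is a made-up name for illustration) and totals the area by inbplot value as a simple sanity check.

```r
# Stratum areas copied from the SMRStratArea printout above
strat <- data.frame(
  inbplot = c(0, 0, 0, 0, 0, 1, 1, 1, 1),
  region  = c(1, 2, 3, 4, 5, 1, 2, 3, 5),
  haStrat = c(846.86, 1000.57, 3366.97, 1807.39, 203.78,
              140.99, 475.95, 248.08, 56.65)
)

# Total area for each inbplot value, and overall
area.by.inbplot <- tapply(strat$haStrat, strat$inbplot, sum)
print(area.by.inbplot)
print(sum(strat$haStrat))
```

A data frame built this way could be passed as the StratArea argument in place of SMRStratArea.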

It is assumed that this function will be run immediately after the DWEFerror() function, in which case the *.rtf file created by DWEFerror() will be continued and completed by DWEFreport().

The report has a few paragraphs summarizing the latest estimates, along with 3 tables:

and 3 figures:

In addition, pre- and post-treatment whole-river estimates are printed to the screen (along with any relevant messages regarding the estimation process). Three *.csv files are also written to directory, with the final catch (YYYY-CatchesSMR.csv), lengths (YYYY-LengthsSMR.csv), and plot data (YYYY-BPlotEstsSMR.csv).

DWEFreport(Dir=mydat$SOURCE["Dir"], CatchClean=myclean$CAT2,
  LengthsClean=myclean$LEN2, Plots=mydat$PLT, CatHist=mydat$CAThist,
  LenHist=mydat$LENhist, PlotHist=mydat$Plothist, PEHist=mydat$PEhist,
  Downstream=FALSE, Errors=myclean$ERR, Outfiles=myclean$OUT, 
  StratArea=SMRStratArea)
## Okay.  We will have to apply an expansion factor to the survey data  to get a whole river estimate of abundance.
## 
## 
##     Pre-treatment whole-river estimate
## 
## $PE
## [1] "1,921,093"
## 
## $PE.sd
## [1] "329,338.9"
## 
## $PE.cv
## [1] "0.1714331"
## 
## $PE.ci
## [1] "1,275,589" "2,566,597"
## 
## 
##     Post-treatment whole-river estimate
## 
## $PE
## [1] "1,142,849"
## 
## $PE.sd
## [1] "247,714.7"
## 
## $PE.cv
## [1] "0.2167519"
## 
## $PE.ci
## [1] "  657,328.1" "1,628,369.9"
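The printed quantities are related by simple arithmetic, which can be handy when checking a report by hand. The coefficient of variation is the standard deviation divided by the estimate, and the printed intervals agree closely with the estimate plus or minus 1.96 standard deviations. The interval formula is inferred from the numbers shown, not taken from the package source, so the sketch below should be read as a consistency check under that assumption.

```r
# Estimates copied from the printed output above (commas removed)
PE <- c(pre = 1921093, post = 1142849)
SD <- c(pre = 329338.9, post = 247714.7)

cv <- SD / PE         # coefficient of variation
lo <- PE - 1.96 * SD  # assumed normal-approximation lower limit
hi <- PE + 1.96 * SD  # assumed normal-approximation upper limit

print(round(cv, 7))
print(format(round(cbind(lo, hi)), big.mark = ","))
```

Small differences in the last digit are expected because the inputs here are the rounded values from the printout.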

Summary

Below is a pared-down version of the R code above, with only the steps needed to estimate larval sea lamprey abundance in the St. Marys River.

library(GLFC)

# assign the directory and the names of the new data files
directory <- "C:/temp"
catch.file.name <- "2019_Base_Data.xlsx"
length.file.name <- c("2019_LENGTHS_4.xlsx", "2019_LENGTHS_7.xlsx")
plots.file.name <- "2019_Treat_Plots.xlsx"

# assign the names of the historic data files
catch.all <- "2018-CatchALL.csv"
length.all <- "2018-LengthALL.csv"
plot.all <- "2018-BPlotEstsALL.csv"
whole.river.pe <- "2018-WholeRiverEsts.csv"

# read in and prepare the data
mydat <- DWEFprep(Dir=directory, CatchFile=catch.file.name,
  LengthsFile=length.file.name, PlotsFile=plots.file.name,
  TRTtiming="AFTER", CatchHist=catch.all, LengthHist=length.all,
  PlotHist=plot.all, PEHist=whole.river.pe)

# error check the data
myclean <- DWEFerror(Dir=mydat$SOURCE["Dir"], Catch=mydat$CAT,
  Lengths=mydat$LEN, Source=mydat$SOURCE, Continue=TRUE)

# estimate larval abundance
DWEFreport(Dir=mydat$SOURCE["Dir"], CatchClean=myclean$CAT2,
  LengthsClean=myclean$LEN2, Plots=mydat$PLT, CatHist=mydat$CAThist,
  LenHist=mydat$LENhist, PlotHist=mydat$Plothist, PEHist=mydat$PEhist,
  Downstream=FALSE, Errors=myclean$ERR, Outfiles=myclean$OUT,
  StratArea=SMRStratArea)