Spatial data from IUCN and AquaMaps are combined with extinction risk information from IUCN to generate regional scores for the Species subgoal. The status of each global reporting region is an area-weighted average of species health within that region.
From Halpern et al. (2012):
The target for the Species sub-goal is to have all species at a risk status of Least Concern. We scaled the lower end of the biodiversity goal to be 0 when 75% species are extinct, a level comparable to the five documented mass extinctions and would constitute a catastrophic loss of biodiversity. The Status of assessed species was calculated as the area- and threat status-weighted average of the number of threatened species within each 0.5 degree grid cell.
Mean risk status per cell:
\[\bar{R}_{cell} = \frac{\displaystyle\sum_{species}(Risk)}{n_{spp}}\]
Mean risk status per region:
\[\bar{R}_{SPP} = \frac{\displaystyle\sum_{cells}(\bar{R}_{cell} * A_{cell} * pA_{cell-rgn})}{A_{rgn}}\]
Species goal model
\[X_{SPP} = \frac{((1 - \bar{R}_{SPP}) - 0.25)}{(1 - 0.25)} * 100\%\]
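where:
* \(X_{SPP}\) is the Species subgoal status for a region;
* \(\bar{R}_{SPP}\) is the area-weighted mean risk status for the region;
* \(\bar{R}_{cell}\) is the mean risk status of the species present in a cell;
* \(Risk\) is the numeric extinction risk score assigned to a species from its IUCN category (0 for Least Concern);
* \(n_{spp}\) is the number of species in the cell;
* \(A_{cell}\) is the area of the cell and \(pA_{cell-rgn}\) is the proportion of that cell's area that falls within the region;
* \(A_{rgn}\) is the total area of the region.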
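As a rough numerical illustration of the three equations above, the sketch below computes a regional status from a handful of made-up cells. The cell values, the column names, and the choice to take the region area as the summed in-region cell area are all assumptions for illustration, not values from the assessment.

```r
# Hypothetical cells in one region: mean risk per cell, cell area (km^2),
# and the proportion of each cell that falls inside the region.
cells <- data.frame(
  mean_risk = c(0.0, 0.2, 0.4),
  area      = c(2772.29, 2772.29, 2760.25),
  prop_rgn  = c(1.0, 0.5, 0.25)
)

# Assumption: region area is the summed in-region area of its cells
a_rgn <- sum(cells$area * cells$prop_rgn)

# Area-weighted mean risk for the region
r_spp <- sum(cells$mean_risk * cells$area * cells$prop_rgn) / a_rgn

# Rescale so that a mean risk of 0.75 maps to 0 and a mean risk of 0 maps to 100
x_spp <- ((1 - r_spp) - 0.25) / (1 - 0.25) * 100
```

A region in which every species is Least Concern has a mean risk of 0 and therefore a status of 100.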
Changes since 2015 SPP subgoal for global OHI:
* The `taxize` package is used to identify and match synonyms, for better confidence in matching species.

IUCN:

AquaMaps:
AquaMaps data for the 2016 assessment was provided as .sql files, as in previous years, that can be used to generate an SQL database. Each line in the .sql is a command to populate the SQL database.
Contents of `aquamaps_2015_full_dataset_ohi.zip`:
* hcaf_ohi.sql
* speciesoccursum_ohi.sql
* hcaf_species_native_ohi.sql

Extracted files:
* hcaf_truncated.csv
* speciesoccursum.csv
* hcaf_sp_native_trunc.csv
Rather than building the SQL database, we scan each line of the .sql files for `CREATE TABLE` and `INSERT INTO` commands and use them to create and save dataframes. Note that the `am_extract_2015.R` script discards much of the data in these .sql files that is not used in OHI Species goal processing (thus “truncated”). This speeds up reading and processing and avoids parsing issues with some of the rows and columns. Note also that `hcaf_species_native_ohi.sql` does not originally contain LOICZID information; the extract script adds it, since LOICZID as a cell identifier is faster and less memory intensive than CsquareCode (integer vs. character string).
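To make the line-scanning idea concrete, here is a minimal base R sketch. The three SQL lines are invented for illustration (the real AquaMaps .sql files may be laid out differently), and this is not the code in `am_extract_2015.R`; it only shows how column names can be taken from a `CREATE TABLE` line and rows from `INSERT INTO` lines.

```r
# Hypothetical .sql content; the real files may use a different layout
sql_lines <- c(
  "CREATE TABLE hcaf (CsquareCode varchar(10), LOICZID int, CellArea float);",
  "INSERT INTO hcaf VALUES ('5207:363:1', 167254, 2772.29);",
  "INSERT INTO hcaf VALUES ('5207:363:2', 167253, 2772.29);"
)

# Column names: take the text inside the outer parentheses of CREATE TABLE,
# split on commas, and keep the first word of each column definition
create_line <- sql_lines[grepl('^CREATE TABLE', sql_lines)]
col_defs    <- sub('^CREATE TABLE [A-Za-z_]+ \\((.*)\\);$', '\\1', create_line)
col_names   <- sub(' .*$', '', trimws(strsplit(col_defs, ',')[[1]]))

# Rows: take the text inside VALUES (...) and parse it as CSV,
# treating single quotes as the string delimiter
insert_lines <- sql_lines[grepl('^INSERT INTO', sql_lines)]
values       <- sub('^INSERT INTO [A-Za-z_]+ VALUES \\((.*)\\);$', '\\1', insert_lines)
hcaf <- read.csv(text = values, header = FALSE, quote = "'",
                 col.names = tolower(col_names),
                 strip.white = TRUE, stringsAsFactors = FALSE)
```

Dropping unused columns at this stage, before writing the .csv, is what keeps the saved files small.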
The `spp_ico/R/am_extract_2015.R` script performs these operations. This can be a time-consuming process, so typically this code chunk is run once and then set to `eval = FALSE` once the outputs have been generated.
reload <- FALSE
source(file.path(dir_git, 'R/am_extract_2015.R'))
csquarecode | loiczid | nlimit | slimit | wlimit | elimit | centerlat | centerlong | cellarea | oceanarea |
---|---|---|---|---|---|---|---|---|---|
5207:363:1 | 167254 | -26.0 | -26.5 | -73.5 | -73.0 | -26.25 | -73.25 | 2772.29 | 2772.29 |
5207:363:2 | 167253 | -26.0 | -26.5 | -74.0 | -73.5 | -26.25 | -73.75 | 2772.29 | 2772.29 |
5207:363:3 | 167974 | -26.5 | -27.0 | -73.5 | -73.0 | -26.75 | -73.25 | 2760.25 | 2760.25 |
5207:363:4 | 167973 | -26.5 | -27.0 | -74.0 | -73.5 | -26.75 | -73.75 | 2760.25 | 2760.25 |
5207:360:1 | 167260 | -26.0 | -26.5 | -70.5 | -70.0 | -26.25 | -70.25 | 2772.29 | 0.00 |
speciesid | reviewed | speccode | genus | species | fbname | occurcells | kingdom | phylum | class | order | family | iucn_id | iucn_code | iucn_version |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Fis-156671 | null | 62612 | Abalistes | filamentosus | null | 12 | Animalia | Chordata | Actinopterygii | Tetraodontiformes | Balistidae | null | N.E. | 2015-2 |
Fis-53544 | 1 | 9 | Abalistes | stellaris | Starry triggerfish | 198 | Animalia | Chordata | Actinopterygii | Tetraodontiformes | Balistidae | null | N.E. | 2015-2 |
Fis-142700 | 1 | 58334 | Abalistes | stellatus | null | 235 | Animalia | Chordata | Actinopterygii | Tetraodontiformes | Balistidae | null | N.E. | 2015-2 |
Fis-27725 | 1 | 10232 | Ablabys | taenianotus | Cockatoo waspfish | 59 | Animalia | Chordata | Actinopterygii | Scorpaeniformes | Tetrarogidae | null | N.E. | 2015-2 |
Fis-22975 | 1 | 972 | Ablennes | hians | Flat needlefish | 397 | Animalia | Chordata | Actinopterygii | Beloniformes | Belonidae | null | N.E. | 2015-2 |
speciesid | probability | loiczid |
---|---|---|
Fis-29358 | 1.00 | 129241 |
Fis-139729 | 0.97 | 129241 |
Fis-23185 | 0.81 | 129241 |
Fis-29263 | 1.00 | 129241 |
Fis-29290 | 1.00 | 129241 |
To identify appropriate IUCN species for the analysis, we identified all IUCN Red List species whose habitat includes a “marine” designation. The `ingest_iucn.R` script scrapes these data directly from the IUCN Red List website.
Processed files, saved to `git-annex/globalprep/spp_ico/v201x/int`:
* `spp_iucn_all.csv` - full list of IUCN species pulled from web, some cleaning.
* `spp_iucn_habitats.csv` - list of IUCN species (by iucn_sid) and corresponding habitat.
* `spp_iucn_marine.csv` - prepped list: cleaned marine list with subpops and trends.
The `spp_ico/R/ingest_iucn.R` script performs these functions. This can be a time-consuming process, so typically this code chunk is run once and then set to `eval = FALSE` once the outputs have been generated.
reload <- FALSE
source(file.path(dir_git, 'R/ingest_iucn.R'))
sciname | class | order | family | genus | species | authority | iucn_sid | modified_year | category | criteria | habitat | popn_trend | subpop_sid | parent_sid |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Abantennarius analis | ACTINOPTERYGII | LOPHIIFORMES | ANTENNARIIDAE | Abantennarius | analis | (Schultz, 1957) | 155277 | 2010 | LC | NA | Marine | Unknown | NA | NA |
Ablennes hians | ACTINOPTERYGII | BELONIFORMES | BELONIDAE | Ablennes | hians | (Valenciennes, 1846) | 13486514 | 2015 | LC | NA | Marine | Unknown | NA | NA |
Ablennes pacificus | ACTINOPTERYGII | BELONIFORMES | BELONIDAE | Ablennes | pacificus | (Valenciennes, 1846) | 13486514 | 2015 | LC | NA | Marine | Unknown | NA | NA |
Aboma etheostoma | ACTINOPTERYGII | PERCIFORMES | GOBIIDAE | Aboma | etheostoma | Jordan & Starks, 1895 | 183435 | 2010 | DD | NA | Marine | Unknown | NA | NA |
Aboma snyderi | ACTINOPTERYGII | PERCIFORMES | GOBIIDAE | Aboma | snyderi | (Temminck & Schlegel, 1845) | 181137 | 2012 | LC | NA | Marine | Unknown | NA | NA |
Having processed AquaMaps and IUCN species raw data, we can now prepare a full combined list of all species to be included in the OHI SPP goal. The function `create_spp_master_lookup()` creates the full lookup table:
* `speciesoccursum.csv`: `sciname` field; the `verify_scinames()` function uses `taxize::gnr_resolve()` to compare AM scinames to accepted names from the Encyclopedia of Life and NCBI databases. This helps resolve differences in naming conventions and species aliases.
* `spp_iucn_marine.csv`: `sciname` field as above.

At this point, the list is all AquaMaps species and all marine-identified IUCN species. Next, identify the source of spatial distribution data for each species, if available.
* `spatial_source` (“am” or “iucn”).

Now having identified spatial source availability and preference for all species on the list:
* `git-annex/globalprep/spp_ico/v2016/int/spp_all_raw.csv`
* (`parent_sid`) that do not have a named subpopulation location (i.e. `iucn_subpop` field is NA, instead of the name of a subpopulation)
* `git-annex/globalprep/spp_ico/v2016/int/spp_all_cleaned.csv`
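The idea of assigning a spatial source by availability and preference can be sketched as follows. This is a hypothetical illustration only: the `has_am` and `has_iucn` flags and the species names are invented, and the actual logic inside `create_spp_master_lookup()` may differ.

```r
# Hypothetical availability flags for four species
spp <- data.frame(
  sciname  = c('Species a', 'Species b', 'Species c', 'Species d'),
  has_am   = c(TRUE, TRUE, FALSE, FALSE),
  has_iucn = c(TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Preferred source when both AquaMaps and IUCN maps are available
source_pref <- 'iucn'

# Start with no source, then fill in whichever source exists
spp$spatial_source <- NA_character_
spp$spatial_source[spp$has_am]   <- 'am'
spp$spatial_source[spp$has_iucn] <- 'iucn'

# Where both exist, the preference decides
both_avail <- spp$has_am & spp$has_iucn
spp$spatial_source[both_avail] <- source_pref
```

Species with neither source keep `NA` and so contribute no spatial information.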
The `spp_ico/v2016/prep_spp_list.R` script performs these functions. This does not take long, but the saved output can be used instead of rerunning the entire process. Note the optional arguments `source_pref` and `fn_tag`, which can be used to customize the run (passed to `create_spp_master_lookup()` from `spp_fxn.R`); `'iucn'` and `''` are the standard defaults.
reload <- FALSE
source_pref <- 'iucn'
fn_tag <- ''
source(file.path(dir_git, scenario, 'prep_spp_list.R'))
am_sid | am_cat | sciname | iucn_sid | pop_trend | pop_cat | spp_group | id_no | iucn_subpop | spatial_source | cat_score | trend_score |
---|---|---|---|---|---|---|---|---|---|---|---|
Fis-26169 | LC | Apolemichthys trimaculatus | 165835 | Stable | LC | ANGELFISH | 165835 | NA | iucn | 0 | 0 |
Fis-28014 | LC | Apolemichthys xanthotis | 165853 | Stable | LC | ANGELFISH | 165853 | NA | iucn | 0 | 0 |
Fis-28015 | LC | Apolemichthys xanthurus | 165844 | Unknown | LC | ANGELFISH | 165844 | NA | iucn | 0 | NA |
Fis-27342 | LC | Centropyge acanthops | 155083 | Stable | LC | ANGELFISH | 155083 | NA | iucn | 0 | 0 |
Fis-22168 | LC | Centropyge argi | 165837 | Unknown | LC | ANGELFISH | 165837 | NA | iucn | 0 | NA |
We extract IUCN polygon presence to the same half-degree cells as AquaMaps to simplify the analysis.
* The `spp_all` species list includes a field `spp_group` that identifies which shapefile contains the spatial information for a given species.
* For each species group, specific species are identified by comparing `iucn_sid` from the dataframe to `id_no` within the shapefile.
* Extract loiczid cell IDs for each species within each species group. Save a .csv file for that group, with fields:
    * sciname | iucn_sid | presence | subpop | LOICZID | prop_area
    * presence codes: 1 Extant; 2 Probably Extant (discontinued); 3 Possibly Extant; 4 Possibly Extinct; 5 Extinct (post 1500); 6 Presence Uncertain
* NOTE: this takes a long time, multiple hours for some of the shapefiles.
    * By passing a filtered data frame to the function, you can focus the process only on new or updated shapefiles.
    * `reload = FALSE` allows the function to skip extraction on groups with files already present. Set to `TRUE` if you need to extract an updated shapefile (or change the shapefile name, or delete the previous extraction…).
The function `extract_loiczid_per_spp()` performs these functions, and is contained in the `spp_ico/v2016/spp_fxn.R` script. This can be a time-consuming process, so typically this code chunk is run once and then set to `eval = FALSE` once the outputs have been generated.
spp_all <- read_csv(file.path(dir_anx, scenario, 'int/spp_all_cleaned.csv'))
### set up maps_list for all standard (non-bird) IUCN species...
maps_list_iucn <- spp_all %>%
filter(str_detect(spatial_source, 'iucn') & !str_detect(spatial_source, 'bli')) %>%
dplyr::select(sciname, iucn_sid, spp_group) %>%
unique()
extract_loiczid_per_spp(maps_list_iucn,
shp_dir = file.path(dir_data_iucn, 'iucn_shp'),
fn_tag = scenario,
reload = FALSE)
### set up maps_list for all bird IUCN species...
maps_list_bli <- spp_all %>%
filter(str_detect(spatial_source, 'bli')) %>%
dplyr::select(sciname, iucn_sid, spp_group) %>%
mutate(spp_group = 'BOTW') %>%
unique()
extract_loiczid_per_spp(maps_list_bli,
shp_dir = dir_data_bird,
fn_tag = scenario,
reload = FALSE)
For each half-degree cell, tally up the number of species present and determine a mean species risk value and population trend value for the cell.
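A per-cell tally of this kind can be sketched in base R as below. The lookup tables are invented, and treating `prob_filter` as a "keep cells with probability at or above this value" threshold is an assumption; the actual `process_am_summary_per_cell()` function may work differently.

```r
# Hypothetical species-by-cell lookup, AquaMaps style
am_cells <- data.frame(
  am_sid      = c('Fis-1', 'Fis-2', 'Fis-3', 'Fis-1'),
  loiczid     = c(129241, 129241, 129241, 129242),
  probability = c(1.00, 0.97, 0.35, 0.81),
  stringsAsFactors = FALSE
)

# Hypothetical species info: risk category score per species
spp_info <- data.frame(
  am_sid    = c('Fis-1', 'Fis-2', 'Fis-3'),
  cat_score = c(0, 0.4, NA),
  stringsAsFactors = FALSE
)

# Assumption: keep only presences at or above the probability threshold
prob_filter <- 0.4
am_kept <- am_cells[am_cells$probability >= prob_filter, ]

# Attach risk scores, then average per cell
# (the formula interface of aggregate() drops rows with NA scores)
cells_scored <- merge(am_kept, spp_info, by = 'am_sid')
cell_means   <- aggregate(cat_score ~ loiczid, data = cells_scored, FUN = mean)
names(cell_means)[2] <- 'mean_cat_score'
```

With `prob_filter = 0`, as in the standard run, every recorded presence would be kept.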
The following code chunk executes the functions that perform these tasks. Note the optional arguments `fn_tag` and `prob_filter`, which can be changed to facilitate custom runs (including different `spp_all` species info lists, different AquaMaps thresholds, and different filename tags to uniquely identify the custom run).
spp_all <- read_csv(file.path(dir_anx, scenario, 'int/spp_all_cleaned.csv'))
am_cells_spp_sum <- process_am_summary_per_cell(spp_all, fn_tag = '', prob_filter = 0, reload = FALSE) %>%
read_csv(col_types = 'dddddc')
### NOTE: keyed data.table works way faster than the old inner_join or merge.
### loiczid | mean_cat_score | mean_trend_score | n_cat_species | n_trend_species
### AM does not include subspecies or subpops: every am_sid corresponds to exactly one sciname.
iucn_cells_spp_sum <- process_iucn_summary_per_cell(spp_all, fn_tag = '', reload = FALSE) %>%
read_csv(col_types = 'dddddc')
### loiczid | mean_cat_score | mean_trend_score | n_cat_species | n_trend_species
### IUCN includes subpops - one sciname corresponds to multiple iucn_sid values.
sum_by_loiczid_file <- process_means_per_cell(am_cells_spp_sum, iucn_cells_spp_sum, fn_tag = '')
### This returns location of dataframe with variables:
### loiczid | weighted_mean_cat | weighted_mean_trend | n_cat_spp | n_tr_spp
loiczid | mean_cat_score | mean_pop_trend_score | n_cat_species | n_trend_species | source |
---|---|---|---|---|---|
8205 | 0 | NaN | 1 | 0 | aquamaps |
8206 | 0 | NaN | 1 | 0 | aquamaps |
8207 | 0 | NaN | 1 | 0 | aquamaps |
8209 | 0 | NaN | 1 | 0 | aquamaps |
8210 | 0 | NaN | 1 | 0 | aquamaps |
8211 | 0 | NaN | 1 | 0 | aquamaps |
loiczid | mean_cat_score | mean_pop_trend_score | n_cat_species | n_trend_species | source |
---|---|---|---|---|---|
1 | 0.4 | -0.5 | 1 | 1 | iucn |
2 | 0.4 | -0.5 | 1 | 1 | iucn |
3 | 0.4 | -0.5 | 1 | 1 | iucn |
4 | 0.4 | -0.5 | 1 | 1 | iucn |
5 | 0.4 | -0.5 | 1 | 1 | iucn |
6 | 0.4 | -0.5 | 1 | 1 | iucn |
Finally, we combine the two cell-by-cell summaries, using a species-count weighting to determine the mean category and trend per cell. Cells are then aggregated to regions to calculate an area-weighted regional mean category, trend, and status.
These are then saved to status and trend layer outputs for global (shown in table) as well as 3 nautical mile, Antarctic, and High Seas regions.
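The species-count weighting used to combine the two summaries can be illustrated with a short sketch. The numbers are made up and the code is not taken from `process_means_per_cell()`; it only shows the weighted-mean arithmetic for the category score.

```r
# Hypothetical per-cell summaries from each source
am_sum <- data.frame(loiczid = c(1, 2),
                     mean_cat_score = c(0.0, 0.2),
                     n_cat_species  = c(3, 1))
iucn_sum <- data.frame(loiczid = c(1, 3),
                       mean_cat_score = c(0.4, 0.6),
                       n_cat_species  = c(1, 2))

# Stack the summaries and weight each mean by its species count
both <- rbind(am_sum, iucn_sum)
both$wt_score <- both$mean_cat_score * both$n_cat_species

# Sum weighted scores and species counts per cell, then divide
sums <- aggregate(cbind(wt_score, n_cat_species) ~ loiczid, data = both, FUN = sum)
sums$weighted_mean_cat <- sums$wt_score / sums$n_cat_species
```

Cell 1 appears in both sources, so its combined mean is pulled toward the source that contributes more species.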
The script `spp_ico/v2016/layer_prep_spp_global.R` performs these tasks.
source(file.path(dir_git, scenario, 'layer_prep_spp_global.R'))
These analyses are repeated for additional scenarios: 3 nautical mile coastal buffer (for resilience calculations), High Seas, and Antarctic.
source(file.path(dir_git, scenario, 'layer_prep_spp_3nm.R'))
source(file.path(dir_git, scenario, 'layer_prep_spp_hs_aq.R'))
spp_all_nobirds <- read_csv(file.path(dir_anx, scenario, 'int/spp_all_cleaned.csv')) %>%
filter(!(spp_group == 'BOTW' & is.na(am_sid))) %>% ### remove any BOTW with no AquaMaps map
mutate(spatial_source = ifelse(spp_group == 'BOTW', 'am', spatial_source))
am_cells_spp_sum_nobirds <- process_am_summary_per_cell(spp_all_nobirds, fn_tag = 'nobirds', prob_filter = 0, reload = FALSE) %>%
read_csv(col_types = 'dddddc')
### NOTE: keyed data.table works way faster than the old inner_join or merge.
### loiczid | mean_cat_score | mean_trend_score | n_cat_species | n_trend_species
### AM does not include subspecies or subpops: every am_sid corresponds to exactly one sciname.
iucn_cells_spp_sum_nobirds <- process_iucn_summary_per_cell(spp_all_nobirds, fn_tag = 'nobirds', reload = FALSE) %>%
read_csv(col_types = 'dddddc')
### loiczid | mean_cat_score | mean_trend_score | n_cat_species | n_trend_species
### IUCN includes subpops - one sciname corresponds to multiple iucn_sid values.
sum_by_loiczid_file_nobirds <- process_means_per_cell(am_cells_spp_sum_nobirds, iucn_cells_spp_sum_nobirds, fn_tag = 'nobirds')
### This returns location of dataframe with variables:
### loiczid | weighted_mean_cat | weighted_mean_trend | n_cat_spp | n_tr_spp
source(file.path(dir_git, scenario, 'layer_prep_spp_global_nobirds.R'))
library(ggplot2)
nobirds_df <- read_csv(file.path(dir_git, scenario, 'output/spp_status_global.csv')) %>%
rename(status = score) %>%
left_join(read_csv(file.path(dir_git, scenario, 'output/spp_status_global_nobirds.csv')) %>%
rename(status_nobirds = score),
by = 'rgn_id') %>%
left_join(read_csv(file.path(dir_git, 'v2015', 'data/spp_status_global.csv')) %>%
rename(status_2015 = score),
by = 'rgn_id')
scatter_nobirds <- ggplot(nobirds_df, aes(x = status_nobirds, y = status)) +
geom_point(alpha = .5) +
geom_point(aes(x = status_nobirds, y = status_2015), color = 'blue', alpha = .5) +
geom_abline(color = 'red') +
scale_x_continuous(limits = c(.5, 1)) +
scale_y_continuous(limits = c(.5, 1)) +
labs(x = 'Status: v2016 excluding Bird Life data',
y = 'Status: v2015(blue), v2016 all (black)',
title = 'SPP Status: excluding birds')
ggsave(file.path(dir_git, scenario, 'Figs/scatterplot_spp_status_global_excl_bli.png'),
plot = scatter_nobirds)
The `calc_rgn_spp()` function takes in lookup tables of species by cell (for both IUCN and AM), a cell-to-region lookup, and a species info lookup. From these it generates a list of which species occur in which regions, including basic species information.
iucn_sid | am_sid | sciname | pop_cat | pop_trend | spatial_source | rgn_id | rgn_name | n_cells | presence | n_spp_rgn |
---|---|---|---|---|---|---|---|---|---|---|
NA | Fis-156671 | Abalistes filamentosus | NA | NA | am | 210 | Japan | 194 | NA | 11503 |
NA | Fis-156671 | Abalistes filamentosus | NA | NA | am | 20 | South Korea | 35 | NA | 5861 |
NA | Fis-156671 | Abalistes filamentosus | NA | NA | am | 255 | DISPUTED | 145 | NA | 18267 |
NA | Fis-156671 | Abalistes filamentosus | NA | NA | am | 209 | China | 151 | NA | 10906 |
NA | Fis-156671 | Abalistes filamentosus | NA | NA | am | 14 | Taiwan | 67 | NA | 10938 |
NA | Fis-156671 | Abalistes filamentosus | NA | NA | am | 207 | Vietnam | 113 | NA | 10348 |
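The core of a species-by-region listing like the one above is a join of the species-cell lookup to the cell-region lookup, followed by a count of cells. The sketch below uses invented lookups and is not the body of `calc_rgn_spp()`.

```r
# Hypothetical species-by-cell and cell-to-region lookups
spp_cells <- data.frame(am_sid  = c('Fis-1', 'Fis-1', 'Fis-1', 'Fis-2'),
                        loiczid = c(10, 11, 12, 12),
                        stringsAsFactors = FALSE)
cell_rgn  <- data.frame(loiczid = c(10, 11, 12),
                        rgn_id  = c(210, 210, 20))

# Attach a region to every species-cell record
spp_rgn <- merge(spp_cells, cell_rgn, by = 'loiczid')

# Count the cells each species occupies in each region
n_cells <- aggregate(loiczid ~ am_sid + rgn_id, data = spp_rgn, FUN = length)
names(n_cells)[names(n_cells) == 'loiczid'] <- 'n_cells'
```

Each row of `n_cells` then corresponds to one species in one region, to which species information can be joined by `am_sid`.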
The following plots compare the status scores generated for the 2015 assessment to those generated for 2016.

`sciname` field for joining. I’ve added a step to add `iucn_sid` values according to `sciname`, to allow the species-cell lookup to work with the v2016 scripts. Much of the variation in the plot may be due to differences in name-matching.

The third examines one possible reason for the large shift in scores between the 2015 scores (d2014 data) and 2016 scores (d2015 data): the addition of birds to the IUCN spatial information. Bird species have a fairly low area-weighted mean risk category but cover a very large number of cells, so these lower-risk species have a very large impact on the final score compared to other species groups.

Finally, this plot compares the 2016 scores as calculated with and without BirdLife International data. AquaMaps bird species were left in. Blue points compare 2015 scores (which did not include BirdLife International data) to the 2016 scores excluding BirdLife International data.