Biosecurity Alerts
GBIF Data Use Club Seminar 2024
Callum Waite
Erin Roger
Shandiya Balasubramaniam
We acknowledge the Traditional Owners of the lands on which we live and work, and pay our respects to Elders past and present. We recognise the spiritual and cultural significance of land, water, and all that is in the environment to Traditional Owners, and their continuing connection to Country.
• One of several facilities funded by the Australian Government as national research infrastructure
• Over 850 data providers & weekly ingest of datasets
• Citizen science is our fastest growing data source
• 5,000-10,000 years ago
• Dingo (Canis familiaris ssp. dingo)
ALA hosts 2300+ introduced species & 1.9+ million occurrences of pests, weeds, and diseases
Red-eared slider
236 occurrences
Lantana
36,729 occurrences
{koel} facilitates the process of searching for taxa within spatial and temporal constraints, summarising this information in a table, and sending the table as an email
Workflow
1. Ingest & process lists
2. Search for occurrences
3. Filter & download occurrences
4. Compile into a table & send email
From a list…
correct_name | provided_name | synonyms | common_name | state | lga | shape |
---|---|---|---|---|---|---|
Solenopsis invicta | Solenopsis invicta | NA | Red Imported Fire Ant | AUS | NA | NA |
Austropuccinia psidii | Austropuccinia psidii | Uredo rangelii | Myrtle Rust | QLD | NA | NA |
Psittacula krameri | Psittacula krameri | NA | Indian ringneck parrot | VIC, TAS | NA | NA |
Leucanthemum vulgare | Leucanthemum vulgare | Chrysanthemum leucanthemum | Ox-Eye Daisy | NA | Darwin Municipality | NA |
Anoplophora | Anoplophora spp. | NA | Exotic Longhorn Beetles | NA | City of Marion, City of Holdfast Bay | NA |
Rhinella marina | Rhinella marina (Linnaeus, 1758) | Bufo marinus, Rana marina | Cane Toad | NA | NA | QLD_Protected_areas |
Erica lusitanica | Erica lusitanica | NA | Spanish Heath | VIC | Lithgow City Council | NA |
… to an email
Complexities in coding
Taxonomic challenges
Cleaning provided taxon names
clean_names <- function(name) {
  # The native pipe has no `.` placeholder; use `_` with a named argument (R >= 4.2)
  cleaned_name <- name |>
    gsub("\u00A0", " ", x = _) |>     # replace non-breaking spaces (NBSP) with spaces
    gsub("\u200B", " ", x = _) |>     # replace zero-width spaces (ZWSP) with spaces
    gsub("\n", " ", x = _) |>         # replace line breaks with spaces
    gsub(";", ",", x = _) |>          # replace semi-colons with commas
    gsub(" ,", ",", x = _) |>         # remove spaces before commas
    gsub("\\s{2,}", " ", x = _) |>    # collapse multiple spaces
    gsub(",$", "", x = _) |>          # remove trailing commas
    gsub(" +$", "", x = _) |>         # remove trailing spaces
    gsub(",(\\w)", ", \\1", x = _) |> # add spaces between commas and text
    gsub(" sp\\.", "", x = _) |>      # remove the sp. abbreviation
    gsub(" spp\\.", "", x = _) |>     # remove the spp. abbreviation
    stringr::str_squish()             # trim and collapse any remaining whitespace
  return(cleaned_name)
}
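To see what this kind of cleaning does to names like those in the list, here is a minimal base-R sketch. `clean_names_sketch()` is a hypothetical, cut-down stand-in rather than the {koel} function: it applies only a few of the rules and uses plain reassignment so it runs on any R version.

```r
# Hypothetical, cut-down stand-in for the cleaning step (not the {koel} code).
clean_names_sketch <- function(name) {
  x <- gsub("\u00A0", " ", name)   # non-breaking space -> ordinary space
  x <- gsub(";", ",", x)           # semi-colons -> commas
  x <- gsub(" spp?\\.", "", x)     # drop " sp." and " spp." abbreviations
  x <- gsub(",(\\w)", ", \\1", x)  # ensure a space after each comma
  x <- gsub("\\s+", " ", x)        # collapse runs of whitespace
  trimws(x)                        # strip leading and trailing whitespace
}

# Illustrative inputs modelled on the provided_name and synonyms columns
provided <- c("Anoplophora spp.",
              "Bufo marinus;Rana marina",
              "Solenopsis\u00A0invicta ")
cleaned <- clean_names_sketch(provided)
print(cleaned)
```

Ordering matters here: the comma spacing rule runs after semi-colons are converted, so synonym lists separated by either character end up in the same form.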
Taxonomic challenges
Alerting on different ranks
# Rank fields a search term may match; `field` takes one value from `fields` per query
fields <- c("genus", "species", "subspecies", "scientificName")
request_data() |>
  galah_filter(firstLoadedDate >= upload_date_start,
               firstLoadedDate <= upload_date_end,
               eventDate >= event_date_start,
               eventDate <= event_date_end,
               {{field}} == search_terms) |>
  galah_select(scientificName, vernacularName,
               genus, species, subspecies,
               decimalLatitude, decimalLongitude,
               cl22, cl10923, cl1048, cl966, cl21,   # ALA contextual (spatial) layer fields
               firstLoadedDate, basisOfRecord,
               group = c("basic", "media")) |>
  collect() |>
  mutate(match = field,                               # which rank field produced the match
         search_term = .data[[field]],                # the value that matched in that field
         across(-c(images, sounds, videos), as.character),
         across(c(images, sounds, videos), as.list))
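The idea of querying each rank field in turn can be tried without a network connection. The sketch below is an assumption-laden illustration, not the {koel} implementation: `records` is a made-up local data frame standing in for downloaded occurrences, and the de-duplication rule at the end is one possible choice.

```r
# Hypothetical local stand-in for downloaded occurrence records.
records <- data.frame(
  scientificName = c("Anoplophora glabripennis", "Rhinella marina", "Canis familiaris dingo"),
  genus          = c("Anoplophora", "Rhinella", "Canis"),
  species        = c("Anoplophora glabripennis", "Rhinella marina", "Canis familiaris"),
  subspecies     = c(NA, NA, "Canis familiaris dingo"),
  stringsAsFactors = FALSE
)

fields <- c("genus", "species", "subspecies", "scientificName")
search_terms <- c("Anoplophora", "Rhinella marina")

# Check each rank field in turn and record which field produced the match.
matches <- lapply(fields, function(field) {
  hits <- records[records[[field]] %in% search_terms, , drop = FALSE]
  hits$match <- rep(field, nrow(hits))   # safe even when there are no hits
  hits$search_term <- hits[[field]]
  hits
})
matches <- do.call(rbind, matches)

# A record can match on several fields; keep one row per record and search term.
alerts <- matches[!duplicated(matches[c("scientificName", "search_term")]), ]
print(alerts[c("scientificName", "match", "search_term")])
```

A genus-level term such as "Anoplophora" matches only through the `genus` field, while a binomial can match through both `species` and `scientificName`, which is why the duplicate rows are dropped.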
Temporal challenges
request_data() |>
  galah_filter(firstLoadedDate >= upload_date_start,  # when the record was first loaded into the ALA
               firstLoadedDate <= upload_date_end,
               eventDate >= event_date_start,         # when the observation was made
               eventDate <= event_date_end,
               {{field}} == search_terms) |>
  galah_select(scientificName, vernacularName,
               genus, species, subspecies,
               decimalLatitude, decimalLongitude,
               cl22, cl10923, cl1048, cl966, cl21,
               firstLoadedDate, basisOfRecord,
               group = c("basic", "media")) |>
  collect() |>
  mutate(match = field,
         search_term = .data[[field]],
         across(-c(images, sounds, videos), as.character),
         across(c(images, sounds, videos), as.list))
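The two pairs of date bounds do different jobs: one limits records to those uploaded recently, the other to those observed recently. A minimal base-R sketch of that logic follows. `build_date_windows()` is a hypothetical helper, and the 7-day and 28-day window lengths are illustrative assumptions, not values taken from {koel}.

```r
# Hypothetical helper: build the four date bounds used by a filter like the one above.
# The default window lengths are illustrative assumptions.
build_date_windows <- function(today, upload_days = 7, event_days = 28) {
  list(
    upload_date_start = today - upload_days,
    upload_date_end   = today,
    event_date_start  = today - event_days,
    event_date_end    = today
  )
}

w <- build_date_windows(as.Date("2024-06-30"))

# Made-up records: "b" is an old observation uploaded recently,
# "c" is a recent observation that was uploaded weeks ago.
records <- data.frame(
  id              = c("a", "b", "c"),
  eventDate       = as.Date(c("2024-06-20", "2023-01-15", "2024-06-25")),
  firstLoadedDate = as.Date(c("2024-06-28", "2024-06-29", "2024-05-01")),
  stringsAsFactors = FALSE
)

recent <- records[
  records$firstLoadedDate >= w$upload_date_start &
    records$firstLoadedDate <= w$upload_date_end &
    records$eventDate >= w$event_date_start &
    records$eventDate <= w$event_date_end, ]
print(recent$id)
```

Filtering on upload date alone would let historical records through whenever an old dataset is ingested; filtering on event date alone would repeat alerts for records already seen.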
Spatial challenges
correct_name | provided_name | synonyms | common_name | state | lga | shape |
---|---|---|---|---|---|---|
Solenopsis invicta | Solenopsis invicta | NA | Red Imported Fire Ant | AUS | NA | NA |
Austropuccinia psidii | Austropuccinia psidii | Uredo rangelii | Myrtle Rust | QLD | NA | NA |
Psittacula krameri | Psittacula krameri | NA | Indian ringneck parrot | VIC, TAS | NA | NA |
Leucanthemum vulgare | Leucanthemum vulgare | Chrysanthemum leucanthemum | Ox-Eye Daisy | NA | Darwin Municipality | NA |
Anoplophora | Anoplophora spp. | NA | Exotic Longhorn Beetles | NA | City of Marion, City of Holdfast Bay | NA |
Rhinella marina | Rhinella marina (Linnaeus, 1758) | Bufo marinus, Rana marina | Cane Toad | NA | NA | QLD_Protected_areas |
Erica lusitanica | Erica lusitanica | NA | Spanish Heath | VIC | Lithgow City Council | NA |
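Each row of the list can restrict alerts to a state, one or more LGAs, or a custom shape. As a rough illustration of the matching involved, the sketch below checks an occurrence's jurisdiction against a comma-separated list entry. `is_flagged()` is a hypothetical helper, and the treatment of "AUS" as "anywhere in Australia" is inferred from the list above rather than confirmed {koel} behaviour.

```r
# Hypothetical helper: does an occurrence's jurisdiction appear in a list entry?
# `watched` is a comma-separated string such as "VIC, TAS"; NA means "not watched".
is_flagged <- function(observed, watched) {
  if (is.na(watched)) return(FALSE)
  if (watched == "AUS") return(TRUE)   # assumption: "AUS" means nationwide
  if (is.na(observed)) return(FALSE)
  observed %in% trimws(strsplit(watched, ",")[[1]])
}

results <- c(
  is_flagged("TAS", "VIC, TAS"),
  is_flagged("NSW", "VIC, TAS"),
  is_flagged("City of Marion", "City of Marion, City of Holdfast Bay"),
  is_flagged("QLD", NA),
  is_flagged("NSW", "AUS")
)
print(results)
```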
Broader insights
Modular code
occ_list <- species_records |>
  filter(!is.na(decimalLatitude) & !is.na(decimalLongitude)) |>
  identify_aus() |>
  identify_state() |>
  identify_shape(shapes_path = shapes_path) |>
  identify_lga() |>
  filter(state == "AUS" |
           (!is.na(state) & flagged_state) |
           (!is.na(lga) & flagged_lga) |
           (!is.na(shape) & flagged_shape)) |>
  select(-flagged_state, -flagged_lga, -flagged_shape) |>
  exclude_records() |>
  as_tibble()
Broader insights
Unit tests
In 12 months…
“There is no doubt that the biosecurity alert system has improved our statewide surveillance capability [in Queensland]. While we have only been using it for a short period we have already recorded several significant detections. I’ve been promoting the system at every opportunity.”
- Steve Csurhes, Biosecurity Queensland
Slides: shandiya.quarto.pub/datauseclub2024
Code: github.com/shandiya/DataUseClub2024