# install.packages('duckplyr')
# install.packages('geodata')
library(tidyverse) # for data manipulation
library(duckplyr) # for fast data processing
library(phytools) # for phylogenetic regression
library(lme4) # for (generalized) linear mixed models
library(rnaturalearth) # for world map data
library(sf) # for vector spatial data
library(raster) # for raster (gridded) data
Mapping Species Richness: Integrating Occurrence Data, Climatic Variables, and Phylogenetic Insights in a Global Grid Analysis - A minimal example
This notebook is a minimal tutorial for a spatial analysis pipeline that maps species occurrence points to WorldClim variables and overlays them on a global grid to quantify species richness per cell. It also includes examples of using GLMs, GLMMs, and phylogenetic regression to examine how climate variation predicts an ordinal measure of ant polymorphism.
Ant polymorphism, Phylogenetic regression, Global biodiversity, Ecological complexity, Quantitative ecology
1 Getting started
Before you start:
Make sure you have the latest version of R installed.
Open R in any IDE of your choosing (RStudio, VS Code, Jupyter, etc.)
1.1 Clone the repository
- Clone the GitHub repository
Go to the GitHub repository page: https://github.com/lessardlab/Mapping-Ant-Species-Richness
Copy the HTTPS link of the repository: https://github.com/lessardlab/Mapping-Ant-Species-Richness.git
Open a terminal on your computer
Change to the directory where you want to clone the repository
Type
git clone https://github.com/lessardlab/Mapping-Ant-Species-Richness.git
Hit enter
Done! You have cloned the repository to your computer.
Make sure Quarto is installed if you go this route. RStudio ships with Quarto preinstalled; for VS Code and other editors, you need to download the Quarto extension.
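As a quick sanity check before continuing, the sketch below looks for the `git` and `quarto` command-line tools on the system PATH from within R. The helper name `tool_on_path` is hypothetical, and the check only relies on base R's `Sys.which()`. A "not found" result does not necessarily mean the tool is missing, since some IDEs bundle their own copy without adding it to the PATH.

```r
# Hypothetical helper: TRUE if a command-line tool is found on the PATH.
tool_on_path <- function(tool) {
  unname(nzchar(Sys.which(tool)))
}

# Report the status of the two tools this tutorial relies on.
for (tool in c("git", "quarto")) {
  status <- if (tool_on_path(tool)) "found" else "not found on PATH"
  message(tool, ": ", status)
}
```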
2 Dependencies
To replicate this tutorial, make sure you have the following packages installed. To install a package, use install.packages('package_name')
(Note: you only need to do this once.)
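Because installation is a one-time step, it can help to check which packages are still missing before installing anything. The following is a minimal sketch using base R only; the package list is taken from the library() calls in this notebook, and the helper name `find_missing` is hypothetical. The install call is left commented out so that nothing is installed by accident.

```r
# Packages loaded with library() in this notebook.
required_pkgs <- c("tidyverse", "duckplyr", "phytools", "lme4",
                   "rnaturalearth", "sf", "raster")

# Hypothetical helper: return the packages that are not yet installed.
find_missing <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

missing_pkgs <- find_missing(required_pkgs)

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install only what is missing:
  # install.packages(missing_pkgs)
} else {
  message("All required packages are already installed.")
}
```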
3 Sourcing data
For this tutorial we will use the ant polymorphism database published as part of the article: LaRichelliere et al., 2023. Warm regions of the world are hotspots of superorganism complexity.
The dataset is open and public. You can download your own copy of the data by cloning the paper's GitHub repository: https://github.com/lessardlab/GlobalPolyMorp
# Source data on global ant polymorphism.
my_ant_data <- duckplyr_df_from_csv("Lat-Long_Data_GABI.csv")
summary(my_ant_data)
duckplyr: materializing
gabi_acc_number valid_species_name country dec_lat
Length:743211 Length:743211 Length:743211 Min. :-55.083
Class :character Class :character Class :character 1st Qu.: -8.367
Mode :character Mode :character Mode :character Median : 14.481
Mean : 14.816
3rd Qu.: 37.845
Max. : 88.416
dec_long elevation bentity2_name
Min. :-180.00 Min. : -80.0 Length:743211
1st Qu.: -85.02 1st Qu.: 130.0 Class :character
Median : -59.64 Median : 500.0 Mode :character
Mean : -19.46 Mean : 669.8
3rd Qu.: 34.87 3rd Qu.:1090.0
Max. : 179.97 Max. :5300.0
NA's :151 NA's :340610
head(my_ant_data)
duckplyr: materializing
# A tibble: 6 × 7
gabi_acc_number valid_species_name country dec_lat dec_long elevation
<chr> <chr> <chr> <dbl> <dbl> <dbl>
1 GABI_01146836 Myrmecocystus.semirufus USA 36.5 -117. -80
2 GABI_01026649 Myrmecocystus.semirufus USA 36.5 -117. -80
3 GABI_01033314 Myrmecocystus.semirufus USA 36.5 -117. -80
4 GABI_01174725 Myrmecocystus.semirufus USA 36.5 -117. -80
5 GABI_01130976 Pogonomyrmex.californicus USA 36.5 -117. -80
6 GABI_01032969 Pogonomyrmex.californicus USA 36.5 -117. -80
# ℹ 1 more variable: bentity2_name <chr>
Let’s start by tidying the dataset. For instance, we can split valid_species_name into the genus, the species epithet, and a full species name with the dot replaced by a space.
# Split valid_species_name (e.g. "Myrmecocystus.semirufus") into its parts
my_ant_data <-
  my_ant_data |>
  mutate(Genus = str_extract(valid_species_name, "^([^.]+)"),
         species = str_extract(valid_species_name, "([^.]+)$"),
         species_name_no_dot = str_replace(valid_species_name, "\\.", " "))

# Show the new columns in an interactive table
my_ant_data |>
  duckplyr::select(Genus, species, species_name_no_dot) |>
  DT::datatable()
duckplyr: materializing
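To see what the three regular expressions used above do, here is a small sketch on a toy vector. It swaps in base R's `regmatches()`, `regexpr()`, and `sub()` for the stringr functions so that it runs without extra packages; the patterns themselves are the same. The first two names come from the head() output above, while the three-part name is a made-up example added only to show the edge case.

```r
# Two names from the dataset preview plus one hypothetical three-part name.
toy_names <- c("Myrmecocystus.semirufus",
               "Pogonomyrmex.californicus",
               "Genus.species.subspecies")  # hypothetical edge case

# Everything before the first dot.
genus <- regmatches(toy_names, regexpr("^[^.]+", toy_names))

# Everything after the last dot.
species <- regmatches(toy_names, regexpr("[^.]+$", toy_names))

# Replace only the first dot with a space (like str_replace, not str_replace_all).
species_name_no_dot <- sub("\\.", " ", toy_names)

data.frame(genus, species, species_name_no_dot)
```

Note that for a name with more than one dot, the species pattern captures only the last part and only the first dot is replaced, so it is worth checking whether such names occur in valid_species_name before relying on these columns.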