How can we efficiently offer support and resources to train students who are starting to study SDM approaches but need a lot of guidance to get going?

By: Courtney Burk, Alana Westwood, and Robert Buchkowski

So, you want to learn how to create and use SDMs? First, what is an SDM? A species distribution model (SDM) is a now-common method in biogeography, ecology, and conservation for predicting species distribution, movement, and/or abundance across a landscape. You’ve probably heard several terms that fall under the umbrella of species distribution modelling, including habitat models, climate envelopes, ecological niche models (ENMs), resource selection functions (RSFs), bioclimatic models, connectivity models, and range maps. SDMs use algorithms to fit mathematical functions that describe how species presence or absence relates to each environmental variable across environmental space. They then use those fitted relationships to generate predictive maps of species distribution in geographical space, also known as predictive forecasting (Figure 1). In some cases, SDMs are instead used to explain or understand how environmental conditions influence the occurrence or abundance of a species.

Figure provided by Zurell, 2020.
Figure 1: The main modelling cycle in species distribution modelling (Zurell, 2020).
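
To make the fit-then-predict cycle concrete, here is a minimal sketch in Python. It is not taken from any SDM package. It assumes a single standardized temperature variable, synthetic presence-absence data, and a plain logistic regression fitted by gradient descent. All names (`fit_logistic`, `temperature_grid`, `prediction_map`) are illustrative.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(x, y, lr=0.1, epochs=2000):
    """Fit P(presence) = sigmoid(b0 + b1 * x) by gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = sigmoid(b0 + b1 * xi) - yi
            g0 += err
            g1 += err * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# Synthetic survey data (an assumption for illustration):
# the species is more likely to be present at warmer sites.
random.seed(42)
temps = [random.uniform(-2.0, 2.0) for _ in range(300)]  # standardized temperature
presence = [1 if random.random() < sigmoid(0.5 + 2.0 * t) else 0 for t in temps]

# Step 1: fit the species-environment relationship in environmental space.
b0, b1 = fit_logistic(temps, presence)

# Step 2: project the fitted relationship onto geographical space.
# Here the "map" is a tiny 2 x 3 grid of standardized temperatures.
temperature_grid = [[-1.5, 0.0, 1.5],
                    [-1.0, 0.5, 2.0]]
prediction_map = [[sigmoid(b0 + b1 * t) for t in row] for row in temperature_grid]
```

Real SDMs use many environmental variables and much richer model families, but the two steps are the same: estimate the relationship from survey data, then apply it to every cell of an environmental map.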

There are two main data components to an SDM: (1) species data and (2) environmental data. Species data can comprise presence-only data (just where the species has been observed), presence-absence data (known presence and absence locations based on surveys), abundance-absence data (which includes counts), or occupancy data (repeated site surveys). Regardless of the data type, you need the coordinates of the locations where the species of interest occurred (or didn’t occur). These can come from archived species location data, such as museum records or citizen science initiatives (e.g., eBird, iNaturalist, FrogWatch), or from a new dataset you collect in the field. Environmental data describe the biotic and abiotic conditions where a species is present or absent. There are dozens of environmental variables you could incorporate into your SDM; which ones you choose depends largely on the scale of your model and the completeness of the available data.
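
As a rough illustration of how the two components come together, the sketch below joins survey records to an environmental layer by coordinate. Everything in it is hypothetical: the `SurveyRecord` class, the 3 x 3 `mean_temp` grid, its origin, and its cell size are made up for the example, and real projects would normally use a GIS or raster library rather than hand-written indexing.

```python
from dataclasses import dataclass


@dataclass
class SurveyRecord:
    lon: float
    lat: float
    present: int  # 1 = detected, 0 = surveyed but not detected


# Hypothetical environmental raster: rows run north to south, columns west to east.
GRID_ORIGIN_LON = -80.0  # western edge of the grid
GRID_ORIGIN_LAT = 50.0   # northern edge of the grid
CELL_SIZE = 1.0          # cell width and height in degrees
mean_temp = [
    [2.1, 2.8, 3.5],
    [4.0, 4.6, 5.2],
    [6.3, 6.9, 7.4],
]


def cell_value(grid, lon, lat):
    """Return the grid value at a coordinate, or None if it falls outside the grid."""
    col = int((lon - GRID_ORIGIN_LON) // CELL_SIZE)
    row = int((GRID_ORIGIN_LAT - lat) // CELL_SIZE)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


records = [
    SurveyRecord(lon=-79.5, lat=49.5, present=0),
    SurveyRecord(lon=-78.2, lat=48.4, present=1),
    SurveyRecord(lon=-77.1, lat=47.3, present=1),
    SurveyRecord(lon=-70.0, lat=49.5, present=1),  # falls outside the raster
]

# One row per usable record: the response (present) plus its environmental value.
model_table = []
for r in records:
    value = cell_value(mean_temp, r.lon, r.lat)
    if value is not None:
        model_table.append({"present": r.present, "mean_temp": value})
```

The resulting `model_table` is the kind of input most modelling methods expect: one row per location, with the species observation alongside the environmental conditions at that location. Records that fall outside the environmental layer are dropped here, which is one simple way the completeness of your data limits what you can model.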

But how much data do you need? Where do you get these data? What do you do with them? Unfortunately, this page is not the place to learn all the ins and outs of SDMs.

As you explore SDMs, you will find hundreds of resources to help you learn to use this powerful tool. To get you started, we have put together a shortlist of platforms and base programs to choose from (Table 1). Each has its own benefits, caveats, and data requirements. Table 1 is a non-exhaustive list that describes some commonly used platforms, the species data they require, and links to further tutorials or readings to help you on your way!

Table 1: Modelling methods, type of species data required, and suggested further reading.

| Method | Model Name/Software | Type | Species Data Type | Links for Further Learning |
| --- | --- | --- | --- | --- |
| Climatic envelope | BIOCLIM | Probability | Presence-only | R tutorial; Booth, 2019 |
| Gower metric | DOMAIN | Probability | Presence-only | Carpenter et al., 1993; R tutorial |
| Ecological niche factor analysis (ENFA) | BIOMAPPER | Suitability | Presence/background | BioMapper tutorial |
| Maximum entropy | MAXENT | Suitability | Presence/background | YouTube tutorial; Maxent step by step |
| Genetic algorithm | GARP | Suitability | Presence/pseudoabsence | Haase et al., 2021; Tutorial |
| Regression: generalized linear model (GLM) and generalized additive model (GAM) | GRASP | Probability | Presence/absence | Tutorial; Guisan et al., 2002 |
| Artificial neural network (ANN) | SPECIES, MATLAB | Probability | Presence/absence | Introduction; YouTube tutorial |
| Classification and regression trees (CART) | BIOMOD | Probability | Presence/absence | Tutorial |
| Multivariate adaptive regression splines (MARS) | MARS (with R) | Probability | Presence/absence | Introduction; Elith and Leathwick, 2007; R tutorial |
| Favourability function (FF) | FF (with R) | Probability | Presence/absence | Introduction; R tutorial |
| Random forest (RF) | RF (with R) | Probability | Presence/background | Valavi et al., 2021 |
| Boosted regression trees (BRT) | BRT (with R) | Probability | Presence/absence | R tutorial; Elith et al., 2008 |
| Bayesian species distribution models | spOccupancy (with R) | Probability | Presence/absence | Single-Species, Multi-Species, and Integrated Spatial Occupancy Models |
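
To show what "presence-only" means in practice, here is a small sketch of a percentile-based climatic envelope in Python. It is loosely inspired by the envelope idea behind the first row of Table 1 and is not the BIOCLIM software or its exact algorithm. The function names, the tie handling, and the toy climate values are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(sorted_values, x):
    """Fraction of presence values at or below x, counting ties as half."""
    lo = bisect_left(sorted_values, x)
    hi = bisect_right(sorted_values, x)
    return (lo + 0.5 * (hi - lo)) / len(sorted_values)


def envelope_score(presence_env, site_env):
    """Score a site from 0 (outside the envelope) to 1 (at the median of every variable)."""
    scores = []
    for name, values in presence_env.items():
        ordered = sorted(values)
        x = site_env[name]
        # Outside the observed range of any variable means outside the envelope.
        if x < ordered[0] or x > ordered[-1]:
            return 0.0
        p = percentile_rank(ordered, x)
        # Values near the median score high; values in either tail score low.
        scores.append(2.0 * min(p, 1.0 - p))
    # The most limiting variable sets the overall score.
    return min(scores)


# Environmental values at known presence locations only (toy numbers).
presence_env = {
    "mean_temp": [4.0, 5.0, 6.0, 7.0, 8.0],
    "annual_precip": [600, 700, 800, 900, 1000],
}

core_site = envelope_score(presence_env, {"mean_temp": 6.0, "annual_precip": 800})
edge_site = envelope_score(presence_env, {"mean_temp": 4.0, "annual_precip": 800})
outside_site = envelope_score(presence_env, {"mean_temp": 10.0, "annual_precip": 800})
```

Notice that no absence records appear anywhere in the calculation. Methods listed as presence/absence in Table 1 need surveyed locations where the species was not found, which is often the deciding factor when you choose a method.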

Choosing a model method.
Video: Introduction to SDM.