Authors: Lei Song*, Clark University
Topics: Spatial Analysis & Modeling, Biogeography, Physical Geography
Keywords: Species distribution modeling (SDM), Isolation Forest, Machine learning, Conservation
Session Type: Paper
This research assesses the application of Isolation Forest, an anomaly-detection machine learning algorithm, to species distribution modeling (SDM). Most presence-only SDM algorithms model normal instances and typically rely on distance or density measures, which incur high computational costs on large datasets. This limits the use of recent high-resolution spatial data and of species occurrence data affected by imperfect detection and sampling bias. To alleviate these problems, I apply and assess the Isolation Forest algorithm, which tolerates observation defects in the training data and is computationally cheaper. I describe experiments with Isolation Forest on 20 hyperdominant, 20 dominant, and 10 rare Amazon tree species. In particular, I test the robustness of Isolation Forest to presence-only data by comparing modeled results and evaluation measures based on presence-only data with those based on presence-absence data. I also assess the efficiency of Isolation Forest with respect to the size of the training data, the complexity of parameter tuning, and the training time. I further apply permutation feature importance to evaluate variable importance and examine the interpretability of the models. These results provide a frame of reference for conservation practitioners and decision makers who want to model species distributions with Isolation Forest and to better interpret the results.
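To make the idea concrete, the following is a minimal pure-Python sketch of an isolation forest fitted to presence-only records. It is an illustration under stated assumptions, not the implementation used in this research: the two environmental variables (temperature and precipitation), the synthetic occurrence records, the parameter values, and the mapping "suitability = 1 - anomaly score" are all hypothetical choices made for the example.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Split on a random feature at a random threshold until points are isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    values = [r[feature] for r in rows]
    lo, hi = min(values), max(values)
    if lo == hi:  # simplification: stop if the chosen feature is constant
        return {"size": len(rows)}
    threshold = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < threshold]
    right = [r for r in rows if r[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x):
    """Number of splits needed to reach a leaf, plus an adjustment for unsplit points."""
    depth = 0
    while "feature" in tree:
        side = "left" if x[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[side]
        depth += 1
    return depth + c_factor(tree["size"])


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Grow each tree on a random subsample, which keeps training cost low."""
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng) for _ in range(n_trees)]
    return trees, psi


def anomaly_score(forest, x):
    """Score in (0, 1); higher values mean the point is easier to isolate."""
    trees, psi = forest
    mean_path = sum(path_length(t, x) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(psi))


def suitability(forest, x):
    """Hypothetical mapping: environments that look 'normal' are treated as suitable."""
    return 1.0 - anomaly_score(forest, x)


# Synthetic presence-only records: (mean temperature in deg C, annual precipitation in mm).
data_rng = random.Random(42)
presences = [(data_rng.gauss(25.0, 1.5), data_rng.gauss(2000.0, 200.0)) for _ in range(300)]

forest = fit_forest(presences)
core_suitability = suitability(forest, (25.0, 2000.0))  # typical conditions
edge_suitability = suitability(forest, (40.0, 500.0))   # far outside the sampled range
print(core_suitability, edge_suitability)
```

Because each tree is grown on a small subsample and uses only random axis-aligned splits, no pairwise distances or density estimates are computed, which is the source of the low computational cost described above. In this sketch, a site with conditions typical of the presence records should receive a higher suitability value than a site far outside their range.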