ABSTRACT
Predictive models such as species distribution models (SDMs) are increasingly applied to inform conservation efforts and management decisions. Choosing predictors thoughtfully, and hence designing the variable selection process, is a challenge when modeling large communities. Often, variables are chosen for an entire community rather than for individual species, which leaves at least some species with less appropriate predictors and affects model performance and predicted distributions. Using 2 German river catchments as a model system, we investigated (1) the application of boosted regression trees (BRTs) as a variable selection procedure to choose the optimal set of environmental predictors and (2) whether model performance increases when custom-made predictor sets are applied to individual species. From a community of 67 benthic macroinvertebrate species, 10 increased in accuracy with the customized predictor set and 10 decreased in accuracy. Notably, current preference, stream zonation, and functional group differed between these 2 groups of species, which correspond to varied environmental conditions at their known occurrence sites. The species that increased in accuracy showed a preference toward lowland conditions and were far less widespread than the species that decreased in accuracy. We conclude that BRTs are a useful tool for selecting variables for SDMs of large communities. For specialist, rare, or invasive species, determining a species-specific, custom-made predictor set may be preferable because their preferences may not be representative of the entire study area. Our study describes a structured variable selection approach that can be readily implemented to predict species distributions that inform river management decisions.
Acknowledgements
This work was supported by the German Federal Ministry of Education and Research (BMBF) as part of the project “Global Change Effects in River Ecosystems” (GLANCE, no. 01LN1320A). We thank the German federal state environmental agencies for providing biological data, Babak Naimi for useful advice on the SDM package, Melissa Schulte for help with data collection, and Judith Mahnkopf for data formatting.
Disclosure statement
No potential conflict of interest was reported by the authors.
ORCID
Katie Irving http://orcid.org/0000-0002-6582-7979
Sonja C. Jähnig http://orcid.org/0000-0002-6349-9561
Mathias Kuemmerlen http://orcid.org/0000-0003-1362-3701