Abstract
Although various methods are available for modelling the habitat preferences of aquatic biota, studies comparing the performance of data-driven habitat models are limited. In this study, we assembled a benthic-macroinvertebrate microhabitat-preference dataset and used it to evaluate the predictive accuracy of regression-based univariate Habitat Suitability Curves (HSC), Boosted Regression Trees (BRT), Random Forests (RF), fuzzy-logic-based models using the weighted average (FLWA), maximum membership (FLMM), mean of maximum (FLM) and centroid (FLC) defuzzification algorithms, and fuzzy rule-based Bayesian inference (FRB). The results show that the BRT model was the most accurate, closely followed by RF, FRB, FLM and FLMM, while the FLC and FLWA algorithms had the lowest accuracy. However, owing to the imbalanced nature of the dataset, the HSC, BRT and RF models, unlike the fuzzy rule-based algorithms, failed to accurately predict habitat suitability in low-scored microhabitats. We conclude that, given balanced datasets, BRT and RF can be used effectively in habitat suitability modelling. For imbalanced datasets, a properly adjusted RF model can be applied; however, when the input dataset is large enough to provide sufficient data-driven IF–THEN rules to train an FRB, FLMM or FLM algorithm, these models will produce the most accurate predictions.
Acknowledgements
This research is part of a PhD project conducted at the Hellenic Centre for Marine Research (HCMR) under the supervision of the National Technical University of Athens (NTUA) and in close collaboration with the Technical University of Munich (TUM). The authors would like to thank the Institute of Marine Biological Resources and Inland Waters of the HCMR for providing the field sampling equipment, and the NTUA and TUM for sharing the hydraulic experience and expertise needed to complete this project.
Disclosure statement
No potential conflict of interest was reported by the authors.