ABSTRACT
Galaxiid post-larvae constitute five of the six species in New Zealand’s iconic whitebait fishery. The five species must be distinguished at a younger age than is convenient for easy identification. Traditional identification relies on subjective characteristics, such as fin position and the colouration of fresh specimens, but supervised classification methods could be more accurate and less labour-intensive. We compared the accuracy of six methods (multinomial logistic regression, linear discriminant analysis, quadratic discriminant analysis, naive Bayes, decision tree, and random forest) using total length, wet weight, body depth, and the date, latitude, and longitude of capture as predictor variables. Four of the galaxiid species were represented in 17,546 observations; the fifth was too rare to analyse. The best method across all species, determined by 10-fold cross-validated classification accuracy (95.2% overall), was random forest. The most difficult species to classify correctly was giant kōkopu, the rarest species included in the data, with 66.1% accuracy at best. In addition to examining overall accuracy, we show how a cost function can improve classification performance for rare species. This research could improve the efficiency of monitoring the composition of the whitebait fishery, and thus the management of this occasionally overfished group of fish.
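As a rough illustration of how a cost function can shift predictions toward a rare class, the sketch below applies a minimum-expected-cost decision rule to a single observation. It is not the procedure used in this study: the function name, class labels, probabilities, and costs are all hypothetical and chosen only to show the idea.

```python
# Minimal sketch of a minimum-expected-cost decision rule.
# All labels, probabilities, and costs are invented for illustration;
# they are not taken from the study's data or its actual cost function.

def min_expected_cost_class(posteriors, cost):
    """Return the class whose prediction minimises expected misclassification cost.

    posteriors: dict mapping each class to its estimated probability
    cost: dict mapping (true_class, predicted_class) to a cost;
          pairs that are absent (including correct predictions) cost 0
    """
    classes = list(posteriors)

    def expected_cost(predicted):
        # Sum over possible true classes, weighted by their probability.
        return sum(
            posteriors[true] * cost.get((true, predicted), 0.0)
            for true in classes
        )

    return min(classes, key=expected_cost)


# A classifier's (hypothetical) probability estimates for one specimen.
posteriors = {"common_species": 0.7, "rare_species": 0.3}

# Equal costs reproduce the usual "pick the most probable class" rule.
equal_cost = {
    ("common_species", "rare_species"): 1.0,
    ("rare_species", "common_species"): 1.0,
}

# Penalising a missed rare specimen five times more heavily.
rare_weighted = {
    ("common_species", "rare_species"): 1.0,
    ("rare_species", "common_species"): 5.0,
}

print(min_expected_cost_class(posteriors, equal_cost))
print(min_expected_cost_class(posteriors, rare_weighted))
```

With equal costs the rule selects the more probable class. Once missing the rare class is penalised more heavily, the same probabilities lead to the rare class being predicted, which trades some overall accuracy for better detection of rare species.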
Acknowledgements
This work was done to meet the requirements of the MSc degree at the University of Canterbury. We thank the Ministry of Business, Innovation and Employment for funding this research (C01X1002). We particularly thank Mark Yungnickel and Georgia McClintock for the use of morphometric data from their own MSc studies.
Disclosure statement
No potential conflict of interest was reported by the author(s).