Original Articles

How reliable are species identifications in biodiversity big data? Evaluating the records of a neotropical fish family in online repositories


Abstract

The growth of free and open online biodiversity databases is of paramount importance for current research in ecology and evolution. However, little attention is paid to using updated taxonomy in these “biodiversity big data” repositories, and the quality of their taxonomic information is often questioned. Here we assess how reliable the current use of nomenclatural classification is in the distributional information available from two biodiversity information networks: GBIF and the Brazilian SpeciesLink. As a case study, we use the records of Auchenipteridae, a Neotropical fish family that has been the subject of recent taxonomic reviews. A data filtering procedure was applied in three steps to identify and quantify inaccuracies in the taxonomic status of the records: assessment of identification accuracy at the family, genus, or species level; current validity of the species name; and assignment of inaccurate species records to different categories of classification quality. Synonyms, nonexistent combinations, and outdated combinations were reassigned to currently valid species. A total of 9148 records of Auchenipteridae fishes were analyzed, of which 4165 were from GBIF and 4983 from SpeciesLink, deriving from 46 and 31 sources, respectively. After correcting all possible records following the taxonomic data filtering steps, 6988 records (76.4% of the original) were adequate for describing species distributions, while 2160 remained inaccurate. Most inaccurate records at the species level were due to outdated nomenclature, resulting in non-valid combinations of species and genus, and to synonymy. Our results reveal a large taxonomic inconsistency among records and, most importantly, show that taxonomic information obtained from repositories should be used with caution.
Many inaccuracy issues may be embedded in the biodiversity databases’ records, which could lead researchers to provide an incomplete or even mistaken perspective of the variations in the natural world.
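
The three filtering steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' actual procedure or code. The lookup tables (`VALID_GENERA`, `VALID_SPECIES`, `REASSIGN`), the function `classify`, the category labels, and all taxon names except the family name are hypothetical placeholders rather than real taxonomy. In practice the lookup tables would presumably come from a curated taxonomic checklist.

```python
FAMILY = "Auchenipteridae"

# Hypothetical reference tables (placeholder names, NOT real taxonomy).
VALID_GENERA = {"Genusa", "Genusb"}
VALID_SPECIES = {"Genusa alpha", "Genusb beta"}
# Non-valid name -> (currently valid name, category of inaccuracy).
REASSIGN = {
    "Genusa beta": ("Genusb beta", "outdated combination"),
    "Genusc gamma": ("Genusa alpha", "synonym"),
}
# Tokens that mark an identification left open at the genus level.
OPEN_TOKENS = {"sp", "sp.", "spp."}


def classify(name):
    """Return (level, corrected_name, category) for one record name."""
    parts = name.split()
    if not parts:
        return ("unknown", None, "empty name")

    # Step 1: at which taxonomic level was the record identified?
    if len(parts) == 1 or parts[1] in OPEN_TOKENS:
        if parts[0] == FAMILY:
            return ("family", None, "family-level only")
        if parts[0] in VALID_GENERA:
            return ("genus", None, "genus-level only")
        return ("unknown", None, "unrecognized name")

    binomial = " ".join(parts[:2])

    # Step 2: is the species name currently valid?
    if binomial in VALID_SPECIES:
        return ("species", binomial, "valid")

    # Step 3: categorize the inaccuracy and reassign where possible.
    if binomial in REASSIGN:
        valid_name, category = REASSIGN[binomial]
        return ("species", valid_name, category)
    return ("species", None, "unresolved")


records = [
    "Genusa alpha",     # already valid
    "Genusa beta",      # outdated combination, can be reassigned
    "Genusc gamma",     # synonym, can be reassigned
    "Genusa sp.",       # identified to genus only
    "Auchenipteridae",  # identified to family only
    "Genusd delta",     # cannot be resolved
]
results = [classify(r) for r in records]

# Records with a valid or corrected species name are treated as usable.
usable = sum(1 for _, corrected, _ in results if corrected is not None)
share = usable / len(records)
```

Applied to the totals reported above, the same ratio is 6988 / 9148, or about 76.4%, with the remaining 9148 − 6988 = 2160 records left as inaccurate.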

Acknowledgements

We wish to thank Juliana Stropp, Guilherme Dutra, and an anonymous reviewer for their help in improving this manuscript. The authors are grateful for the financial support provided by several grants from the Brazilian Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES - Finance Code 001; TMSF - 23038.042984/2008-30), and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq; PDM - 308694/2015-5; LFAM - 305017/2016-0; JH – PVE 314523/2014-6).

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplemental data

Supplemental data for this article can be accessed here: http://dx.doi.org/10.1080/14772000.2020.1730473.

Associate Editor: Kevin Conway

Additional information

Funding

This research was supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES - Finance Code 001) under Grant (TMSF − 23038.042984/2008-30); CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico) under Grant (PDM − 308694/2015-5; LFAM − 305017/2016-0; JH − 314523/2014-6).
