Spectroscopy Letters
An International Journal for Rapid Communication
Volume 55, 2022 - Issue 1
Research Articles

Improving classification performance of extreme gradient boosting on small-sized dataset to classify Turkish and Italian wines along with elemental profiling by inductively coupled plasma-mass spectrometry

Pages 1-12 | Received 24 Mar 2021, Accepted 15 Nov 2021, Published online: 03 Dec 2021
 

Abstract

In this study, the classification performance of the extreme gradient boosting algorithm on a small dataset was improved by training it on a synthetic dataset generated with kernel density estimation, with the aim of classifying wine samples. The concentrations of 29 elements in wine samples produced in Turkey (domestic) and Italy (imported) were determined by inductively coupled plasma-mass spectrometry, and the results were used to build the dataset. Classification of the wine samples was first assessed with extreme gradient boosting, which is known to overfit small datasets and therefore gave poor classification performance. To improve the classification performance, a synthetic dataset was created and the algorithm was trained on the synthetic dataset instead of the original dataset. With the proposed method, the accuracy of the model improved from 76.7% to 81.7%. The precision values for Turkish and Italian wines increased from 78.4% to 84.1% and from 70.9% to 79.4%, respectively. The variable importance determined by the extreme gradient boosting algorithm showed that beryllium and cesium were significantly more important than the other elements; tin, phosphorus, cobalt, lead, calcium, copper, zinc, and aluminum completed the top 10 elements for classifying Turkish and Italian wine samples.
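
The data-augmentation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the kernel, the bandwidth, or whether the density was estimated separately for each class, so the Gaussian kernel, the single shared bandwidth, the class-wise sampling, and all function names below are assumptions made for illustration, not the authors' implementation. The toy values are placeholders, not measurements from the study.

```python
import random

def sample_gaussian_kde(rows, n_samples, bandwidth, rng):
    """Draw n_samples points from a Gaussian KDE fitted to rows.

    Sampling from a Gaussian KDE is equivalent to picking an observed
    row at random and adding independent N(0, bandwidth^2) noise to
    each of its features.
    """
    synthetic = []
    for _ in range(n_samples):
        base = rng.choice(rows)
        synthetic.append([rng.gauss(x, bandwidth) for x in base])
    return synthetic

def make_synthetic_dataset(data_by_class, n_per_class, bandwidth, seed=0):
    """Build a labelled synthetic dataset, sampling each class separately.

    Assumption: features are on comparable scales (e.g. standardized),
    so that one bandwidth is reasonable for all of them.
    """
    rng = random.Random(seed)
    features, labels = [], []
    for label, rows in data_by_class.items():
        for row in sample_gaussian_kde(rows, n_per_class, bandwidth, rng):
            features.append(row)
            labels.append(label)
    return features, labels

# Placeholder two-feature data, NOT values from the study.
toy = {
    "Turkey": [[0.10, 1.2], [0.12, 1.0], [0.09, 1.4]],
    "Italy": [[0.30, 0.6], [0.28, 0.7], [0.33, 0.5]],
}
X_syn, y_syn = make_synthetic_dataset(toy, n_per_class=100, bandwidth=0.05)
```

In a workflow like the one the abstract outlines, a classifier would then be trained on `X_syn` and `y_syn` and evaluated on the original measurements, so that the reported accuracy and precision reflect real samples rather than synthetic ones.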

Acknowledgments

The authors thank Cagri Latifoglu (TED University) for valuable suggestions on the manuscript.

Disclosure statement

The authors declare that they have no conflict of interest.
