Figures & data
Table 1. Metrics computed for each classification dataset.
Table 2. Mathematical notation and definitions.
Figure 3. Biplots of the first two principal components of the evaluation metrics, one per algorithm. The red vectors are the projections of the original metrics onto these components, and each label marks a dataset in the new coordinates.
![Figure 3](/cms/asset/a4fed99a-2e1b-4f24-bcef-ef936e40f18b/uaai_a_1430993_f0003_oc.jpg)
Figure 4. Dataset ranking, shown as intervals spanning the minimum, mean, and maximum kappa scores across the five algorithms used.
![Figure 4](/cms/asset/4db5f480-4fa2-4abf-a6f9-ab3cade7cf3c/uaai_a_1430993_f0004_b.gif)
Figure 5. Density estimates of accuracy and kappa across the datasets. For kappa, the red area represents the proportion of datasets with negative kappa, that is, the cases in which the model fails to outperform a trivial educated guess.
![Figure 5](/cms/asset/f8c45ff2-2e86-410d-abc7-26f7f55de030/uaai_a_1430993_f0005_oc.jpg)
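To make the negative-kappa region of Figure 5 concrete, the following minimal sketch computes accuracy and Cohen's kappa from a confusion matrix. It assumes the standard definition of kappa, (observed agreement − chance agreement) / (1 − chance agreement); the function name and the toy confusion matrix are hypothetical and are not taken from the article.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    `confusion[i][j]` counts samples of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of samples on the diagonal (plain accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / total

    # Chance agreement: agreement expected if predictions were drawn
    # independently of the truth, with the same marginal frequencies.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2

    if expected == 1.0:
        # Degenerate case (a single class everywhere): kappa is undefined,
        # so this sketch reports 0.0 by convention.
        return observed, 0.0
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical imbalanced dataset: 90 majority-class and 10 minority-class samples.
toy_confusion = [[85, 5],
                 [10, 0]]
accuracy, kappa = accuracy_and_kappa(toy_confusion)
print(f"accuracy={accuracy:.2f}, kappa={kappa:.3f}")
```

In this toy case the accuracy is high while kappa is slightly below zero, which is the kind of dataset that would fall in the red area of the kappa density.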
Figure 6. Number of highest/lowest kappa scores for each classifier. The left panel shows the number of times (normalized to one) that each classifier was the highest-scoring (green) and lowest-scoring (red) model. The right panel shows the corresponding highest/lowest ratios.
![Figure 6](/cms/asset/6a1f2be5-bf71-4a77-bd91-80b9221954fa/uaai_a_1430993_f0006_oc.jpg)
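The quantities summarized in Figure 6 can be sketched as a simple tally. This is only an illustration under stated assumptions: "normalized to one" is taken to mean division by the number of datasets, ties are broken by whichever classifier appears first, and the classifier names, dataset names, and scores below are hypothetical.

```python
from collections import Counter


def best_worst_shares(scores):
    """Tally how often each classifier has the highest and lowest kappa.

    `scores` maps dataset name -> {classifier name: kappa}.
    Returns (shares, ratios), where shares[name] = (best_share, worst_share)
    and ratios[name] = best_count / worst_count.
    """
    best, worst = Counter(), Counter()
    for per_classifier in scores.values():
        # Ties go to the first classifier encountered (an assumption of this sketch).
        best[max(per_classifier, key=per_classifier.get)] += 1
        worst[min(per_classifier, key=per_classifier.get)] += 1

    n_datasets = len(scores)
    names = sorted({name for row in scores.values() for name in row})
    shares = {name: (best[name] / n_datasets, worst[name] / n_datasets)
              for name in names}
    # A classifier that is never the worst has an unbounded ratio.
    ratios = {name: (best[name] / worst[name] if worst[name] else float("inf"))
              for name in names}
    return shares, ratios


# Hypothetical kappa scores for three classifiers on four datasets.
toy_scores = {
    "d1": {"A": 0.9, "B": 0.5, "C": 0.7},
    "d2": {"A": 0.8, "B": 0.6, "C": 0.2},
    "d3": {"A": 0.1, "B": 0.4, "C": 0.3},
    "d4": {"A": 0.6, "B": 0.2, "C": 0.5},
}
shares, ratios = best_worst_shares(toy_scores)
print(shares)
print(ratios)
```

The shares correspond to the green and red bars of the left panel, and the ratios to the right panel.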
Figure 7. (Weighted) distribution of the worst-performing (left panel) and best-performing (right panel) algorithms as a function of the variance among the results of the five algorithms. Datasets with more accuracy/kappa agreement between algorithms lie on the left of the horizontal axis, whereas those with more disagreement lie on the right.
![Figure 7](/cms/asset/b6780373-f1ab-4189-b193-62e2b9ca4aeb/uaai_a_1430993_f0007_oc.jpg)