Figures & data
Table 1. Detailed database information. Missing analysis in regression. Feature (Feat.), Integer (Int.) and Continuous (Cont.).
Figure 3. On the left: original histogram of the feature values before missing data simulation and histogram after missing data simulation. On the right: the probability function used to induce biased missing.
![Figure 3. On the left: original histogram of the feature values before missing data simulation and histogram after missing data simulation. On the right: the probability function used to induce biased missing.](/cms/asset/55ed82ef-b067-43c4-832d-509d1b66830e/uaai_a_2032925_f0003_b.gif)
Table 2. Percentage difference means for the regression datasets (as referenced in ). 5% and 95% confidence intervals in parentheses.
Figure 4. As the number of features with missing data increases, Discard rows becomes a worse choice than the imputation methods (left). As the proportion of features with missing data (%A) increases, the outcome in data regression gets worse (right).
![Figure 4. As the number of features with missing data increases, Discard rows becomes a worse choice than the imputation methods (left). As the proportion of features with missing data (%A) increases, the outcome in data regression gets worse (right).](/cms/asset/a750d2fa-7d97-4927-a40c-5cdeaeca1345/uaai_a_2032925_f0004_b.gif)
Figure 6. Winner methods separated by %A, %P, databases and regressors (AdaBoost, DecisionTree, KNeighbors, MLP, RandomForest, SVR).
![Figure 6. Winner methods separated by %A, %P, databases and regressors (AdaBoost, DecisionTree, KNeighbors, MLP, RandomForest, SVR).](/cms/asset/fe12f27c-6c8a-4460-8b35-6644cb76a53b/uaai_a_2032925_f0006_b.gif)
Table 3. Percentage difference means for the regression databases. 5% and 95% confidence intervals in parentheses.