REFERENCES
- Arbuckle, J. L. and W. Wothke. 1999. Amos 4.0 User's Guide. Chicago, IL: Smallwaters.
- Batista, G. and M. C. Monard. 2003. An analysis of four missing data treatment methods for supervised learning. Applied Artificial Intelligence 17: 519–533.
- Becker, R., J. Chambers, and A. Wilks. 1988. The New S Language: A Programming Environment for Data Analysis and Graphics. Wadsworth & Brooks/Cole, Pacific Grove, CA, USA.
- Blake, C. L. and C. J. Merz. 1998. UCI Repository of machine learning databases. University of California, Department of Information and Computer Science, Irvine, CA. (http://www.ics.uci.edu/~mlearn/MLRepository.html).
- Breiman, L. 1996. Bagging predictors. Machine Learning 24(2): 123–140.
- Breiman, L., J. Friedman, R. Olshen, and C. Stone. 1984. Classification and Regression Trees. Wadsworth International Group. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, USA.
- Cestnik, B., I. Kononenko, and I. Bratko. 1987. Assistant 86: A knowledge-elicitation tool for sophisticated users. In I. Bratko and N. Lavrac, eds. European Working Session on Learning – EWSL87. Wilmslow, UK: Sigma Press.
- Cheeseman, P., J. Kelly, M. Self, J. Stutz, W. Taylor, and D. Freeman. 1988. Bayesian classification. In Proceedings of American Association of Artificial Intelligence (AAAI). San Mateo, CA: Morgan Kaufmann Publishers, 607–611.
- Dempster, A. P., N. M. Laird, and D. B. Rubin. 1977. Maximum likelihood estimation from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39: 1–38.
- El-Emam, K. and A. Birk. 2000. Validating the ISO/IEC 15504 measures of software development process capability. Journal of Systems and Software 51(2): 119–149.
- Fujikawa, Y. and T. B. Ho. 2002. Cluster-based algorithms for filling missing values. In 6th Pacific-Asia Conf. on Knowledge Discovery and Data Mining, Taiwan, 6–9 May. Lecture Notes in Artificial Intelligence 2336, 549–554.
- Gehrke, J., W.-Y. Loh, and R. Ramakrishnan. 1999. Classification and regression: Money can grow on trees. Tutorial notes of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 1–73.
- Kalousis, A. and M. Hilario. 2000. Supervised knowledge discovery from incomplete data. In Proceedings of the 2nd International Conference on Data Mining 2000, Cambridge, UK: WIT Press.
- Kim, J.-O. and J. Curry. 1977. The treatment of missing data in multivariate analysis. Sociological Methods and Research 6: 215–240.
- Kirk, R. E. 1982. Experimental Design, 2nd ed. Monterey, CA: Brooks Cole Publishing Company.
- Lakshminarayan, K., S. A. Harp, and T. Samad. 1999. Imputation of missing data in industrial databases. Applied Intelligence 11: 259–275.
- Little, R. J. A. and D. B. Rubin. 1987. Statistical Analysis with Missing Data. New York: Wiley.
- Lobo, O. O. and M. Numao. 1999. Ordered estimation of missing values. In Proc. of the 3rd Pacific-Asia Conference on Knowledge Discovery and Data Mining, Lecture Notes in Computer Science 1574, 274–278.
- Lobo, O. O. and M. Numao. 2000. On the applicability of a machine learning method for estimating missing values. International Machine Learning Conference 2000, Palo Alto, CA.
- Loh, W.-Y. and N. Vanichsetakul. 1988. Tree-structured classification via generalised discriminant analysis. Journal of the American Statistical Association 83: 715–728.
- MINITAB. 2002. MINITAB Statistical Software for Windows 9.0. MINITAB, Inc., State College, PA 16801-3008, USA.
- Myrtveit, I., E. Stensrud, and U. Olsson. 2001. Analyzing data sets with missing data: An empirical evaluation of imputation methods and likelihood-based methods. IEEE Transactions on Software Engineering 27(11): 999–1013.
- Pyle, D. 1999. Data Preparation for Data Mining. San Francisco: Morgan Kaufmann.
- Quinlan, J. R. 1987. Simplifying decision trees. International Journal of Man-Machine Studies 27: 221–234.
- Quinlan, J. R. 1993. C4.5: Programs for Machine Learning. Los Altos, CA: Morgan Kaufmann Publishers.
- Robins, J. M. and N. Wang. 2000. Inference for imputation estimators. Biometrika 87: 113–124.
- Roth, P. L. 1994. Missing data: A conceptual overview for applied psychologists. Personnel Psychology 47: 537–560.
- Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.
- Rubin, D. B. 1996. Multiple imputation after 18+ years. Journal of the American Statistical Association 91: 473–489.
- Schafer, J. L. 1997. Analysis of Incomplete Multivariate Data. London: Chapman and Hall.
- Schafer, J. L. and M. K. Olsen. 1998. Multiple imputation for multivariate missing data problems: A data analyst's perspective. Multivariate Behavioral Research 33(4): 545–571.
- Schafer, J. L. and J. W. Graham. 2002. Missing data: Our view of the state of the art. Psychological Methods 7(2): 147–177.
- Sentas, P., L. Angelis, and I. Stamelos. 2004. Multiple logistic regression as imputation method applied on software effort prediction. In Proceedings of the 10th International Symposium on Software Metrics, Chicago, 14–16 September.
- Shapiro, A. 1987. Structured Induction in Expert Systems. London: Addison Wesley.
- Song, Q. and M. Shepperd. 2004. A short note on safest default missingness mechanism assumptions. Empirical Software Engineering 10(2): 235–243.
- S-PLUS. 2003. S-PLUS 6.2 for Windows. MathSoft, Inc., Seattle, WA.
- Tanner, M. A. and W. H. Wong. 1987. The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82: 528–550.
- Therneau, T. M. and E. J. Atkinson. 1997. An introduction to recursive partitioning using the RPART routines. Technical Report, Mayo Foundation. Department of Statistics, Stanford University, USA.
- Twala, B. 2005. Effective techniques for handling incomplete data using decision trees. Unpublished PhD thesis, Open University, Milton Keynes, UK.
- Twala, B., M. Cartwright, and M. Shepperd. 2005. Comparison of various methods for handling incomplete data in software engineering databases. In 4th International Symposium on Empirical Software Engineering, Noosa Heads, Australia, November.
- Twala, B., M. C. Jones, and D. J. Hand. 2008. Good methods for coping with missing data in decision trees. Pattern Recognition Letters 29: 950–956.
- Venables, W. N. and B. D. Ripley. 1994. Modern Applied Statistics with S-PLUS. New York: Springer.
- Wu, C. F. J. 1983. On the convergence of the EM algorithm. The Annals of Statistics 11: 95–103.