Journal of Quality Technology
A Quarterly Journal of Methods, Applications and Related Topics
Volume 48, 2016 - Issue 3
Case Studies

A Study of Missing Data Imputation in Predictive Modeling of a Wood-Composite Manufacturing Process


References

  • Agarwal, S. (2001). “Learning from Incomplete Data”. Department of Computer Science and Engineering, University of California, San Diego. http://cseweb.ucsd.edu/~elkan/254spring01/sagarwalrep.pdf.
  • André, N.; Cho, H. W.; Baek, S. H.; Jeong, M. K.; and Young, T. M. (2008). “Enhanced Prediction of Internal Bond Strength in A Medium Density Fiberboard Process Using Multivariate Methods and Variable Selection”. Wood Science and Technology 42, pp. 521–534.
  • André, N. and Young, T. M. (2013). “Real-Time Process Modeling of Particleboard Manufacture Using Variable Selection and Regression Methods Ensemble”. European Journal of Wood and Wood Products (Holz als Roh- und Werkstoff) 71 (3), pp. 361–370.
  • Allison, P. (2000). “Multiple Imputation for Missing Data: A Cautionary Tale”. Sociological Methods and Research 28, pp. 301–309.
  • Altmayer, L. (2002). “Hot-Deck Imputation: A Simple DATA Step Approach”. Proceedings of the 2002 Northeast SAS User's Group, pp. 773–780. Buffalo, NY: Northeast SAS User's Group.
  • Barnes, D. (2001). “A Model of the Effect of Strand Length and Strand Thickness on the Strength Properties of Oriented Wood Composites”. Forest Products Journal 51 (9), pp. 36–46.
  • Belsley, D. A.; Kuh, E.; and Welsch, R. E. (1980). Regression Diagnostics. Hoboken, NJ: John Wiley & Sons.
  • Chipman, H. A.; George, E. I.; and McCulloch, R. E. (2010). “BART: Bayesian Additive Regression Trees”. Annals of Applied Statistics 4, pp. 266–298.
  • Clapp, Jr., N. E.; Young, T. M.; and Guess, F. M. (2008). “Predictive Modeling the Internal Bond of Medium Density Fiberboard Using a Modified Principal Component Analysis”. Forest Products Journal 58 (4), pp. 49–55.
  • Collins, L. M.; Schafer, J. L.; and Kam, C. M. (2001). “A Comparison of Inclusive and Restrictive Strategies in Modern Missing-Data Procedures”. Psychological Methods 6, pp. 330–351.
  • Dempster, A. P.; Laird, N. M.; and Rubin, D. B. (1977). “Maximum Likelihood from Incomplete Data via the EM Algorithm”. Journal of the Royal Statistical Society, Series B 39 (1), pp. 1–38.
  • Draper, D. and Fouskakis, D. (2000). “A Case Study of Stochastic Optimization in Health Policy: Problem Formulation and Preliminary Results”. Journal of Global Optimization 18, pp. 399–416.
  • Datta, S.; Le-Rademacher, J.; and Datta, S. (2007). “Predicting Patient Survival from Microarray Data by Accelerated Failure Time Modeling Using Partial Least Squares and LASSO”. Biometrics 63 (1), pp. 259–271.
  • Efron, B.; Hastie, T.; Johnstone, I.; and Tibshirani, R. (2004). “A New Method for Variable Subset Selection, with The Lasso and ‘Epsilon’ Forward Stagewise Methods as Special Cases. LARS Software for R And Splus”. Annals of Statistics (with discussion) 32 (2), pp. 407–499.
  • Enders, C. K. (2001). “The Impact of Nonnormality on Full Information Maximum-Likelihood Estimation for Structural Equation Models with Missing Data”. Psychological Methods 6, pp. 352–370.
  • Faraway, J. J. (2005). Linear Models with R. New York, NY: Chapman & Hall.
  • Fetter, M. (2001). “Mass Imputation of Agricultural Economic Data Missing by Design: A Simulation Study of Two Regression Based Techniques”. Federal Conference on Survey Methodology. http://www.fcsm.gov/01papers/Fetter.pdf.
  • Gelman, A. and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge, UK: Cambridge University Press.
  • González, I. and Sánchez, I. (2010). “Variable Selection in Multivariate Statistical Process Control”. Journal of Quality Technology 42 (3), pp. 242–259.
  • Guyon, I. and Elisseeff, A. (2003). “An Introduction to Variable and Feature Selection”. Journal of Machine Learning Research 3, pp. 1157–1182.
  • Hamer, R. M. (2009). “Last Observation Carried Forward Versus Mixed Models in the Analysis of Psychiatric Clinical Trials”. American Journal of Psychiatry 166, pp. 639–641.
  • Hastie, T.; Tibshirani, R.; and Friedman, J. (2009). Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd ed. New York, NY: Springer-Verlag.
  • Hastie, T. and Efron, B. (2011). Package ‘lars’. Available at http://cran.r-project/web/packages/lars/lars.pdf.
  • James, G.; Witten, D.; Hastie, T.; and Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. New York, NY: Springer.
  • Kourti, T.; Lee, J.; and MacGregor, J. F. (1996). “Experiences with Industrial Applications of Projection Methods for Multivariate Statistical Process Control”. Computers & Chemical Engineering 20, pp. 745–750.
  • Lanning, D. and Berry, D. (2003). “An Alternative to PROC MI for Large Samples”. SAS Users Group International (SUGI) 28, Seattle, WA.
  • Lin, T. H. (2010). “A Comparison of Multiple Imputation with EM Algorithm and MCMC Method for Quality of Life Missing Data”. Quality & Quantity 44, pp. 277–287.
  • Little, R. J. A. (1992). “Regression with Missing X's: A Review”. Journal of the American Statistical Association 87, pp. 1227–1237.
  • Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data, 2nd ed. New York, NY: John Wiley.
  • Mevik, B. H. and Wehrens, R. (2007). “The pls Package: Principal Component and Partial Least Squares Regression in R”. Journal of Statistical Software 18 (2), pp. 1–24.
  • McGee, M. and Bergasa, N. V. (2005). “Imputing Missing Data in Clinical Pilot Studies”. Technical Report. Dallas, TX: Southern Methodist University.
  • Newman, D. A. (2003). “Longitudinal Modeling with Randomly and Systematically Missing Data: A Simulation of Ad Hoc, Maximum Likelihood, and Multiple Imputation Techniques”. Organizational Research Methods 6 (3), pp. 328–362.
  • Ni, D.; Leonard, II, J. D.; Guin, A.; and Feng, C. (2005). “Multiple Imputation Scheme for Overcoming the Missing Values and Variability Issues in ITS Data”. Journal of Transportation Engineering 131 (12), pp. 931–938.
  • Parsons, L.; Haque, E.; and Liu, H. (2004). “Evaluating Subspace Clustering Algorithms”. SIAM International Conference on Data Mining, pp. 48–56.
  • Rubin, D. B. (1996). “Multiple Imputation after 18+ Years (with Discussion)”. Journal of the American Statistical Association 91, pp. 473–489.
  • Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. London, UK: Chapman & Hall.
  • Schafer, J. L. and Olsen, M. K. (1998). “Multiple Imputation for Multivariate Missing-Data Problems: A Data Analyst's Perspective”. Multivariate Behavioral Research 33 (4), pp. 545–571.
  • Soh, C. S.; Ong, K. M.; and Raveendran, P. (2005). “Variable Selection Using Genetic Algorithm for Analysis of Near-Infrared Spectral Data Using Partial Least Squares”. Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China.
  • Soullier, N.; Rochebrochard, E.; and Bouyer, J. (2010). “Multiple Imputation for Estimation of an Occurrence Rate in Cohorts with Attrition and Discrete Follow-Up Time Points: A Simulation Study”. BMC Medical Research Methodology 10 (79), pp. 1–7.
  • Tibshirani, R. (1996). “Regression Shrinkage and Selection via the Lasso”. Journal of the Royal Statistical Society 58 (1), pp. 267–288.
  • Truxillo, C. (2005). “Maximum Likelihood Parameter Estimation with Incomplete Data”. Proceedings of the Thirtieth Annual SAS® User Group International Conference.
  • UCLA: Academic Technology Services, Statistical Consulting Group. “Statistical Computing Seminars Missing Data in SAS Part 1”. http://www.ats.ucla.edu/stat/sas/seminars/missing_data/mi_new_1.htm. Accessed April 24, 2011.
  • UCLA: Academic Technology Services, Statistical Consulting Group. “Missing Data in SAS”. http://www.ats.ucla.edu/stat/sas/modules/missing.htm. Accessed April 24, 2011.
  • Wang, K. and Jiang, W. (2009). “High-Dimensional Process Monitoring and Fault Isolation via Variable Selection”. Journal of Quality Technology 41 (3), pp. 247–258.
  • Young, T. M.; André, N.; and Huber, C. W. (2004). “Predictive Modeling of The Internal Bond of MDF Using Genetic Algorithms with Distributed Data Fusion”. Proceedings of the Eighth European Panel Products Symposium, pp. 45–59.
  • Yuan, Y. C. (2010). Multiple Imputation for Missing Data: Concepts and New Development (Version 9.0), pp. 49. Rockville, MD: SAS Institute, Inc.
  • Zeng, Y. (2011). “A Study of Missing Data Imputation and Predictive Modeling of Strength Properties of Wood Composites”. M.S. Thesis, 74 p. The University of Tennessee, Knoxville, TN.
  • Zhu, M. and Chipman, H. A. (2006). “Darwinian Evolution in Parallel Universes: A Parallel Genetic Algorithm for Variable Selection”. Technometrics 48 (4), pp. 491–502.
