Original Articles

Finite-sample analysis of impacts of unlabeled data and their labeling mechanisms in linear discriminant analysis

Pages 184-203 | Received 20 Sep 2013, Accepted 20 Aug 2014, Published online: 21 Oct 2016

References

  • Airoldi, J.-P., Flury, B. D., Salvioni, M. (1995). Discrimination between two species of Microtus using both classified and unclassified observations. Journal of Theoretical Biology 177:247–262.
  • Amemiya, T. (1985). Advanced Econometrics. Harvard University Press, Cambridge: USA.
  • Andrews, D. F., Gnanadesikan, R., Warner, J. L. (1971). Transformation of multivariate data. Biometrics 27:825–840.
  • Biecek, P., Szczurek, E. (2012). bgmm: Gaussian Mixture Modeling Algorithms, Including the Belief-Based Mixture Modeling. R package version 1.5. Available at http://CRAN.R-project.org/package=bgmm.
  • Biecek, P., Szczurek, E., Vingron, M., Tiuryn, J. (2012). The R package bgmm: mixture modeling with uncertain knowledge. Journal of Statistical Software 47:1–32.
  • Boldea, O., Magnus, J. R. (2009). Maximum likelihood estimation of the multivariate normal mixture model. Journal of the American Statistical Association 104:1539–1549.
  • Box, G. E. P., Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society, Series B 26:211–252.
  • Cai, D., He, X., Han, J. (2007). Semi-supervised discriminant analysis. In: IEEE 11th International Conference on Computer Vision, pp. 1–7.
  • Castelli, V., Cover, T. M. (1995). On the exponential value of labeled samples. Pattern Recognition Letters 16:105–111.
  • Castelli, V., Cover, T. M. (1996). The relative value of labeled and unlabeled samples in pattern recognition with an unknown mixing parameter. IEEE Transactions on Information Theory 42:2102–2117.
  • Chapelle, O., Schölkopf, B., Zien, A. (2006). Semi-Supervised Learning. MIT Press, Cambridge: USA.
  • Cozman, F. G., Cohen, I., Cirelo, M. C. (2003). Semi-supervised learning of mixture models. In: Proceedings of the Twentieth International Conference on Machine Learning, The AAAI Press, Menlo Park: California. pp. 99–106.
  • Croux, C., Filzmoser, P., Joossens, K. (2008). Classification efficiencies for robust linear discriminant analysis. Statistica Sinica 18:581–599.
  • Day, N. E. (1969). Estimating the components of a mixture of normal distributions. Biometrika 56:463–474.
  • Dempster, A. P., Laird, N. M., Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39:1–38.
  • Dillon, J. V., Balasubramanian, K., Lebanon, G. (2010). Asymptotic analysis of generative semi-supervised learning. In: Proceedings of the Twenty Seventh International Conference on Machine Learning, The AAAI Press, Menlo Park, California. pp. 295–302.
  • Druck, G., Pal, C., Zhu, X., McCallum, A. (2007). Semi-supervised classification with hybrid generative/discriminative methods. In: Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM New York, NY:USA. pp. 280–289.
  • Efron, B. (1975). The efficiency of logistic regression compared to normal discriminant analysis. Journal of the American Statistical Association 70:892–898.
  • Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics 7:179–188.
  • Guo, Y., Hastie, T., Tibshirani, R. (2007). Regularized linear discriminant analysis and its application in microarrays. Biostatistics 8:86–100.
  • Kawakita, M., Kanamori, T. (2013). Semi-supervised learning with density-ratio estimation. Machine Learning 91:189–209.
  • Lafferty, J., Wasserman, L. (2007). Statistical analysis of semi-supervised regression. Advances in Neural Information Processing Systems 20:295–302.
  • Little, R. J. A., Rubin, D. B. (2002). Statistical Analysis with Missing Data. 2nd ed. John Wiley & Sons, Inc., Hoboken, New Jersey.
  • McLachlan, G. J. (1976). The bias of the apparent error rate in discriminant analysis. Biometrika 63:239–244.
  • McLachlan, G. J. (2004). Discriminant Analysis and Statistical Pattern Recognition. John Wiley & Sons, Inc., Hoboken, New Jersey.
  • McLachlan, G. J., Scott, D. (1995). Asymptotic relative efficiency of the linear discriminant function under partial nonrandom classification of the training data. Journal of Statistical Computation and Simulation 52:415–426.
  • Muñoz-Pichardo, J. M., Enguix-González, A., Muñoz-García, J., Moreno-Rebollo, J. L. (2011). Influence analysis on discriminant coordinates. Communications in Statistics – Simulation and Computation 40:793–807.
  • Oba, S., Ishii, S. (2006). Semi-supervised discovery of differential genes. BMC Bioinformatics 7:1–13.
  • Okamoto, M. (1963). An asymptotic expansion for the distribution of the linear discriminant function. Annals of Mathematical Statistics 34:1286–1301. Correction (1968). Annals of Mathematical Statistics 39:1358–1359.
  • O’Neill, T. J. (1978). Normal discrimination with unclassified observations. Journal of the American Statistical Association 73:821–826.
  • R Development Core Team. (2012). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. Available at http://www.R-project.org/.
  • Rigollet, P. (2007). Generalization error bounds in semi-supervised classification under the cluster assumption. Journal of Machine Learning Research 8:1369–1392.
  • Rubin, D. B. (1976). Inference and missing data. Biometrika 63:581–592.
  • Shia, B.-C., Zhu, J., Fang, K., Ma, S. (2011). Fuzzy canonical discriminant analysis: Theory and practice. Communications in Statistics – Simulation and Computation 40:1526–1539.
  • Singh, A., Nowak, R. D., Zhu, X. (2008). Unlabeled data: Now it helps, now it doesn’t. In: Koller, D., Schuurmans, D., Bengio, Y., Bottou, L., eds. Advances in Neural Information Processing Systems, Curran Associates, Inc, Red Hook: NY pp. 1513–1520.
  • Sokolovska, N., Cappé, O., Yvon, F. (2008). The asymptotics of semi-supervised learning in discriminative probabilistic models. In: Proceedings of the Twenty Fifth International Conference on Machine Learning, The AAAI Press, Menlo Park, California: USA pp. 984–991.
  • Sugiyama, M., Ide, T., Nakajima, S., Sese, J. (2010). Semi-supervised local Fisher discriminant analysis for dimensionality reduction. Machine Learning 78:35–61.
  • Takai, K., Hayashi, K. (2014). Effects of unlabeled data on classification error in normal discriminant analysis. Journal of Statistical Planning and Inference 147:66–83.
  • Takai, K., Kano, Y. (2013). Asymptotic inference with incomplete data. Communications in Statistics – Theory and Methods 42:2474–2490.
  • Zhu, X., Ghahramani, Z., Lafferty, J. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. In: Proceedings of the Twentieth International Conference on Machine Learning, The AAAI Press, Menlo Park, California: USA pp. 912–919.
  • Zhu, X., Goldberg, A. B. (2009). Introduction to Semi-Supervised Learning. Morgan & Claypool Press, California: USA.
