Research Article

On statistical classification with incomplete covariates via filtering

Pages 1342-1365 | Received 18 Feb 2020, Accepted 23 Nov 2020, Published online: 08 Dec 2020
 

Abstract

This article deals with the problem of classification when some of the covariates may have missing parts. Both the training sample and the new unclassified observation are allowed to have missing parts in their covariates. In fact, it is shown in Remark 3.3 that, in classification, reconstructing or imputing the missing part of a new observation that is to be classified can be counter-productive in terms of the error rates. Furthermore, unlike many results in the literature, where covariate fragments are usually assumed to be missing completely at random, we do not impose such an assumption here. Given the observed parts of the covariates, we construct a kernel-type classifier that is straightforward to implement. The proposed classifier is based on d-dimensional covariate vectors obtained from the original covariates (by moving from the space L2 to ℓ2), where d (< ∞) is itself a parameter that has to be estimated. To estimate the various parameters, we employ an easy-to-implement data-splitting approach.
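As a rough illustration of the data-splitting idea only, the following Python sketch selects a truncation dimension d and a bandwidth h for a generic kernel-weighted vote by minimizing the error on a held-out half of the sample. It is not the estimator proposed in the article: it assumes fully observed coefficient vectors and ignores missingness altogether, and the function names (`kernel_vote`, `select_d_h`, `make_sample`), the Gaussian kernel, the toy data, and the grids are all hypothetical choices made for the example.

```python
import math
import random


def kernel_vote(x, train, d, h):
    """Classify x as 0 or 1 by a Gaussian-kernel weighted vote on the first d coordinates."""
    score = 0.0
    for xi, yi in train:
        dist2 = sum((a - b) ** 2 for a, b in zip(x[:d], xi[:d]))
        w = math.exp(-dist2 / (2.0 * h * h))
        score += w if yi == 1 else -w
    return 1 if score > 0 else 0


def select_d_h(train, valid, d_grid, h_grid):
    """Pick (d, h) minimizing the empirical error on the held-out half."""
    best = None
    for d in d_grid:
        for h in h_grid:
            errors = sum(kernel_vote(x, train, d, h) != y for x, y in valid)
            if best is None or errors < best[0]:
                best = (errors, d, h)
    return best[1], best[2], best[0] / len(valid)


def make_sample(n, rng, dim=6):
    """Toy data (assumption): the class label shifts only the first two coordinates."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(2.0 * y if j < 2 else 0.0, 1.0) for j in range(dim)]
        data.append((x, y))
    return data


rng = random.Random(0)
sample = make_sample(200, rng)
train, valid = sample[:100], sample[100:]  # one split: fit on one half, tune on the other
h_grid = [0.25, 0.5, 1.0, 2.0]
d_hat, h_hat, err = select_d_h(train, valid, d_grid=range(1, 7), h_grid=h_grid)
print(d_hat, h_hat, err)
```

In this toy setting the held-out error plays the role of the selection criterion; the article's own procedure, including how incomplete covariates enter the classifier, is described in the full text.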

2010 Mathematics Subject Classifications:

Acknowledgments

This work was supported by the NSF under Grant DMS-1916161 of Majid Mojirsheibani.

Data availability statement

The Share Price Increase data set used in Section 4.2, and a description of it, is available at http://www.timeseriesclassification.com/dataset.php

Additionally, a copy of the R code used to carry out the analysis in Section 4.2 is posted in the GitHub repository at https://github.com/mynhinguyen/Statistical-classification-with-incomplete-covariates-via-filtering

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by the National Science Foundation (NSF) under Grant DMS-1916161 of Majid Mojirsheibani.
