149
Views
169
CrossRef citations to date
0
Altmetric
Theory and Method

Tree-Structured Classification via Generalized Discriminant Analysis

&
Pages 715-725 | Received 01 Jan 1986, Published online: 12 Mar 2012
 

Abstract

The problem of constructing classification rules that can be represented as decision trees is considered. Each object to be classified has an associated x vector containing possibly incomplete covariate information. Tree construction is based on the information provided in a “learning sample” of objects with known class identities. The x vectors in the learning sample may have missing values as well. Procedures are proposed for each of the components of classifier construction, such as split selection, tree-size determination, treatment of missing values, and ranking of variables. The main idea is recursive application of linear discriminant analysis, with the variables at each stage being appropriately chosen according to the data and the type of splits desired. Standard statistical techniques used as basic building blocks include analysis of variance, linear and canonical discriminant analysis, and principal component analysis. A new method of tree-structured classification is obtained by assembling the pieces. This method can accommodate prior probabilities as well as unequal misclassification costs and can yield trees with univariate, linear combination, or linear combination with polar coordinate splits. The method is compared with the CART method of Breiman, Friedman, Olshen, and Stone (1984). Some of the operational differences are that the new method (a) can have multiple splits per node, (b) is nonrandomized, (c) uses a direct stopping rule, (d) handles missing values by estimation, (e) allows both ordered and unordered variables in the same linear combination split, (f) is not invariant of monotone transformations of the individual variables, and (g) is computationally faster. Simulation experiments suggest that the two methods have comparable classification accuracy. The Boston housing data are analyzed in a classification context for illustration.

Reprints and Corporate Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

To request a reprint or corporate permissions for this article, please click on the relevant link below:

Academic Permissions

Please note: Selecting permissions does not provide access to the full text of the article, please see our help page How do I view content?

Obtain permissions instantly via Rightslink by clicking on the button below:

If you are unable to obtain permissions via Rightslink, please complete and submit this Permissions form. For more information, please visit our Permissions help page.