Abstract
The decision‐tree (DT) algorithm is a popular and efficient data‐mining technique: it is non‐parametric and computationally fast, and besides producing interpretable classification rules, it can select features on its own. In this article, the feature‐selection ability of DT and the impacts of feature selection/extraction on DT under different training sample sizes were studied using AVIRIS hyperspectral data. DT was compared with three other feature‐selection methods; the results indicated that DT was an unstable feature selector and that the number of features it selected depended strongly on the sample size. Trees derived with and without feature selection/extraction were then compared. Feature selection mainly produced a significant increase in the number of tree nodes (14.13–23.81%) and a moderate increase in tree accuracy (3.5–4.8%). Feature‐extraction methods, such as Non‐parametric Weighted Feature Extraction (NWFE) and Decision Boundary Feature Extraction (DBFE), improved tree accuracy more markedly (4.78–6.15%) while also reducing the number of tree nodes (6.89–16.81%). When the training sample size was small, feature selection/extraction increased accuracy more dramatically (6.90–15.66%) without increasing the number of tree nodes.
Acknowledgements
We are grateful to Daniel Sui of TAMU for assistance and helpful advice in our research. We would also like to thank the anonymous reviewers for their constructive comments.