Original Articles

Incremental Forward Feature Selection with Application to Microarray Gene Expression Data

Pages 827-840 | Received 17 Jul 2007, Accepted 05 Mar 2008, Published online: 10 Sep 2008
 

Abstract

In this study, the authors propose a new feature selection scheme, incremental forward feature selection, which is inspired by incremental reduced support vector machines. In this method, a new feature is added to the currently selected feature subset if it brings in the most extra information. This information is measured by the distance between the new feature vector and the column space spanned by the current feature subset. The incremental forward feature selection scheme can therefore exclude highly linearly correlated features, which provide redundant information and might degrade the efficiency of learning algorithms. The method is compared with the weight score approach and the 1-norm support vector machine on two well-known microarray gene expression data sets, the acute leukemia and colon cancer data sets. Both data sets have very few observations but a huge number of genes. The linear smooth support vector machine was applied to the feature subsets selected by each of the three schemes; the subsets selected by the 1-norm support vector machine and by incremental forward feature selection gave slightly better classification results. Finally, the authors claim that the remaining genes still contain some useful information. The previously selected features are iteratively removed from the data sets, and the feature selection and classification steps are repeated for four rounds. The results show that there are many distinct feature subsets that can provide enough information for classification tasks in these two microarray gene expression data sets.
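
The selection rule described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names `incremental_forward_selection` and `selection_rounds`, the column normalization, the tolerance `tol`, and the stopping rule are all assumptions made for illustration. The sketch greedily picks the feature whose vector lies farthest from the column space of the features already chosen, tracking that distance with Gram-Schmidt style residuals, and then repeats the selection on the remaining features in rounds.

```python
import numpy as np


def incremental_forward_selection(X, k, tol=1e-10):
    """Greedily pick up to k columns of X (observations x features).

    At each step the chosen column is the one farthest from the column
    space spanned by the columns already selected. Columns are scaled to
    unit length first (an assumption, not stated in the abstract) so that
    distances are comparable across features.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0                # leave all-zero columns as zero
    residual = X / norms                   # part of each column outside the span
    available = np.ones(X.shape[1], dtype=bool)
    selected = []
    for _ in range(min(k, X.shape[1])):
        # Residual norm = distance from each column to the current span.
        dist = np.where(available, np.linalg.norm(residual, axis=0), 0.0)
        j = int(np.argmax(dist))
        if dist[j] <= tol:                 # every remaining column is redundant
            break
        selected.append(j)
        available[j] = False
        q = residual[:, j] / dist[j]       # new orthonormal direction of the span
        residual = residual - np.outer(q, q @ residual)
    return selected


def selection_rounds(X, k, rounds=4):
    """Repeat the selection, removing previously selected features each round."""
    X = np.asarray(X, dtype=float)
    remaining = np.arange(X.shape[1])      # original indices still in play
    history = []
    for _ in range(rounds):
        if remaining.size == 0:
            break
        pick = incremental_forward_selection(X[:, remaining], k)
        if not pick:
            break
        history.append([int(i) for i in remaining[pick]])
        remaining = np.delete(remaining, pick)
    return history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(10, 50))       # few observations, many features
    demo[:, 1] = 3.0 * demo[:, 0]          # a redundant, linearly dependent feature
    print(selection_rounds(demo, k=5, rounds=4))
```

Because a feature that is an exact linear combination of already selected features has zero distance to their span, it is never picked in the same round, which mirrors the redundancy argument made in the abstract. The classification step with the smooth support vector machine is not sketched here.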

Notes

Golub = Golub et al., 1999; Weston (2001) = Weston et al., 2001; Guyon = Guyon et al., 2002; Zhu = Zhu et al., 2004; N/A = results not available.

Weston (2001) = Weston et al., 2001; Guyon = Guyon et al., 2002; Weston (2003) = Weston et al., 2003.

Round 1 = select genes from the original data set; Round 2 = select genes from the remaining genes of Round 1; Round 3 = select genes from the remaining genes of Round 2; Round 4 = select genes from the remaining genes of Round 3.
