Incremental Forward Feature Selection with Application to Microarray Gene Expression Data

Yuh-Jye Lee Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan (corresponding author)

Chien-Chung Chang Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan

Chia-Huang Chao Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan

Abstract

In this study, the authors propose a new feature selection scheme, incremental forward feature selection, which is inspired by incremental reduced support vector machines. In their method, a new feature is added to the currently selected feature subset if it brings in the most additional information. This information is measured by the distance between the new feature vector and the column space spanned by the current feature subset. The incremental forward feature selection scheme can therefore exclude highly linearly correlated features, which provide redundant information and might degrade the efficiency of learning algorithms. The method is compared with the weight score approach and the 1-norm support vector machine on two well-known microarray gene expression data sets, the acute leukemia and colon cancer data sets. Both data sets have very few observations but a huge number of genes. The linear smooth support vector machine was applied to the feature subsets selected by each of the three schemes, and slightly better classification results were obtained with the subsets selected by the 1-norm support vector machine and by incremental forward feature selection. Finally, the authors claim that the remaining genes still contain useful information. The previously selected features are iteratively removed from the data sets, and the feature selection and classification steps are repeated for four rounds. The results show that there are many distinct feature subsets that provide enough information for the classification tasks in these two microarray gene expression data sets.
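The distance criterion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `incremental_forward_selection`, the unit-length scaling of the columns, the greedy "largest distance first" rule, the choice of the first feature, and the `tol` stopping threshold are all assumptions made for illustration. The sketch keeps, for every candidate column, the part that is orthogonal to the span of the columns chosen so far; the length of that part is the distance to the column space.

```python
import numpy as np

def incremental_forward_selection(X, k, tol=1e-10):
    """Greedily pick up to k columns of X (samples x features).

    Assumed rule (illustrative only): at each step take the column whose
    distance to the column space of the already selected columns is largest.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: scale columns to unit length so the distance reflects
    # direction rather than magnitude. All-zero columns are left unchanged.
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0
    residual = X / norms  # part of each column orthogonal to the current span

    selected = []
    for _ in range(min(k, X.shape[1])):
        dist = np.linalg.norm(residual, axis=0)  # distance to current span
        dist[selected] = -1.0                    # never re-pick a column
        j = int(np.argmax(dist))
        if dist[j] <= tol:
            break  # every remaining column is (numerically) redundant
        q = residual[:, j] / dist[j]             # new orthonormal direction
        selected.append(j)
        # Remove the new direction from every column (Gram-Schmidt step).
        residual -= np.outer(q, q @ residual)
    return selected

# Column 1 is a multiple of column 0, so it adds no new information.
X_demo = np.array([[1.0, 2.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.0, 0.0, 0.0]])
picked = incremental_forward_selection(X_demo, 3)
```

In this toy matrix the second column lies entirely in the span of the first, so its distance drops to zero once the first is selected and it is never chosen, which is the redundancy-exclusion behaviour the abstract describes.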

Notes

Golub = Golub et al., 1999; Weston (2001) = Weston et al., 2001; Guyon = Guyon et al., 2002; Zhu = Zhu et al., 2004; N/A = results not available.

Weston (2001) = Weston et al., 2001; Guyon = Guyon et al., 2002; Weston (2003) = Weston et al., 2003.

Round 1 = select genes from the original data set; Round 2 = select genes from the remaining genes of Round 1; Round 3 = select genes from the remaining genes of Round 2; Round 4 = select genes from the remaining genes of Round 3.
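The round structure in the note above can be sketched as a loop that removes the genes chosen in each round before selecting again. The names `select_in_rounds` and `top_variance` are hypothetical, and `top_variance` is only a stand-in selector so the example runs on its own; it is not one of the three selection schemes compared in the study. The classification step that follows each round in the study is omitted here.

```python
import numpy as np

def select_in_rounds(X, select_fn, k, rounds=4):
    """Run a feature selector repeatedly, each time on the columns that
    earlier rounds did not pick. Returns one list of original column
    indices per round."""
    X = np.asarray(X, dtype=float)
    remaining = np.arange(X.shape[1])  # original indices still available
    subsets = []
    for _ in range(rounds):
        if remaining.size == 0:
            break
        # select_fn returns indices into the reduced matrix ...
        local = np.asarray(select_fn(X[:, remaining], k), dtype=int)
        # ... which are mapped back to indices in the original data set.
        chosen = remaining[local]
        subsets.append(chosen.tolist())
        remaining = np.setdiff1d(remaining, chosen)
    return subsets

def top_variance(X, k):
    """Stand-in selector (an assumption, not the paper's method):
    the k columns with the largest variance."""
    return np.argsort(-X.var(axis=0), kind="stable")[:k]

# Five toy "genes" whose variance decreases from left to right.
X_demo = np.array([[0.0, 0.0, 0.0, 0.0, 0.0],
                   [5.0, 4.0, 3.0, 2.0, 1.0]])
rounds_demo = select_in_rounds(X_demo, top_variance, k=2, rounds=2)
```

Because each round sees only the columns left over from the previous one, the subsets are disjoint by construction, matching the Round 1 to Round 4 definitions.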

Golub, T., Slonim, D., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J., Coller, H., Loh, M., Downing, J., Caligiuri, M., Bloomfield, C., Lander, E. (1999). Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science 286: 531–537.
