Original Articles

Identifying High-Dimensional Biomarkers for Personalized Medicine via Variable Importance Ranking

Pages 853-868 | Received 16 Oct 2007, Accepted 26 Feb 2008, Published online: 10 Sep 2008
 

Abstract

We apply robust classification algorithms to high-dimensional genomic data to find biomarkers, by analyzing variable importance, that enable better diagnosis of disease, earlier intervention, or more effective assignment of therapies. The goal is to use variable importance ranking to isolate a set of important genes that can be used to classify life-threatening diseases with respect to prognosis or type, in order to maximize efficacy or minimize toxicity in personalized treatment of such diseases. We propose a ranking method, present several other methods for selecting a set of important genes to use as genomic biomarkers, and evaluate the performance of the selection procedures in patient classification by cross-validation. The various selection algorithms are applied to published high-dimensional genomic data sets using several well-known classification methods. For each data set, we report the set of genes selected on the basis of variable importance that performed best in classification. The classification algorithm with the proposed ranking method is shown to be competitive with other selection methods for discovering genomic biomarkers underlying both adverse and efficacious outcomes, with the aim of improving individualized treatment of patients with life-threatening diseases.
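The general workflow described above (rank genes by an importance score, keep the top-ranked ones, and assess classification by cross-validation) can be illustrated with a minimal sketch. This is not the authors' proposed ranking method. It assumes a plain two-sample t-statistic as the importance score and a simple diagonal linear discriminant analysis (DLDA) classifier, both of which are named in the article's notes. The function names, the synthetic data, and the parameters `n_top`, `n_folds`, and `seed` are hypothetical choices made for illustration only.

```python
import numpy as np

def t_scores(X, y):
    """Absolute Welch t-statistic per gene (column) for two classes coded 0/1."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)

def dlda_fit_predict(X_train, y_train, X_test):
    """Diagonal LDA with equal priors: class means plus a pooled per-gene variance."""
    classes = np.unique(y_train)
    means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    resid = X_train - means[np.searchsorted(classes, y_train)]
    pooled_var = (resid ** 2).sum(axis=0) / (len(X_train) - len(classes)) + 1e-12
    # Variance-scaled squared distance from each test sample to each class mean.
    d = (((X_test[:, None, :] - means[None, :, :]) ** 2) / pooled_var).sum(axis=2)
    return classes[d.argmin(axis=1)]

def cv_accuracy(X, y, n_top=10, n_folds=5, seed=0):
    """Cross-validated accuracy with gene selection repeated inside every fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)  # not stratified
    correct = 0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Rank genes on the training part only, so the held-out samples
        # never influence which genes are selected.
        top = np.argsort(t_scores(X[train], y[train]))[::-1][:n_top]
        pred = dlda_fit_predict(X[train][:, top], y[train], X[test][:, top])
        correct += int((pred == y[test]).sum())
    return correct / len(y)

# Synthetic stand-in for expression data (assumption): 60 samples, 200 genes,
# of which the first 5 are shifted in class 1.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 200))
X[y == 1, :5] += 3.0
print("cross-validated accuracy:", cv_accuracy(X, y))
```

Repeating the ranking inside each fold is a deliberate choice in this sketch: selecting genes on the full data before cross-validation would let the held-out samples leak into the selection step and tend to overstate accuracy.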

ACKNOWLEDGMENTS

Hojin Moon's research was partially supported by the Scholarly and Creative Activities Committee (SCAC) Award from CSULB. Hongshik Ahn's research was partially supported by the Faculty Research Participation Program at the NCTR administered by the Oak Ridge Institute for Science and Education through an interagency agreement between USDOE and USFDA.

Notes

1 k∗ = 1 with CERPWFM, CERPMDI, RFMDA, RFMDI, SVMRFE, BW; k∗ = 3 with the t-test. Values in boldface indicate lymphoma data.

1 k∗ = 1 with SVMRFE; k∗ = 3 with CERPWFM, CERPMDI, RFMDA, RFMDI; k∗ = 5 with BW, the t-test.

Values in boldface indicate pediatric AML data.

Note: Since the genes selected by the t-test and the BW ratio are the same, only the t-test is reported. The DLDA classification algorithm is used for illustration. PPV and NPV stand for positive and negative predictive values, respectively.

T = Set of genes selected by the t-test; C = set of genes selected by CERP; T ∩ C = common set of genes selected by the t-test and CERP; T ∪ C = combined set of genes selected by the t-test or CERP; (T − C) ∪ (C − T) = combined mutually exclusive set of genes selected by the t-test or CERP.
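The set notation in this note maps directly onto ordinary set operations. The short sketch below uses made-up gene labels (hypothetical placeholders, not genes reported in the article) to show what each combination contains.

```python
# Hypothetical gene labels, for illustration only.
T = {"gene_a", "gene_b", "gene_c"}   # selected by the t-test
C = {"gene_b", "gene_c", "gene_d"}   # selected by CERP

common = T & C                 # T ∩ C: genes selected by both methods
combined = T | C               # T ∪ C: genes selected by either method
exclusive = (T - C) | (C - T)  # genes selected by exactly one method

# The mutually exclusive set is the symmetric difference of T and C.
assert exclusive == T ^ C
```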
