Article; Bioinformatics

Quantile normalization for combining gene-expression datasets

Pages 751-758 | Received 24 Jan 2017, Accepted 16 Dec 2017, Published online: 09 Jan 2018
