Original Articles

Clustering Microarray Data: Theoretical and Practical Issues

Pages 3211-3232 | Received 02 Feb 2011, Accepted 28 Dec 2011, Published online: 25 Jul 2012

