Research Articles

How many clusters exist? Answer via maximum clustering similarity implemented in R

Pages 62-79 | Received 07 Jul 2018, Accepted 20 Apr 2019, Published online: 22 May 2019

References

  • Everitt BS, Landau S, Leese M. Cluster analysis. New York: Oxford University Press; 2001.
  • Marriott FHC. Practical problems in a method of cluster analysis. Biometrics. 1971;27:501–514. doi: 10.2307/2528592
  • Hartigan JA. Clustering algorithms. New York: Wiley; 1975.
  • Bock HH. On some significance tests in cluster analysis. J Classif. 1985;2:77–108. doi: 10.1007/BF01908065
  • Hardy A. On the number of clusters. Comput Stat Data Anal. 1996;23:83–96. doi: 10.1016/S0167-9473(96)00022-9
  • Gordon AD. Classification. 2nd ed. St Andrews: Chapman and Hall/CRC; 1999.
  • Milligan G, Cooper M. An examination of procedures for determining the number of clusters in a data set. Psychometrika. 1985;50:159–179. doi: 10.1007/BF02294245
  • Milligan G, Cooper M. A study of the comparability of external criteria for hierarchical cluster analysis. Multivariate Behav Res. 1986;21:441–458. doi: 10.1207/s15327906mbr2104_5
  • Koziol JA. Cluster analysis of antigenic profiles of tumors: selection of number of clusters using Akaike's information criterion. Methods Inf Med. 1990;29:200–204. doi: 10.1055/s-0038-1634783
  • Sugar CA, James GM. Finding the number of clusters in a data set: an information theoretic approach. J Am Stat Assoc. 2003;98:750–763. doi: 10.1198/016214503000000666
  • Banfield JD, Raftery AE. Model-based Gaussian and non-Gaussian clustering. Biometrics. 1993;49:803–821. doi: 10.2307/2532201
  • Fraley C, Raftery AE. How many clusters? Which clustering method? Answers via model-based cluster analysis. Comput J. 1998;41:578–588. doi: 10.1093/comjnl/41.8.578
  • Krolak-Schwerdt S, Eckes T. A graph theoretic criterion for determining the number of clusters in a data set. Multivariate Behav Res. 1992;27:541–565. doi: 10.1207/s15327906mbr2704_3
  • Vassilliou A, Tambouratzis DG, Koutras MV, et al. A new similarity measure and its use in determining the number of clusters in a multivariate data set. Commun Stat Theory Method. 2004;33:1643–1666. doi: 10.1081/STA-120037266
  • Breckenridge JN. Replicating cluster analysis: method, consistency, and validity. Multivariate Behav Res. 1989;24:147–161. doi: 10.1207/s15327906mbr2402_1
  • Dudoit S, Fridlyand J. A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biol. 2002;3:1–21. doi: 10.1186/gb-2002-3-7-research0036
  • R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2011. ISBN 3-900051-07-0. Available from: http://www.R-project.org/.
  • Calinski T, Harabasz J. A dendrite method for cluster analysis. Commun Stat. 1974;3:1–27.
  • Krzanowski WJ, Lai YT. A criterion for determining the number of groups in a data set using sum of squares clustering. Biometrics. 1988;44:23–34. doi: 10.2307/2531893
  • Kaufman L, Rousseeuw PJ. Finding groups in data: an introduction to cluster analysis. New York: John Wiley & Sons; 1990.
  • Davies DL, Bouldin DW. A cluster separation measure. IEEE Trans Pattern Anal Mach Intell. 1979;2:224–227. doi: 10.1109/TPAMI.1979.4766909
  • Tibshirani R, Walther G, Hastie T. Estimating the number of clusters in a data set via the gap statistic. J Roy Stat Soc B. 2001;63:411–423. doi: 10.1111/1467-9868.00293
  • Sarle WS. Cubic clustering criterion. SAS Institute Inc, Cary, NC; 1983 (SAS Technical Report; A-108).
  • Ben-Hur A, Elisseeff A, Guyon I. A stability based method for discovering structure in clustered data. Pac Symp Biocomput. 2002;7:6–17.
  • Albatineh AN, Niewiadomska-Bugaj M. MCS: a method for finding the number of clusters. J Class. 2011a;28:184–209. doi: 10.1007/s00357-010-9069-1
  • Albatineh AN, Niewiadomska-Bugaj M. Correcting Jaccard and other similarity indices for chance agreement in cluster analysis. Adv Data Anal Class. 2011b;5:179–200. doi: 10.1007/s11634-011-0090-y
  • Jain AK, Dubes RC. Algorithms for clustering data. Englewood Cliffs (NJ): Prentice Hall; 1988.
  • Albatineh AN, Niewiadomska-Bugaj M, Mihalko DP. On similarity indices and correction for chance agreement. J Class. 2006;23:301–313. doi: 10.1007/s00357-006-0017-z
  • Rand WM. Objective criteria for the evaluation of clustering methods. J Am Stat Assoc. 1971;66:846–850. doi: 10.1080/01621459.1971.10482356
  • Jaccard P. The distribution of the flora of the alpine zone. New Phytol. 1912;11:37–50. doi: 10.1111/j.1469-8137.1912.tb05611.x
  • Albatineh AN. Means and variances for a class of similarity indices in cluster analysis. J Stat Plan Inference. 2010;140:2828–2838. doi: 10.1016/j.jspi.2010.03.005
  • Hubert L, Arabie P. Comparing partitions. J Class. 1985;2:193–218. doi: 10.1007/BF01908075
  • Morey L, Agresti A. The measurement of classification agreement: an adjustment to the Rand statistic for chance agreement. Educ Psychol Meas. 1984;44:33–37. doi: 10.1177/0013164484441003
  • Rogers DJ, Tanimoto TT. A computer program for classifying plants. Science. 1960;132:1115–1118. doi: 10.1126/science.132.3434.1115
  • Sokal RR, Sneath PHA. Principles of numerical taxonomy. San Francisco: W H Freeman; 1963.
  • Gower JC, Legendre P. Metric and Euclidean properties of dissimilarity coefficients. J Class. 1986;3:5–48. doi: 10.1007/BF01896809
  • Azzalini A, Bowman AW. A look at some data on the Old Faithful geyser. Appl Stat. 1990;39:357–365. doi: 10.2307/2347385
  • Batschelet E. Circular statistics in biology. London: Academic Press; 1981.
  • Fisher NI. Statistical analysis of circular data. Cambridge: Cambridge University Press; 1993.
  • Mardia KV, Jupp PE. Directional statistics. Chichester: John Wiley & Sons; 2000.
  • Lund U. Cluster analysis for directional data. Commun Stat Simul Comput. 1999;4:1001–1009. doi: 10.1080/03610919908813589
  • Yang MS, Pan JA. On fuzzy clustering of directional data. Fuzzy Sets Syst. 1997;91:319–326. doi: 10.1016/S0165-0114(96)00157-1
