References
- Abualigah LM, Khader AT, Al-Betar MA, et al. Text feature selection with a robust weight scheme and dynamic dimension reduction to text document clustering. Expert Syst Appl. 2017;84:24–36.
- Bharti KK, Singh PK. Chaotic gradient artificial bee colony for text clustering. Soft Comput. 2016;20(3):1113–1126.
- Karaa WBA, Ashour AS, Sassi DB. Medline text mining: an enhancement genetic algorithm based approach for document clustering. In: Hassanien AE, Grosan C, Tolba MF, editors. Applications of intelligent optimization in biology and medicine. Springer; 2016. p. 267–287.
- Abualigah LM, Khader AT. Unsupervised text feature selection technique based on hybrid particle swarm optimization algorithm with genetic operators for the text clustering. J Supercomput. 2017;73:4773–4795.
- Karol S, Mangat V. Evaluation of text document clustering approach based on particle swarm optimization. Open Comput Sci. 2013;3(2):69–90.
- Kanungo T, Mount DM, Netanyahu NS, et al. An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans Pattern Anal Mach Intell. 2002;24(7):881–892.
- Cutting DR, Karger DR, Pedersen JO. Scatter/gather: a cluster-based approach to browsing large document collections. ACM SIGIR forum. New York, USA, Vol. 51. ACM; 2017. p. 148–159.
- Aggarwal CC, Zhai CX. Mining text data. New York, USA: Springer; 2012.
- Kim HK, Kim H, Cho S. Bag-of-concepts: comprehending document representation through clustering words in distributed representation. Neurocomputing. 2017;266:336–352.
- Vijayarani S, Ilamathi J, Nithya S. Preprocessing techniques for text mining-an overview. Int J Comput Sci Commun Netw. 2015;5(1):7–16.
- Chen C-L, Tseng FSC, Liang T. An integration of wordnet and fuzzy association rule mining for multi-label document clustering. Data Knowl Eng. 2010;69(11):1208–1226.
- Beil F, Ester M, Xu X. Frequent term-based text clustering. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada: ACM; July 2002. p. 436–442.
- Mugunthadevi K, Punitha SC, Punithavalli M, et al. Survey on feature selection in document clustering. Int J Comput Sci Eng. 2011;3(3):1240–1241.
- Li M, Zhang L. Multinomial mixture model with feature selection for text clustering. Knowl Based Syst. 2008;21(7):704–708.
- Park S, An DU, Cheon CI. Document clustering method using weighted semantic features and cluster similarity. 2010 Third IEEE International Conference on Digital Game and Intelligent Toy Enhanced Learning. IEEE; Kaohsiung, Taiwan, 2010. p. 185–187.
- Grossman DA, Frieder O. Information retrieval: algorithms and heuristics. Vol. 15. New York: Springer Science & Business Media; 2012.
- Singh KN, Devi HM, Mahanta AK. Document representation techniques and their effect on the document clustering and classification: a review. Int J Adv Res Comput Sci. 2017;8(5):1780–1784.
- Abraham A, Das S, Konar A. Document clustering using differential evolution. IEEE Congress on Evolutionary Computation, Vancouver, Canada, CEC 2006. IEEE; 2006. p. 1784–1791.
- Bisht S, Paul A. Document clustering: a review. Int J Comput Appl. 2013;73(11):26–33.
- Abualigah LM, Khader AT, Hanandeh ES. A combination of objective functions and hybrid krill herd algorithm for text document clustering analysis. Eng Appl Artif Intell. 2018;73:111–125.
- Alshamiri AK, Singh A, Surampudi BR. Artificial bee colony algorithm for clustering: an extreme learning approach. Soft Comput. 2016;20(8):3163–3176.
- Forsati R, Mahdavi M, Shamsfard M, et al. Efficient stochastic algorithms for document clustering. Inf Sci. 2013;220:269–291.
- Ranjan R, Sahoo G. A new clustering approach for anomaly intrusion detection. arXiv preprint arXiv:1404.2772; 2014.
- Li M, Deng S, Wang L, et al. Hierarchical clustering algorithm for categorical data using a probabilistic rough set model. Knowl Based Syst. 2014;65:60–71.
- Celebi ME, Kingravi HA, Vela PA. A comparative study of efficient initialization methods for the k-means clustering algorithm. Expert Syst Appl. 2013;40(1):200–210.
- Kim Y, Shim K, Kim M-S, et al. Dbcure-mr: an efficient density-based clustering algorithm for large data using mapreduce. Inf Syst. 2014;42:15–35.
- Popat SK, Emmanuel M. Review and comparative study of clustering techniques. Int J Comput Sci Inf Technol. 2014;5(1):805–812.
- Duwairi R, Abu-Rahmeh M. A novel approach for initializing the spherical k-means clustering algorithm. Simul Model Pract Theory. 2015;54:49–63.
- Han J, Kamber M, Pei J. Data mining: concepts and techniques. Waltham, USA: Morgan Kaufmann; 2006.
- Cimiano P, Hotho A, Staab S. Comparing conceptual, divisive and agglomerative clustering for learning taxonomies from text. ECAI. Valencia, Spain, Vol. 16; 2004. p. 435.
- Xu R, Xu J, Wunsch DC. Clustering with differential evolution particle swarm optimization. 2010 IEEE Congress on Evolutionary Computation (CEC). Barcelona, Spain: IEEE; 2010. p. 1–8.
- Nanda SJ, Panda G. A survey on nature inspired metaheuristic algorithms for partitional clustering. Swarm Evol Comput. 2014;16:1–18.
- Cui X, Potok TE. Document clustering analysis based on hybrid PSO+K-means algorithm. J Comput Sci. 2005;27:33.
- Guha S, Mishra N. Clustering data streams. In: Data stream management. Springer, Berlin, Heidelberg; 2016. p. 169–187.
- Wang J, Su X. An improved k-means clustering algorithm. IEEE 3rd International Conference on Communication Software and Networks (ICCSN). IEEE; Xi'an, China, 2011. p. 44–46.
- Velmurugan T. Performance based analysis between k-means and fuzzy c-means clustering algorithms for connection oriented telecommunication data. Appl Soft Comput. 2014;19:134–146.
- Pujari AK. Data mining techniques. Hyderabad, India: Universities Press; 2001.
- Firdaus S, Uddin A. A survey on clustering algorithms and complexity analysis. Int J Comput Sci Issues. 2015;12(2):62.
- Banerjee A, Merugu S, Dhillon IS, et al. Clustering with bregman divergences. J Mach Learn Res. 2005;6:1705–1749.
- Chawla S, Gionis A. k-means--: a unified approach to clustering and outlier detection. Proceedings of the 2013 SIAM International Conference on Data Mining. SIAM; Austin, Texas, 2013. p. 189–197.
- Song W, Li CH, Park SC. Genetic algorithm for text clustering using ontology and evaluating the validity of various semantic similarity measures. Expert Syst Appl. 2009;36(5):9095–9104.
- Song W, Qiao Y, Park SC, et al. A hybrid evolutionary computation approach with its application for optimizing text document clustering. Expert Syst Appl. 2015;42(5):2517–2524.
- Sridhar S, Dunham MH. Data mining, introduction and advanced topics. New Delhi, India: Prentice Hall Publication; 2013.
- Saravanan D, Srinivasan S. Video data mining information retrieval using birch clustering technique. Artificial Intelligence and Evolutionary Algorithms in Engineering Systems. Springer; Kumaracoil, India, 2015. p. 583–594.
- Mansoori EG. Gach: a grid-based algorithm for hierarchical clustering of high-dimensional data. Soft Comput. 2014;18(5):905–922.
- Ferrari DG, De Castro LN. Clustering algorithm selection by meta-learning systems: a new distance-based problem characterization and ranking combination methods. Inf Sci. 2015;301:181–194.
- Xu D, Tian Y. A comprehensive survey of clustering algorithms. Ann Data Sci. 2015;2(2):165–193.
- Agnihotri D, Verma K, Tripathi P. Pattern and cluster mining on text data. 2014 Fourth International Conference on Communication Systems and Network Technologies (CSNT). IEEE; Bhopal, India, 2014. p. 428–432.
- Sneath P, Sokal R. Unweighted pair group method with arithmetic mean. In: Numerical taxonomy. Springer, Berlin; 1973. p. 230–234.
- Wang X, Qian B, Davidson I. On constrained spectral clustering and its applications. Data Min Knowl Discov. 2014;28:1–30.
- Peng B, Zhang L, Zhang D. A survey of graph theoretical approaches to image segmentation. Pattern Recognit. 2013;46(3):1020–1038.
- Karaboga D, Ozturk C. A novel clustering approach: artificial bee colony (ABC) algorithm. Appl Soft Comput. 2011;11(1):652–657.
- Forsati R, Keikha A, Shamsfard M. An improved bee colony optimization algorithm with an application to document clustering. Neurocomputing. 2015;159:9–26.
- Gomaa WH, Fahmy AA. A survey of text similarity approaches. Int J Comput Appl. 2013;68(13):13–18.
- Maroosi A, Amiri B. A new clustering algorithm based on hybrid global optimization based on a dynamical systems approach algorithm. Expert Syst Appl. 2010;37(8):5645–5652.
- Xiang S, Nie F, Zhang C. Learning a mahalanobis distance metric for data clustering and classification. Pattern Recognit. 2008;41(12):3600–3612.
- Bharti KK, Singh PK. Hybrid dimension reduction by integrating feature selection with feature extraction method for text clustering. Expert Syst Appl. 2015;42(6):3105–3114.
- Das S, Abraham A, Konar A. Automatic clustering using an improved differential evolution algorithm. IEEE Trans Syst Man Cybernet A. 2008;38(1):218–237.
- Williams RJ, Volberg RA. The classification accuracy of four problem gambling assessment instruments in population research. Int Gambl Stud. 2014;14(1):15–28.
- Ma B, Yuan H, Wu Y. Exploring performance of clustering methods on document sentiment analysis. J Inf Sci. 2017;43(1):54–74.
- Hai Z, Chang K, Kim J-J, et al. Identifying features in opinion mining via intrinsic and extrinsic domain relevance. IEEE Trans Knowl Data Eng. 2014;26(3):623–634.
- Yau C-K, Porter A, Newman N, et al. Clustering scientific documents with topic modeling. Scientometrics. 2014;100(3):767–786.
- Jin C, Jin S. Automatic image annotation using feature selection based on improving quantum particle swarm optimization. Signal Processing. 2015;109:172–181.
- Manning CD, Raghavan P, Schütze H. Introduction to information retrieval. Vol. 1. Cambridge: Cambridge University Press; 2008.
- Mustafi D, Sahoo G, Mustafi A. A multi criteria document clustering approach using genetic algorithm. In: Computational intelligence in data mining. Vol. 1. Springer; New Delhi, 2016. p. 237–247.
- Munková D, Munk M, Vozár M. Data pre-processing evaluation for text mining: transaction/sequence model. Procedia Comput Sci. 2013;18:1198–1207.
- Wei T, Lu Y, Chang H, et al. A semantic approach for text clustering using wordnet and lexical chains. Expert Syst Appl. 2015;42(4):2264–2275.
- Witten IH, Frank E, Hall MA. Data mining: practical machine learning tools and techniques. San Francisco, USA: Morgan Kaufmann; 2016.
- Dagher GG, Fung BCM. Subject-based semantic document clustering for digital forensic investigations. Data Knowl Eng. 2013;86:224–241.
- Feinerer I, Buchta C, Geiger W, et al. The textcat package for n-gram based text categorization in R. J Stat Softw. 2013;52(6):1–17.
- Graovac J. A variant of n-gram based language-independent text categorization. Intell Data Anal. 2014;18(4):677–695.
- Salton G, Wong A, Yang C-S. A vector space model for automatic indexing. Commun ACM. 1975;18(11):613–620.
- Nasir JA, Varlamis I, Karim A, et al. Semantic smoothing for text clustering. Knowl Based Syst. 2013;54:216–229.
- Grossman DA, Frieder O. Information retrieval: algorithms and heuristics. Vol. 15. New York: Springer; 2004.
- Haddi E, Liu X, Shi Y. The role of text pre-processing in sentiment analysis. Procedia Comput Sci. 2013;17:26–32.
- Shlens J. A tutorial on principal component analysis. Vol. 82. San Diego: Systems Neurobiology Laboratory, University of California at San Diego; 2005.
- Chua FCT. Dimensionality reduction and clustering of text documents. Singapore: Singapore Management University; 2009.
- Deerwester S, Dumais ST, Furnas GW, et al. Indexing by latent semantic analysis. J Am Soc Inf Sci. 1990;41(6):391.
- Rana C, Jain SK. An evolutionary clustering algorithm based on temporal features for dynamic recommender systems. Swarm Evol Comput. 2014;14:21–30.
- Aliguliyev RM. Clustering techniques and discrete particle swarm optimization algorithm for multi-document summarization. Comput Intell. 2010;26(4):420–448.
- Maulik U, Bandyopadhyay S. Genetic algorithm-based clustering technique. Pattern Recognit. 2000;33(9):1455–1465.
- Premalatha K, Natarajan AM. Genetic algorithm for document clustering with simultaneous and ranked mutation. Mod Appl Sci. 2009;3(2):75–82.
- Denoeux T, Kanjanatarakul O, Sriboonchitta S. Ek-nnclus: a clustering procedure based on the evidential k-nearest neighbor rule. Knowl Based Syst. 2015;88:57–69.
- Chou C-H, Hsieh Y-Z, Su M-C, et al. A new measure of cluster validity using line symmetry. J Inf Sci Eng. 2014;30(2):443–461.
- Voorhees EM. The cluster hypothesis revisited. ACM SIGIR forum. Japan, Vol. 51. ACM; 2017. p. 35–43.
- Caballero R, Laguna M, Martí R, et al. Multiobjective clustering with metaheuristic optimization technology. Valencia, España: Departamento de Estadística e Investigación Operativa, Universidad de Valencia; 2006. (Reporte Técnico).
- Arbelaitz O, Gurrutxaga I, Muguerza J, et al. An extensive comparative study of cluster validity indices. Pattern Recognit. 2013;46(1):243–256.