ABSTRACT
In this study, a new perspective on the application of the clustering approach is proposed. The aim is to identify the values of the clustering parameters, including the choice of the algorithm itself, that reproduce as faithfully as possible a partition of the data which is known a priori. We discuss the motivation for such a reverse identification process and the interpretations that can be associated with it. The essential motivation is associated with, but not limited to, the primary objective of cluster analysis, i.e. gaining insight into the structure of the given data-set or family of data-sets. In view of the characteristics of the problem considered, we propose to use evolutionary strategies to carry out the reverse analysis. The concept and the feasibility of the proposed computational approach are illustrated by the analysis of an exemplary data-set. The preliminary results obtained are promising in both technical and cognitive terms.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1. We put aside here considerations involving the direct study of relations between the labels of xi associated with PA and the characteristics of xi, and we assume that the task at hand replaces such a study.
2. Even in the ‘absolute’ case, doubts may arise if the situation resembles that of multiple overlapping distributions.
3. Note that notions such as ‘reverse clustering’, ‘inverted clustering’, etc., which appear in the literature and suggest some sort of ‘backward’ procedure, actually refer to different kinds of problems, ranging from architectures of computational clusters to reasoning about the processes behind the data-set on the basis of the output from clustering (see e.g. D'haeseleer, Liang, and Somogyi 2000).