Abstract
Two novel approaches to triclustering of three-way binary data are proposed. A tricluster is defined as a dense subset of a ternary relation Y defined on sets of objects, attributes, and conditions, or, equivalently, as a dense submatrix of the adjacency matrix of the ternary relation Y. This definition is a scalable relaxation of the notion of triconcept in Triadic Concept Analysis, in which each triconcept of the initial dataset is contained in a certain tricluster. This approach generalizes the one previously introduced for concept-based biclustering. We also propose a hierarchical spectral triclustering algorithm for mining dense submatrices of the adjacency matrix of the initial ternary relation Y. Finally, we describe some applications of the proposed techniques, compare the proposed approaches, and study their performance in a series of experiments with real datasets.
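To make the notion of a dense subset concrete, the following minimal sketch computes the density of a candidate tricluster as the fraction of its object-attribute-condition cells that belong to Y. This density measure, the function name `tricluster_density`, and the toy relation are illustrative assumptions and are not taken from the abstract.

```python
from itertools import product


def tricluster_density(relation, objects, attributes, conditions):
    """Fraction of triples in objects x attributes x conditions that lie in the relation.

    Assumption: density is taken as (filled cells) / (box volume); the
    abstract itself only says "dense" without fixing a formula.
    """
    volume = len(objects) * len(attributes) * len(conditions)
    if volume == 0:
        return 0.0
    hits = sum(
        (g, m, b) in relation
        for g, m, b in product(objects, attributes, conditions)
    )
    return hits / volume


# Hypothetical toy ternary relation Y as a set of (object, attribute, condition) triples.
Y = {
    ("u1", "t1", "r1"), ("u1", "t1", "r2"),
    ("u1", "t2", "r1"), ("u1", "t2", "r2"),
    ("u2", "t1", "r1"), ("u2", "t1", "r2"),
    ("u2", "t2", "r1"),
}

# A 2 x 2 x 2 box with one missing triple: dense, but not completely filled.
density = tricluster_density(Y, {"u1", "u2"}, {"t1", "t2"}, {"r1", "r2"})

# A completely filled box has density 1.
full_box_density = tricluster_density(Y, {"u1"}, {"t1", "t2"}, {"r1", "r2"})
```

In this toy example the first box contains 7 of its 8 possible triples, while the second is completely filled. Tolerating boxes that are not completely filled is what the relaxation from triconcepts to triclusters amounts to under this assumed measure.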
Acknowledgments
The study was carried out within the framework of the Basic Research Program at the National Research University Higher School of Economics in 2012 and in the Laboratory of Intelligent Systems and Structural Analysis. At the early stage, this work was also partially supported by the Russian Foundation for Basic Research, project No. 08-07-92497-NTSNIL_a. We would like to thank our colleagues Paul Elzinga (Amsterdam–Amstelland Police), who allowed us to experiment with the police data, Ruslan Magizov (NRU HSE) for his programming work at the early stages of the research, and Zarina Sekinaeva, who developed the program for the SpecTric algorithm as a part of her master’s project. We are especially thankful to Dmitry Gnatyshak for his programming and experimental support and to Dominik Ślęzak for his patience and willingness to constantly stimulate us during the research.