Original Articles

DisCoSet: Discovery of Contrast Sets to Reduce Dimensionality and Improve Classification

Pages 1178-1191 | Received 30 Oct 2014, Accepted 13 Oct 2015, Published online: 13 Nov 2015
 

Abstract

Traditionally, contrast set mining aims to find a set of rules that best distinguish the instances of different user-defined groups. Contrast sets are conjunctions of attribute-value pairs that are significantly more frequent in one group than in the other groups. Typically, these contrast sets are extracted from categorical data or discretized numerical data. Existing rule-based contrast set methods require user-defined thresholds to select the contrast sets. In this paper, we propose a greedy algorithm, called DisCoSet, that incrementally finds a minimum set of local features that best distinguishes a class from the other classes without resorting to discretization. The discovered contrast sets considerably reduce the dimensionality of the feature vectors and significantly improve classification accuracy. We show, on different types of datasets, that the proposed algorithm reduces the dimensionality of class instances by 40%-97% of the original length while improving classification accuracy by 10%-24%.
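To make the traditional definition concrete, the following minimal Python sketch measures how much more frequent one conjunction of attribute-value pairs is in a target group than in any other group. It illustrates only the classical, categorical notion of a contrast set described above, not the DisCoSet algorithm itself. The function names `support_by_group` and `support_gap`, the toy records, and the group labels are all hypothetical, and the significance test that a real contrast set miner would apply is omitted.

```python
from collections import defaultdict

# Hypothetical toy records: each row is a dict of categorical attribute values.
ROWS = [
    {"degree": "phd", "job": "research"},
    {"degree": "phd", "job": "research"},
    {"degree": "msc", "job": "research"},
    {"degree": "phd", "job": "sales"},
    {"degree": "phd", "job": "research"},
    {"degree": "msc", "job": "sales"},
    {"degree": "msc", "job": "sales"},
    {"degree": "msc", "job": "research"},
]
# Hypothetical group label for each row, in the same order as ROWS.
GROUPS = ["A", "A", "A", "A", "B", "B", "B", "B"]


def support_by_group(rows, groups, itemset):
    """Return, per group, the fraction of rows matching every pair in itemset."""
    matched = defaultdict(int)
    total = defaultdict(int)
    for row, group in zip(rows, groups):
        total[group] += 1
        if all(row.get(attr) == value for attr, value in itemset.items()):
            matched[group] += 1
    return {group: matched[group] / total[group] for group in total}


def support_gap(rows, groups, itemset, target):
    """Support in the target group minus the highest support in any other group."""
    supports = support_by_group(rows, groups, itemset)
    others = [s for group, s in supports.items() if group != target]
    return supports[target] - max(others, default=0.0)


if __name__ == "__main__":
    candidate = {"degree": "phd", "job": "research"}
    print(support_by_group(ROWS, GROUPS, candidate))
    print(support_gap(ROWS, GROUPS, candidate, target="A"))
```

A large positive gap marks the conjunction as a candidate contrast set for the target group. Deciding how large is large enough is exactly the kind of user-defined threshold that the abstract says existing methods require.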
