ABSTRACT
This paper proposes a modification of the Wu–Mendel approach to linguistic summarisation (LS) of datasets. The modification is a fuzzification-tuning technique that tunes the originally user-defined specification of the fuzzy linguistic variable for each attribute of the initial dataset. Applying the technique in LS significantly decreases the complexity of linguistic summaries, in terms of the number of rules and linguistic terms, without substantial loss of accuracy, as verified by an experimental analysis on several real-world datasets.
Disclosure statement
No potential conflict of interest was reported by the author.
Notes
1. Some researchers prefer the expression ‘classification and regression’ to ‘classification and prediction’, but we intend to keep the terminology close to that of Wu and Mendel (2011).
2. It should be noted that the experimental verification of the proposed fuzzification-tuning technique is carried out on the datasets with a single consequent attribute, so in (2) is equal to 1 in Section 4.
3. However, we do not claim the superiority of T1 over IT2 fuzzy logic. T1 fuzzy sets are used in this paper only to simplify the presentation of the proposed fuzzification-tuning technique, whose mathematical formulation is exactly the same for both types of fuzzy logic.
4. In general, there is no need to implement this technique within the Wu–Mendel approach itself. We apply it only in Section 4 in order to create an LS-based fuzzy inference system without conflicts in the rule base.
5. Attributes and the corresponding linguistic variables are denoted by the same symbols in this paper. For example, in (2) stands for both the attribute in and the corresponding linguistic variable.
6. The degree of overlapping is calculated by using only the values, so the other quality measures need not be considered in this example.
7. In this paper, the expression ‘adjoining term couple’ means a couple consisting of two adjoining LTs. It does not mean couples that are adjacent to each other.
8. We do not claim that the implemented technique for combining LTs is the only or the best possible one. Depending on the specific objectives, LTs can be combined in some other way: for example, by merging two triangular LTs into a trapezoidal one, or by using no triangular LTs at all. The current technique is chosen for its simplicity of both calculation and human perception.
9. Two attribute types are defined in the paper: ant. (antecedent) and con. (consequent).
10. The user-defined fuzzification of the initial dataset attributes is shown in –for the ‘Combined Cycle Power Plant’, ‘Banknote Authentication’ and ‘Skin Segmentation’ datasets, respectively.
11. The use of a quality measure as the rule weight is not novel; it has been done, for example, in Ishibuchi and Yamamoto (2005).
12. Because 10-fold cross-validation is applied, the testing RMSE values, the numbers of rules and the total numbers of linguistic terms reported in this subsection are mean values over the folds.
13. The same statement is valid for every other dataset in Subsection 4.2.
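
The per-fold averaging mentioned in note 12 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the paper: the function names and the dictionary keys `rmse`, `n_rules` and `n_terms` are hypothetical, and the fold split shown is a plain contiguous split without shuffling.

```python
import math
from statistics import mean


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous test folds of near-equal size.

    Assumption: no shuffling; the first (n_samples % k) folds get one extra sample.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def summarise_folds(fold_results):
    """Average the per-fold measures (hypothetical keys) into single mean values."""
    keys = ("rmse", "n_rules", "n_terms")
    return {key: mean(result[key] for result in fold_results) for key in keys}
```

For each fold, one would fit the summarisation on the remaining nine folds, evaluate `rmse` on the held-out fold, record the rule and term counts, and pass the ten resulting records to `summarise_folds`.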