Original Article

Modification of the Wu-Mendel approach for linguistic summarization

Pages 77-97 | Received 20 Mar 2018, Accepted 19 Aug 2018, Published online: 03 Oct 2018
 

ABSTRACT

This paper presents a versatile modification of the Wu-Mendel approach for linguistic summarisation. It concentrates on eliminating the following main drawbacks of the approach: significant user/expert involvement, a significant and rapidly increasing computational cost, and some flaws in the quality measure formulae. The techniques and formulae proposed to eliminate these drawbacks are verified experimentally on several real-world datasets. The results demonstrate the increased efficiency and effectiveness of the modified Wu-Mendel approach for linguistic summarisation.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1. In fact, there are a few more shortcomings that are not considered in this paper, for example the lack of linguistic summary optimisation techniques and the lack of techniques that make the Wu-Mendel approach incremental.

2. There is no explicit explanation of how to deal with conflicting rules in Wu and Mendel (2011), so we use the technique implemented in the Wang-Mendel method (Wang & Mendel, 1992), which is mentioned in Wu and Mendel (2011) itself.

3. For example, LS by ES of the whole ‘SUSY’ dataset (Whiteson, 2014b) with each antecedent attribute declared as a variable having only two fuzzy terms would take a couple of weeks (approximate experiment-based estimation obtained by the authors) to be performed on the current operating machine.

4. In this paper, the size of initial data is equated to the number of records in the dataset. However, we admit that the size and complexity of data depend not only (and not even primarily) on the number of records but also on the number of attributes.

5. $(\hat{K}_i, \hat{L}_j)$ is $(K_i, L_j)$ after pruning low-support term sets.

6. The use of a quality measure as the rule weight is not novel; it is done, for example, in Ishibuchi and Yamamoto (2005).

7. The 10 training and testing subsets of each considered initial dataset are the same in all subsections of section 5.

8. In some cases, linguistic summaries obtained with user-defined rcmax and rcmin contain too few rules or no rules at all. For example, LS of the ‘Combined Cycle Power Plant’ with rcmax=0.15 and rcmin=0.02 results in obtaining 0 rules having R=1 ().

9. The reasonableness of this choice is substantiated in Subsection on ‘Verification of the proposed rcmax and rcmin formulae’.

10. The preceding Wu-Mendel approach is the approach that implements the original quality measure formulae but computes rcmax and rcmin using the proposed formulae (13), (14), and (15).

11. The reasonableness of this choice is substantiated in Subsection on ‘Verification of the proposed quality measures formulae’.

12. Since the clustering in DRC is applied to data not during but before LS, clustering time is not considered as a part of LS time and not included in the corresponding figures. For reference, the DRC implementation on a ‘Skin Segmentation’ training subset took approximately 31–36 s, on a ‘HTRU 2’ training subset – 3–4 s.

13. Since the computational cost of LS in this paper depends on the size of initial data and the number of processed rules (as stated in Subsection on ‘Computational cost reduction techniques’), the proposed computational cost reduction techniques should be verified on the datasets of a relatively large size (the ‘Skin Segmentation’) and/or having a relatively extensive number of the corresponding rules processed by the ES technique (the ‘HTRU 2’).

14. A brief description of the corresponding issues is given in Subsection on ‘The investigation of user/expert involvement issue in the Wu-Mendel approach’.

15. The same statement is valid for every other dataset in Subsection on ‘The datasets used in experimental verification’.
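
Note 2 refers to the conflict-resolution step of the Wang-Mendel method. The following is a minimal sketch of that step, not the authors' implementation. It assumes that each candidate rule already carries a degree (in Wang and Mendel's formulation, the product of the membership grades of its terms) and that, among rules sharing the same antecedent, only the rule with the highest degree is kept. The names `Rule`, `resolve_conflicts`, and the sample rules are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: (antecedent terms, consequent term, rule degree).
Antecedent = Tuple[str, ...]
Rule = Tuple[Antecedent, str, float]


def resolve_conflicts(rules: Iterable[Rule]) -> Dict[Antecedent, Tuple[str, float]]:
    """Keep, for each antecedent, only the rule with the highest degree."""
    best: Dict[Antecedent, Tuple[str, float]] = {}
    for antecedent, consequent, degree in rules:
        current = best.get(antecedent)
        # A conflict occurs when the antecedent was already seen;
        # the rule with the larger degree wins.
        if current is None or degree > current[1]:
            best[antecedent] = (consequent, degree)
    return best


# Illustrative data: the first two rules conflict (same antecedent).
candidate_rules = [
    (("low", "high"), "small", 0.42),
    (("low", "high"), "large", 0.63),
    (("high", "high"), "large", 0.80),
]
resolved = resolve_conflicts(candidate_rules)
```

Here the two rules with antecedent `("low", "high")` conflict, and only the one with degree 0.63 survives. Ties are resolved in favour of the rule seen first, which is an arbitrary choice made for this sketch.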
