Original Articles

Performance-friendly rule extraction in large water data-sets with AOC posets and relational concept analysis

Pages 187-210 | Received 15 Jun 2014, Accepted 15 Dec 2014, Published online: 13 Jan 2016
 

Abstract

In this paper, we consider data analysis methods for knowledge extraction from large water data-sets. More specifically, we try to connect physico-chemical parameters with the characteristics of the taxa living at sample sites. Among these data analysis methods, we consider formal concept analysis (FCA), a recognized tool for classification and rule discovery on object–attribute data. Relational concept analysis (RCA) relies on FCA and deals with sets of object–attribute data provided with relations. RCA produces more informative results, but at the expense of increased complexity. Moreover, in numerous applications of FCA, the partially ordered set of concepts introducing attributes or objects (AOC poset, for Attribute–Object–Concept poset) is used rather than the concept lattice in order to reduce combinatorial problems. AOC posets are much smaller and easier to compute than concept lattices and still contain the information needed to rebuild the initial data. This paper introduces a variant of the RCA process based on AOC posets rather than concept lattices. This approach is compared with RCA based on iceberg lattices. Experiments are performed with various scaling operators, and a specific operator is introduced to deal with noisy data. We show that using AOC posets on water data-sets yields a reasonable number of concepts and allows us to extract meaningful implication rules (association rules whose confidence is 1), whose semantics depend on the chosen scaling operator.
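
To make the notions of AOC poset and implication rule concrete, the following is a minimal sketch on a toy formal context. It is not the RCA process described in the paper, nor the algorithm behind the AOC-poset-Builder tool: the site and attribute names are hypothetical, the functions `extent`, `intent`, `aoc_poset` and `implications` are illustrative helpers, and the naive enumeration only restates the standard definitions (an AOC poset keeps the concepts that introduce an object or an attribute; an implication is an association rule with confidence 1).

```python
# Toy formal context: each object (sample site) maps to its attributes.
# All names below are hypothetical and chosen only for illustration.
context = {
    "site1": {"low_nitrate", "taxon_A"},
    "site2": {"low_nitrate", "taxon_A", "taxon_B"},
    "site3": {"high_nitrate", "taxon_B"},
}


def attributes_of(ctx):
    """All attributes that appear in the context."""
    return set().union(*ctx.values())


def extent(ctx, attrs):
    """Objects that own every attribute in attrs."""
    return frozenset(o for o, owned in ctx.items() if set(attrs) <= owned)


def intent(ctx, objs):
    """Attributes shared by every object in objs."""
    if not objs:
        return frozenset(attributes_of(ctx))
    return frozenset(set.intersection(*(set(ctx[o]) for o in objs)))


def aoc_poset(ctx):
    """Concepts (extent, intent) introducing at least one object or attribute."""
    concepts = set()
    for o in ctx:  # object concepts: close the intent of each object
        i = intent(ctx, {o})
        concepts.add((extent(ctx, i), i))
    for a in attributes_of(ctx):  # attribute concepts: close each attribute's extent
        e = extent(ctx, {a})
        concepts.add((e, intent(ctx, e)))
    return concepts


def implications(ctx):
    """Single-premise rules a -> b that hold for every object owning a."""
    rules = set()
    for a in attributes_of(ctx):
        e = extent(ctx, {a})
        if not e:  # skip premises that no object supports
            continue
        for b in intent(ctx, e) - {a}:
            rules.add((a, b))
    return rules


if __name__ == "__main__":
    for ext, itt in sorted(aoc_poset(context), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
    for premise, conclusion in sorted(implications(context)):
        print(f"{premise} -> {conclusion}")
```

On this toy context the AOC poset holds four concepts, whereas the full concept lattice would also contain the top concept (all sites, no shared attribute) and the bottom concept (no site, all attributes), neither of which introduces an object or an attribute.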

Acknowledgements

We warmly acknowledge Karell Bertet for sharing part of her knowledge on implication rules and Alain Gutierrez (LIRMM) for the implementation of the AOC poset algorithm (http://www.lirmm.fr/AOC-poset-Builder/).

Notes

2 Which has no identical rows, no identical columns, no row which is the intersection of several other rows and no column which is the intersection of several other columns (Ganter and Wille 1999).

3 Elements of the lattice with a unique successor, considering ascending order in our diagram representations: the lowest elements are drawn below the greatest elements.

5 In the following both physical and physico-chemical parameters are called physico-chemical parameters.

6 Experiments were run on an Intel® Core™2 Duo CPU L9600 @ 2.13 GHz × 2 with 1.7 GiB of RAM.

Additional information

Funding

This work was funded by ANR11_MONU14.
