Research Articles

Root-cause analysis of process-data quality problems

Pages 51-75 | Received 27 Nov 2020, Accepted 18 Jun 2021, Published online: 31 Aug 2021
 

ABSTRACT

Process mining provides analytical tools and methods which can distil insights about process behaviour from big process-related data. Yet challenges relating to the impact of poor quality data on event logs, the input to process mining analyses, remain. Despite researchers raising concerns about event log data quality, event log preparation is, in practice, generally handled mechanistically, focusing on fixing symptoms rather than on uncovering the root causes of event log data quality issues. To address this, we introduce the Odigos (Greek for “guide”) framework. Based on semiotics and Peircean abductive reasoning, the Odigos framework facilitates an informed way of dealing with data quality issues in event logs. Odigos supports both prognostic (foreshadowing potential quality issues) and diagnostic (identifying root causes of discovered quality issues) approaches. We examine in depth how the framework supports a detailed root-cause analysis of a well-known collection of event log imperfection patterns.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

1. Refers to a high level of maturity (methodological rigour) and a consideration of the organisational context being evident in process mining case studies.

2. According to Danermark et al. (2001), to guide the explanatory research agenda, the nature of the phenomenon and the entities involved in its analysis should first be foregrounded. The theoretical framework proposed in this study provides this ontological foundation to guide researchers in analysing data quality in event logs.

3. Greek for “guide”

4. Note that, for the internal arrows in , we have considered only the interactions towards the semiosis content, since in this study we are interested in understanding the creation and root causes of quality issues in the event logs.

5. Different information systems, with different levels of automation, are included in this definition. In a fully automated environment, the role of process participant changes but is never diminished.

6. National Health Reform Agreement 2011

7. National Emergency Access Target (NEAT)

8. As we explain later, this framework does not present a complete list of data quality problems and their causes, but supports an analytical approach for identifying and analysing those problems.
