ABSTRACT
Process mining provides analytical tools and methods which can distil insights about process behaviour from big process-related data. Yet challenges relating to the impact of poor quality data on event logs, the input to process mining analyses, remain. Despite researchers raising concerns about event log data quality, event log preparation is, in practice, generally handled mechanistically, focusing on fixing symptoms rather than on uncovering the root causes of event log data quality issues. To address this, we introduce the Odigos (Greek for “guide”) framework. Based on semiotics and Peircean abductive reasoning, the Odigos framework facilitates an informed way of dealing with data quality issues in event logs. Odigos supports both prognostic (foreshadowing potential quality issues) and diagnostic (identifying root causes of discovered quality issues) approaches. We examine in depth how the framework supports a detailed root-cause analysis of a well-known collection of event log imperfection patterns.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1. Refers to process mining case studies that show a high level of maturity (methodological rigour) and evident consideration of the organisational context.
2. According to Danermark et al. (2001), to guide an explanatory research agenda, the nature of the phenomenon and the entities involved in its analysis should first be foregrounded. The theoretical framework proposed in this study provides this ontological foundation to guide researchers in analysing data quality in event logs.
3. Greek for “guide”
4. Note that, for the internal arrows in the figure, we have considered only the interactions towards the semiosis content, since in this study we are interested in understanding the creation and root causes of quality issues in event logs.
5. Different information systems, with different levels of automation, are included in this definition. In a fully automated environment, the role of process participant changes but is never diminished.
6. National Health Reform Agreement 2011
7. National Emergency Access Target (NEAT)
8. As we explain later, this framework does not present a complete list of data quality problems and their causes but supports an analytical approach for identifying and analysing those problems.