Academic Paper

Process Duration Modelling and Concept Drift Detection for Business Process Mining
Document Type
Conference
Source
2021 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/IOP/SCI), pp. 653-658, Oct. 2021
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Robotics and Control Systems
Signal Processing and Analysis
Transportation
Histograms
Technological innovation
Smart cities
Hospitals
Decision making
Process control
Mixture models
Business process
Process duration
Gamma mixture model
EM algorithm
Concept drift
Language
Abstract
Customer behaviour within business processes can change over time, which complicates market understanding and decision making. Detecting such changes, also referred to as concept drift, can provide insight into the evolution of the business environment, offer opportunities for model refinement and enable target-oriented services that improve customer satisfaction. Concept drift in the control-flow perspective has been extensively studied, but there is a research gap in detecting drift in process duration. In this paper, we use gamma mixture models (GMMs) fitted with an expectation-maximization (EM) algorithm to model process durations, and then detect variations in their histograms, densities and cumulative distributions. Specifically, three metrics are used to evaluate how much the process durations differ: the overall difference in back-to-back histograms, the Kullback-Leibler (KL) divergence and the maximum difference in cumulative distributions. Furthermore, three corresponding statistical tests, namely the multinomial test, the log-likelihood ratio (LLR) test and the Kolmogorov-Smirnov (KS) test, are applied to determine whether the differences are statistically significant. The approach is applied to a public real-life hospital billing process, where two occurrences of concept drift are discovered. The main contribution of this paper is an approach for detecting changes in process duration.
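The three difference metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it works on raw duration samples from two time windows rather than on fitted gamma mixture models, it takes the "overall difference" between histograms to be the sum of absolute bin-wise differences (an assumption, since the abstract does not define it), and all function names, bin edges and sample values are hypothetical.

```python
import math
from bisect import bisect_right


def histogram(sample, edges):
    """Relative frequencies of `sample` over the bins defined by `edges`.

    Out-of-range values are clamped into the first or last bin.
    """
    counts = [0] * (len(edges) - 1)
    for x in sample:
        i = bisect_right(edges, x) - 1
        i = max(0, min(i, len(counts) - 1))
        counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]


def histogram_difference(p, q):
    """Sum of absolute bin-wise differences (assumed definition)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q); eps guards against empty bins in q."""
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(p, q) if a > 0)


def ks_statistic(x, y):
    """Maximum absolute difference between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
        for v in xs + ys
    )


# Hypothetical process durations (e.g. in days) from two time windows.
window_a = [1, 2, 3, 4]
window_b = [3, 4, 5, 6]
edges = [0, 2, 4, 6]

p = histogram(window_a, edges)
q = histogram(window_b, edges)
print(histogram_difference(p, q), kl_divergence(p, q), ks_statistic(window_a, window_b))
```

Each metric would then be paired with its significance test (multinomial, LLR and KS respectively) to decide whether the observed difference indicates drift; those tests are not shown here.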