Discovering expressive process models by clustering log traces

Cited by: 199
Authors
Greco, Gianluigi
Guzzo, Antonella
Pontieri, Luigi
Sacca, Domenico
Affiliations
[1] Univ Calabria, Dept Math, I-87036 Arcavacata Di Rende, CS, Italy
[2] Univ Calabria, CNR, ICAR, Inst High Performance Comp & Networks, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Calabria, DEIS, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
process mining; data mining; workflow management; clustering; classification; association rules
DOI
10.1109/TKDE.2006.123
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Process mining techniques have recently received notable attention in the literature for their ability to assist in the (re)design of complex processes by automatically discovering models that explain the events registered in some log traces provided as input. Following this line of research, the paper investigates an extension of such basic approaches, where the identification of different variants for the process is explicitly accounted for, based on the clustering of log traces. Indeed, modeling each group of similar executions with a different schema allows us to single out "conformant" models, which, specifically, minimize the number of modeled enactments that are extraneous to the process semantics. Therefore, a novel process mining framework is introduced and some relevant computational issues are deeply studied. As finding an exact solution to such an enhanced process mining problem is proven to require high computational costs, in most practical cases, a greedy approach is devised. This is founded on an iterative, hierarchical refinement of the process model, where, at each step, traces sharing similar behavior patterns are clustered together and equipped with a specialized schema. The algorithm guarantees that each refinement leads to an increasingly sound model, thus attaining a monotonic search. Experimental results evidence the validity of the approach with respect to both effectiveness and scalability.
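The greedy, hierarchical refinement described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a directly-follows graph as the "schema", counts extraneous enactments as the bounded-length paths the graph accepts but the cluster does not contain, and splits a cluster on a single directly-follows pair instead of clustering on mined behavior patterns. All function names (`dfg`, `language`, `extraneous`, `split`, `refine`) are hypothetical.

```python
def dfg(traces):
    """Assumed schema: start activities, end activities, directly-follows edges."""
    starts = {t[0] for t in traces}
    ends = {t[-1] for t in traces}
    edges = {(a, b) for t in traces for a, b in zip(t, t[1:])}
    return starts, ends, edges


def language(model, max_len):
    """All start-to-end paths of at most max_len activities accepted by the model."""
    starts, ends, edges = model
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    accepted = set()
    frontier = [(s,) for s in starts]
    while frontier:
        path = frontier.pop()
        if path[-1] in ends:
            accepted.add(path)
        if len(path) < max_len:
            frontier.extend(path + (b,) for b in succ.get(path[-1], ()))
    return accepted


def extraneous(traces, max_len):
    """Number of modeled enactments that never occur in the cluster's traces."""
    return len(language(dfg(traces), max_len) - set(traces))


def split(traces):
    """Split on the directly-follows pair that divides the cluster most evenly."""
    counts = {}
    for t in traces:
        for pair in set(zip(t, t[1:])):
            counts[pair] = counts.get(pair, 0) + 1
    candidates = [p for p, n in counts.items() if n < len(traces)]
    if not candidates:
        return None  # every trace exhibits the same pairs; nothing to split on
    best = min(candidates, key=lambda p: (abs(2 * counts[p] - len(traces)), p))
    with_pair = [t for t in traces if best in set(zip(t, t[1:]))]
    without_pair = [t for t in traces if best not in set(zip(t, t[1:]))]
    return with_pair, without_pair


def refine(log, max_clusters, max_len=None):
    """Greedily split the least conformant cluster until the budget is reached."""
    log = sorted(set(log))  # traces are non-empty tuples of activity names
    max_len = max_len or max(len(t) for t in log)
    clusters = [log]
    while len(clusters) < max_clusters:
        worst = max(clusters, key=lambda c: extraneous(c, max_len))
        if extraneous(worst, max_len) == 0:
            break  # every cluster is already modeled exactly
        parts = split(worst)
        if parts is None:
            break
        clusters.remove(worst)
        clusters.extend(parts)
    return clusters


if __name__ == "__main__":
    log = [("a", "b", "c", "d"), ("a", "c", "b", "d")]
    for cluster in refine(log, max_clusters=2):
        print(cluster, extraneous(cluster, 4))
```

In this sketch each sub-cluster's graph is a subgraph of its parent's, so the set of extraneous paths can only shrink with each split. That mirrors, under these simplifying assumptions, the monotonic refinement the abstract claims for the actual method.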
Pages: 1010-1027
Page count: 18