Process mining: A two-step approach to balance between underfitting and overfitting

Publication: Research - peer-review; Journal article – Annual report year: 2010

Standard

Process mining: A two-step approach to balance between underfitting and overfitting. / van der Aalst, W.M.P.; Rubin, V.; Verbeek, H.M.W.; van Dongen, B.F.; Kindler, Ekkart; Günther, C.W.

In: Journal of Software and Systems Modeling, Vol. 9, No. 1, 2010, p. 87-111.

Harvard

van der Aalst, WMP, Rubin, V, Verbeek, HMW, van Dongen, BF, Kindler, E & Günther, CW 2010, 'Process mining: A two-step approach to balance between underfitting and overfitting', Journal of Software and Systems Modeling, vol. 9, no. 1, pp. 87-111. DOI: 10.1007/s10270-008-0106-z

APA

van der Aalst, W. M. P., Rubin, V., Verbeek, H. M. W., van Dongen, B. F., Kindler, E., & Günther, C. W. (2010). Process mining: A two-step approach to balance between underfitting and overfitting. Journal of Software and Systems Modeling, 9(1), 87-111. 10.1007/s10270-008-0106-z

CBE

van der Aalst WMP, Rubin V, Verbeek HMW, van Dongen BF, Kindler E, Günther CW. 2010. Process mining: A two-step approach to balance between underfitting and overfitting. Journal of Software and Systems Modeling. 9(1):87-111. Available from: 10.1007/s10270-008-0106-z

Vancouver

van der Aalst WMP, Rubin V, Verbeek HMW, van Dongen BF, Kindler E, Günther CW. Process mining: A two-step approach to balance between underfitting and overfitting. Journal of Software and Systems Modeling. 2010;9(1):87-111. Available from: 10.1007/s10270-008-0106-z

Author

van der Aalst, W.M.P.; Rubin, V.; Verbeek, H.M.W.; van Dongen, B.F.; Kindler, Ekkart; Günther, C.W. / Process mining: A two-step approach to balance between underfitting and overfitting.

In: Journal of Software and Systems Modeling, Vol. 9, No. 1, 2010, p. 87-111.

Bibtex

@article{2e04344c0e9f4f4892ef85b3f3d1c4d2,
title = "Process mining: A two-step approach to balance between underfitting and overfitting",
publisher = "Springer",
author = "{van der Aalst}, W.M.P. and V. Rubin and H.M.W. Verbeek and {van Dongen}, B.F. and Ekkart Kindler and C.W. Günther",
year = "2010",
doi = "10.1007/s10270-008-0106-z",
volume = "9",
number = "1",
pages = "87--111",
journal = "Journal of Software and Systems Modeling",
issn = "1619-1366",

}

RIS

TY - JOUR

T1 - Process mining

T2 - A two-step approach to balance between underfitting and overfitting

A1 - van der Aalst,W.M.P.

A1 - Rubin,V.

A1 - Verbeek,H.M.W.

A1 - van Dongen,B.F.

A1 - Kindler,Ekkart

A1 - Günther,C.W.

AU - van der Aalst,W.M.P.

AU - Rubin,V.

AU - Verbeek,H.M.W.

AU - van Dongen,B.F.

AU - Kindler,Ekkart

AU - Günther,C.W.

PB - Springer

PY - 2010

Y1 - 2010

N2 - Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such “overfitting” by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are “overfitting” (allow only for what has actually been observed) while other parts may be “underfitting” (allow for much more behavior without strong support for it). None of the existing techniques enables the user to control the balance between “overfitting” and “underfitting”. To address this, we propose a two-step approach. First, using a configurable approach, a transition system is constructed. Then, using the “theory of regions”, the model is synthesized. The approach has been implemented in the context of ProM and overcomes many of the limitations of traditional approaches.

AB - Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such “overfitting” by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are “overfitting” (allow only for what has actually been observed) while other parts may be “underfitting” (allow for much more behavior without strong support for it). None of the existing techniques enables the user to control the balance between “overfitting” and “underfitting”. To address this, we propose a two-step approach. First, using a configurable approach, a transition system is constructed. Then, using the “theory of regions”, the model is synthesized. The approach has been implemented in the context of ProM and overcomes many of the limitations of traditional approaches.

UR - http://www.springerlink.com/content/u43v780550278h4l/

U2 - 10.1007/s10270-008-0106-z

DO - 10.1007/s10270-008-0106-z

JO - Journal of Software and Systems Modeling

JF - Journal of Software and Systems Modeling

SN - 1619-1366

IS - 1

VL - 9

SP - 87

EP - 111

ER -
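
Illustration

The abstract's first step, constructing a transition system from an event log in a configurable way, can be illustrated with a minimal sketch. This is not the ProM implementation. The function names, the `mode` and `horizon` parameters, and the example log are all assumptions made for illustration. Each state is an abstraction of the activities seen so far in a trace, and the choice of abstraction controls how much the resulting system generalizes beyond the log. The second step (synthesis via the theory of regions) is not shown.

```python
from collections import Counter


def abstract_state(prefix, mode="set", horizon=None):
    """Map a trace prefix to a state label (hypothetical helper).

    mode    -- "sequence", "multiset" or "set": how much ordering and
               frequency information the state keeps.
    horizon -- if given, only the last `horizon` activities are considered.
    """
    if horizon is None:
        window = list(prefix)
    else:
        window = list(prefix[max(0, len(prefix) - horizon):])
    if mode == "sequence":
        return tuple(window)
    if mode == "multiset":
        return tuple(sorted(Counter(window).items()))
    if mode == "set":
        return frozenset(window)
    raise ValueError(f"unknown abstraction mode: {mode!r}")


def build_transition_system(log, mode="set", horizon=None):
    """Build (states, transitions) from a log given as a list of traces.

    A transition is a triple (source_state, activity, target_state).
    """
    states = set()
    transitions = set()
    for trace in log:
        # Every trace starts in the state of the empty prefix.
        states.add(abstract_state(trace[:0], mode, horizon))
        for i, activity in enumerate(trace):
            src = abstract_state(trace[:i], mode, horizon)
            dst = abstract_state(trace[:i + 1], mode, horizon)
            states.add(dst)
            transitions.add((src, activity, dst))
    return states, transitions


if __name__ == "__main__":
    # Assumed example log: b and c occur in either order between a and d.
    log = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]
    for mode in ("sequence", "set"):
        states, transitions = build_transition_system(log, mode=mode)
        print(mode, len(states), "states,", len(transitions), "transitions")
```

With the sequence abstraction the two traces only share their first step, so the transition system stays close to the log (leaning toward overfitting). With the set abstraction both orderings of b and c lead to the same state, which merges paths and permits more behavior (leaning toward underfitting). Limiting `horizon` coarsens the states further.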