Process mining: A two-step approach to balance between underfitting and overfitting
Publication: Research - peer-review › Journal article – Annual report year: 2010
Process mining includes the automated discovery
of processes from event logs. Based on observed events (e.g.,
activities being executed or messages being exchanged) a
process model is constructed. One of the essential problems
in process mining is that one cannot assume to have seen all
possible behavior. At best, one has seen a representative subset.
Therefore, classical synthesis techniques are not suitable
as they aim at finding a model that is able to exactly reproduce
the log. Existing process mining techniques try to avoid such
“overfitting” by generalizing the model to allow for more
behavior. This generalization is often driven by the representation
language and very crude assumptions about completeness.
As a result, parts of the model are “overfitting” (allow
only for what has actually been observed) while other parts
may be “underfitting” (allow for much more behavior without
strong support for it). None of the existing techniques enables
the user to control the balance between “overfitting” and
“underfitting”. To address this, we propose a two-step
approach. First, using a configurable approach, a transition
system is constructed. Then, using the “theory of regions”,
the model is synthesized. The approach has been implemented
in the context of ProM and overcomes many of the
limitations of traditional approaches.
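
The first step can be illustrated with a minimal sketch. The abstract does not say how the transition system is configured, so everything here is an assumption made for illustration: the function names, the representation of a log as lists of activity names, and the three state abstractions (full prefix, set of seen activities, last k activities). The second step, region-based synthesis of a model from the transition system, is not shown.

```python
def build_transition_system(log, abstraction):
    """Build a transition system from an event log.

    log: list of traces, each a list of activity names (assumed format).
    abstraction: function mapping a trace prefix (tuple) to a hashable state.
    Returns (states, transitions), where each transition is
    (source_state, activity, target_state).
    """
    states = set()
    transitions = set()
    for trace in log:
        # Every trace starts in the state of the empty prefix.
        states.add(abstraction(()))
        for i, activity in enumerate(trace):
            source = abstraction(tuple(trace[:i]))
            target = abstraction(tuple(trace[:i + 1]))
            states.add(target)
            transitions.add((source, activity, target))
    return states, transitions


def sequence_abstraction(prefix):
    # Hypothetical strict setting: the state is the full history,
    # so only observed orderings are allowed (tends to overfit).
    return prefix


def set_abstraction(prefix):
    # Hypothetical looser setting: the state is the set of activities
    # seen so far, so order is forgotten and prefixes are merged.
    return frozenset(prefix)


def last_k_abstraction(k):
    # Hypothetical coarse setting: the state is the last k activities only.
    def abstraction(prefix):
        return prefix[-k:] if k > 0 else ()
    return abstraction


# Toy log (invented for illustration): b and c occur in either order.
log = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]

strict_states, strict_transitions = build_transition_system(log, sequence_abstraction)
loose_states, loose_transitions = build_transition_system(log, set_abstraction)
coarse_states, coarse_transitions = build_transition_system(log, last_k_abstraction(1))

print(len(strict_states), len(loose_states), len(coarse_states))
```

In this sketch the choice of abstraction is the control knob: a finer abstraction keeps more states apart and stays closer to the observed log, while a coarser one merges states and so permits behavior that was never observed.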
| Original language | English |
|---|---|
| Journal | Journal of Software and Systems Modeling |
| Publication date | 2010 |
| Volume | 9 |
| Journal number | 1 |
| Pages | 87-111 |
| ISSN | 1619-1366 |
| DOIs | |
| State | Published |
| Citations | Web of Science® Times Cited: 13 |
ID: 4331517