Elena Bellodi, Fabrizio Riguzzi and Evelina Lamma
ENDIF – Università di Ferrara – Via Saragat, 1 – 44122 Ferrara, Italy.
Organizations usually rely on a number of processes, which describe the way resources are exploited, to achieve their mission. Formal ways of representing business processes have been studied in the area of Business Process Management (BPM). Recently, many authors have studied the problem of automatically mining a structured description of a business process directly from real data. The data consist of execution traces of the process and are collected by information systems that log the activities performed by the users. This problem is known as Process Mining. New declarative languages have also been proposed that express only constraints on process execution.
In particular, the SCIFF language adopts first-order logic to represent the constraints. A trace t is a sequence of events, each described by a number of attributes. A bag of process traces L is called a log. The aim of Process Mining is to infer a process model from a log. A process trace can be represented as a logical interpretation (a set of ground atoms): each event is modeled with an atom whose predicate is the event type and whose arguments store the attributes. A process model in the SCIFF language is a set of Integrity Constraints (ICs). The learning problem is to find a theory (or model), composed of ICs, such that all the ICs are true in every positive trace and at least one IC is false in every negative trace. The Declarative Process Model Learner (DPML) algorithm finds an IC theory that solves this learning problem.
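The representation and the classification criterion described above can be illustrated with a minimal Python sketch. It is not SCIFF syntax and not the DPML algorithm: the names make_trace, response and is_compliant, the example events, and the convention that the last attribute of an event is its timestamp are all assumptions made for illustration, and each IC is simplified to a Boolean test on a trace.

```python
from typing import Callable, FrozenSet, List, Tuple

# A ground atom: (event_type, attributes), e.g. ("pay", ("alice", 2)).
Atom = Tuple[str, Tuple]
# A trace seen as a logical interpretation: a set of ground atoms.
Trace = FrozenSet[Atom]
# Simplifying assumption: an integrity constraint is a Boolean test on a trace.
IC = Callable[[Trace], bool]


def make_trace(events) -> Trace:
    """Turn a list of (event_type, *attributes) tuples into an interpretation."""
    return frozenset((e[0], tuple(e[1:])) for e in events)


def response(a: str, b: str) -> IC:
    """Hypothetical IC: every event of type `a` is later followed by one of type `b`.

    The last attribute of each event is assumed to be its timestamp.
    """
    def holds(trace: Trace) -> bool:
        times_a = [args[-1] for (etype, args) in trace if etype == a]
        times_b = [args[-1] for (etype, args) in trace if etype == b]
        return all(any(tb > ta for tb in times_b) for ta in times_a)
    return holds


def is_compliant(theory: List[IC], trace: Trace) -> bool:
    """Classify a trace as positive iff every IC of the theory is true in it."""
    return all(ic(trace) for ic in theory)


# Illustrative theory and traces (invented for this sketch).
theory = [response("order", "pay"), response("pay", "ship")]
good = make_trace([("order", "alice", 1), ("pay", "alice", 2), ("ship", "alice", 3)])
bad = make_trace([("order", "bob", 1), ("ship", "bob", 2)])
```

With these definitions, is_compliant(theory, good) holds, while the trace bad violates the first constraint and is therefore classified as negative. This is the sharp, all-or-nothing classification that a purely logical theory provides.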
We then investigate the possibility of encoding probabilistic information in the IC theory with Markov Logic (ML), a language that extends first-order logic by attaching weights to formulas. We learn the weights of the ICs by means of the Alchemy system. The resulting set of (weight, formula) pairs is called a Markov Logic Network (MLN). A set of ICs can be seen as a “hard” theory: if a world violates even one formula, it is considered impossible. In ML, such a world is less probable but not impossible, and the weight associated with each formula reflects how strong the constraint is. In the infinite-weight limit, ML reduces to standard first-order logic.
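The effect of the weights can be made concrete with a small Python sketch based on the standard MLN distribution, in which the probability of a world x is proportional to exp(sum_i w_i * n_i(x)), where n_i(x) is the number of true groundings of formula i in x. The sketch below is not Alchemy's interface: the function world_probabilities, the single propositional formula a -> b, and the weights 1.0 and 20.0 are assumptions chosen for illustration, and each formula has only one grounding, so n_i(x) is 0 or 1.

```python
import itertools
import math


def world_probabilities(formulas, n_atoms):
    """Enumerate all worlds over n_atoms Boolean atoms and normalise their scores.

    formulas: list of (weight, test) pairs, where test maps a world
    (a tuple of booleans) to True if the formula holds in that world.
    Returns a dict world -> probability, P(x) = exp(sum_i w_i * n_i(x)) / Z.
    """
    worlds = list(itertools.product([False, True], repeat=n_atoms))
    scores = {x: math.exp(sum(w for w, f in formulas if f(x))) for x in worlds}
    z = sum(scores.values())  # partition function Z
    return {x: s / z for x, s in scores.items()}


def implies(x):
    """Illustrative formula a -> b over a world x = (a, b)."""
    return (not x[0]) or x[1]


soft = world_probabilities([(1.0, implies)], 2)   # moderately strong constraint
hard = world_probabilities([(20.0, implies)], 2)  # nearly hard constraint
violating = (True, False)                         # the only world where a -> b fails
```

Under the weight 1.0 the violating world keeps a small but non-negligible probability, while under the weight 20.0 its probability is close to zero, which mirrors the infinite-weight limit in which a violated formula makes a world impossible.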
Our goal is to demonstrate that the combined use of DPML, for learning an IC theory, and Alchemy, for learning weights for the formulas, produces better results than the sharp classification performed by the SCIFF theory alone.
Conclusions. The probabilistic classification obtained with Alchemy is more accurate than the purely logical one obtained with SCIFF alone.