LG AI Aug 12, 2024

Attention Please: What Transformer Models Really Learn for Process Prediction

Martin Käppel, Lars Ackermann, Stefan Jablonski, Simon Härtl

arXiv:2408.07097v16.45 citations h-index: 10 Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses interpretability in predictive process monitoring for business analysts, but it is incremental as it applies existing attention mechanisms to a specific domain.

The paper investigates whether attention scores in transformer models for next-activity prediction in business processes can explain decision-making, finding they can serve as explainers and proposing graph-based explanation approaches.

Predictive process monitoring aims to support the execution of a process at runtime with predictions about how a process instance will evolve. In recent years, a plethora of deep learning architectures have been established as state of the art for different prediction targets, among them the transformer architecture. The transformer is equipped with a powerful attention mechanism that assigns an attention score to each part of the input, allowing the model to prioritize the most relevant information and produce more accurate and contextual output. However, deep learning models are largely black boxes, i.e., their reasoning or decision-making process cannot be understood in detail. This paper examines whether the attention scores of a transformer-based next-activity prediction model can serve as an explanation for its decision-making. We find that attention scores in next-activity prediction models can serve as explainers and exploit this fact in two proposed graph-based explanation approaches. The insights gained could inspire future work on improving predictive business process models and enable neural-network-based mining of process models from event logs.
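
To make the notion of an attention score concrete, the following is a minimal sketch of scaled dot-product attention over the events of a single running case. It is not the authors' model or explanation approach: the activity names, the 2-dimensional embeddings, and the helper `attention_scores` are hypothetical and chosen only for illustration. A real transformer would use learned query and key projections rather than raw embeddings.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_scores(query, keys):
    # Scaled dot-product attention weights of one query over all keys
    # (hypothetical helper, not taken from the paper's code).
    d = len(query)
    logits = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    return softmax(logits)

# Hypothetical prefix of a process instance and toy 2-d embeddings.
prefix = ["register", "check", "approve"]
embeddings = {
    "register": [1.0, 0.0],
    "check": [0.0, 1.0],
    "approve": [1.0, 1.0],
}

keys = [embeddings[activity] for activity in prefix]
query = embeddings[prefix[-1]]  # the last event attends to the whole prefix

scores = attention_scores(query, keys)

# Rank the prefix events by how much attention they receive.
ranked = sorted(zip(prefix, scores), key=lambda pair: pair[1], reverse=True)
for activity, score in ranked:
    print(f"{activity}: {score:.3f}")
```

The scores sum to one over the prefix, so they can be read as a distribution over past events. Whether such a ranking is a faithful explanation of the prediction is exactly the question the paper investigates.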
