LG · Nov 29, 2022
Encoder-Decoder Model for Suffix Prediction in Predictive Monitoring
Efrén Rama-Maneiro, Pablo Monteagudo-Lago, Juan C. Vidal et al.
Predictive monitoring is a subfield of process mining that aims to predict how a running case will unfold in the future. One of its main challenges is suffix prediction: forecasting the sequence of activities that will occur from a given point in time. Most approaches to suffix prediction are trained only to predict the next activity, so they never learn from the whole suffix during the training phase. This paper proposes a novel architecture based on an encoder-decoder model with an attention mechanism that decouples the representation learning of the prefixes from the inference phase, predicting only the activities of the suffix. During the inference phase, this architecture is extended with a heuristic search algorithm that improves the selection of the activity for each index of the suffix. Our approach has been tested using 12 public event logs against 6 different state-of-the-art proposals, showing that it significantly outperforms these proposals.
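The next-activity-only strategy that this abstract contrasts itself with can be illustrated by a minimal sketch. It is not the paper's encoder-decoder: a first-order frequency model stands in for a learned next-activity predictor, and the names `END`, `fit_next_activity` and `predict_suffix`, as well as the toy traces, are assumptions made for illustration.

```python
from collections import Counter, defaultdict

END = "<end>"  # hypothetical end-of-case marker, not taken from the paper


def fit_next_activity(traces):
    """Count which activity follows each activity in the training traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        for current, following in zip(trace, list(trace[1:]) + [END]):
            counts[current][following] += 1
    return counts


def predict_suffix(counts, prefix, max_len=20):
    """Greedy rollout: repeatedly append the most frequent next activity."""
    suffix = []
    current = prefix[-1]
    for _ in range(max_len):
        if current not in counts:
            break  # activity never seen in training, nothing to predict
        nxt = counts[current].most_common(1)[0][0]
        if nxt == END:
            break
        suffix.append(nxt)
        current = nxt
    return suffix


# Toy event log: each trace is the ordered list of activities of one case.
traces = [["a", "b", "c", "d"], ["a", "b", "c", "d"], ["a", "c", "d"]]
model = fit_next_activity(traces)
suffix = predict_suffix(model, ["a"])
```

Each step of the rollout conditions only on the model's previous choice, so an early mistake propagates through the rest of the suffix; this is the limitation that training on whole suffixes is meant to address.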
AI · Jul 22, 2022
Gradual Drift Detection in Process Models Using Conformance Metrics
Victor Gallego-Fontenla, Juan C. Vidal, Manuel Lama
Changes, planned or unexpected, are common during the execution of real-life processes. Detecting these changes is a must for optimizing the performance of organizations running such processes. Most state-of-the-art algorithms focus on the detection of sudden changes, leaving aside other types of changes. In this paper, we focus on the automatic detection of gradual drifts, a special type of change in which the cases of two models overlap during a period of time. The proposed algorithm relies on conformance checking metrics to detect the changes automatically, and it also classifies them fully automatically as sudden or gradual. The approach has been validated with a synthetic dataset consisting of 120 logs with different distributions of changes, achieving better results in terms of detection and classification accuracy, delay and change-region overlap than the main state-of-the-art algorithms.
LG · Dec 19, 2025
Improving Cardiac Risk Prediction Using Data Generation Techniques
Alexandre Cabodevila, Pedro Gamallo-Fernandez, Juan C. Vidal et al.
Cardiac rehabilitation constitutes a structured clinical process involving multiple interdependent phases, individualized medical decisions, and the coordinated participation of diverse healthcare professionals. This sequential and adaptive nature enables the program to be modeled as a business process, thereby facilitating its analysis. Nevertheless, studies in this context face significant limitations inherent to real-world medical databases: data are often scarce due to both economic costs and the time required for collection; many existing records are not suitable for specific analytical purposes; and, finally, there is a high prevalence of missing values, as not all patients undergo the same diagnostic tests. To address these limitations, this work proposes an architecture based on a Conditional Variational Autoencoder (CVAE) for the synthesis of realistic clinical records that are coherent with real-world observations. The primary objective is to increase the size and diversity of the available datasets in order to enhance the performance of cardiac risk prediction models and to reduce the need for potentially hazardous diagnostic procedures, such as exercise stress testing. The results demonstrate that the proposed architecture is capable of generating coherent and realistic synthetic data, whose use improves the accuracy of the various classifiers employed for cardiac risk detection, outperforming state-of-the-art deep learning approaches for synthetic data generation.
ME · Feb 2, 2024
Conditional Mean and Variance Estimation via k-NN Algorithm with Automated Variance Selection
Marcos Matabuena, Juan C. Vidal, Oscar Hernan Madrid Padilla et al.
We introduce a novel k-nearest neighbor (k-NN) regression method for joint estimation of the conditional mean and variance. The proposed algorithm preserves the computational efficiency and manifold-learning capabilities of classical non-parametric k-NN models, while integrating a data-driven variable selection step that improves empirical performance. By accurately estimating both conditional mean and variance regression functions, the method effectively reconstructs the conditional distribution and density functions for multiple families of scale-and-localization generative models. We show that our estimator can achieve fast convergence rates, and we derive practical rules for selecting the smoothing parameter k that enhance the precision of the algorithm in finite sample regimes. Extensive simulations for low, moderate and large-dimensional covariate spaces, together with a real-world biomedical application, demonstrate that the proposed method can consistently outperform the conventional k-NN regression algorithm while being more interpretable in the model output.
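As a rough illustration of what joint k-NN estimation of a conditional mean and variance looks like, the sketch below averages the responses of the k nearest training points and takes their spread as the variance estimate. It is plain k-NN under Euclidean distance, without the variable selection step or the rules for choosing k that the abstract describes; the function name `knn_mean_variance` and the toy data are assumptions.

```python
import math


def knn_mean_variance(X_train, y_train, x_query, k):
    """Estimate E[Y | X = x] and Var[Y | X = x] from the k nearest neighbors."""
    # Rank training points by Euclidean distance to the query point.
    order = sorted(range(len(X_train)),
                   key=lambda i: math.dist(X_train[i], x_query))
    neighbors = [y_train[i] for i in order[:k]]
    # Conditional mean: average response among the neighbors.
    mean = sum(neighbors) / k
    # Conditional variance: average squared deviation among the neighbors.
    variance = sum((y - mean) ** 2 for y in neighbors) / k
    return mean, variance


# Toy one-dimensional data (illustrative only).
X_train = [[0.0], [1.0], [2.0], [10.0]]
y_train = [1.0, 1.0, 1.0, 5.0]

flat_region = knn_mean_variance(X_train, y_train, [1.0], k=3)
mixed_region = knn_mean_variance(X_train, y_train, [9.0], k=2)
```

In the flat region all three neighbors share the same response, so the variance estimate is zero; near the isolated point the two neighbors disagree and the estimated variance grows accordingly.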
CL · Dec 13, 2023
Prompting LLMs with content plans to enhance the summarization of scientific articles
Aldan Creo, Manuel Lama, Juan C. Vidal
This paper presents novel prompting techniques to improve the performance of automatic summarization systems for scientific articles. Scientific article summarization is highly challenging due to the length and complexity of these documents. We conceive, implement, and evaluate prompting techniques that provide additional contextual information to guide summarization systems. Specifically, we feed summarizers with lists of key terms extracted from articles, such as author keywords or automatically generated keywords. Our techniques are tested with various summarization models and input texts. Results show performance gains, especially for smaller models summarizing sections separately. This suggests that prompting is a promising approach to overcoming the limitations of less powerful systems. Our findings open a new research direction: using prompts to aid smaller models.
LG · Dec 17, 2021
Embedding Graph Convolutional Networks in Recurrent Neural Networks for Predictive Monitoring
Efrén Rama-Maneiro, Juan C. Vidal, Manuel Lama
Predictive monitoring of business processes is a subfield of process mining that aims to predict, among other things, the characteristics of the next event or the sequence of next events. Although multiple approaches based on deep learning have been proposed, mainly recurrent neural networks and convolutional neural networks, none of them really exploits the structural information available in process models. This paper proposes an approach based on graph convolutional networks and recurrent neural networks that uses information directly from the process model. An experimental evaluation on real-life event logs shows that our approach is more consistent and outperforms the current state-of-the-art approaches.
LG · Sep 24, 2020
Deep Learning for Predictive Business Process Monitoring: Review and Benchmark
Efrén Rama-Maneiro, Juan C. Vidal, Manuel Lama
Predictive monitoring of business processes is concerned with making predictions about the ongoing cases of a business process. Lately, the popularity of deep learning techniques has led to an ever-growing set of predictive monitoring approaches based on these techniques. However, the high disparity of process logs and experimental setups used to evaluate these approaches makes it especially difficult to make a fair comparison. It also makes it difficult to select the most suitable approach for a specific problem. In this paper, we provide both a systematic literature review of approaches that use deep learning to tackle predictive monitoring tasks and an exhaustive experimental evaluation of 10 different approaches over 12 publicly available process logs.
AI · Jul 9, 2019
A Conformance Checking-based Approach for Drift Detection in Business Processes
Víctor Gallego-Fontenla, Juan C. Vidal, Manuel Lama
Real-life business processes change over time, in both planned and unexpected ways. The detection of these changes is crucial for organizations to ensure that the expected and the real behavior are as similar as possible. These changes over time are called concept drift, and detecting them is a major challenge in process mining, since the inherent complexity of the data makes it difficult to distinguish between a change and an anomalous execution. In this paper, we present C2D2 (Conformance Checking-based Drift Detection), a new approach to detect sudden control-flow changes in the process models from event traces. C2D2 combines discovery techniques with conformance checking methods to perform an offline detection. Our approach has been validated with a synthetic benchmarking dataset consisting of 68 logs, showing an improvement in accuracy while maintaining a minimum delay in drift detection.
AI · Feb 6, 2019
The FA Quantifier Fuzzification Mechanism: analysis of convergence and efficient implementations
Félix Díaz-Hermida, Marcos Matabuena, Juan C. Vidal
The fuzzy quantification model FA has been identified as one of the best-behaved quantification models in several reviews of the field of fuzzy quantification. This model is, to our knowledge, the only one fulfilling the strict Determiner Fuzzification Scheme axiomatic framework that does not induce the standard min and max operators. The main contribution of this paper is the proof of a convergence result that links this quantification model with Zadeh's model as the size of the input sets tends to infinity. The convergence proof is, in any case, more general than the convergence to Zadeh's model, being applicable to any quantitative quantifier. In addition, recent review papers have raised doubts about the existence of suitable computational implementations to evaluate the FA model in practical applications. In order to prove that this model is not only a theoretical approach, we show exact algorithmic solutions for the most common linguistic quantifiers as well as an approximate implementation based on Monte Carlo methods. Additionally, we give a general overview of the main properties fulfilled by the FA model, since a single compendium integrating the whole set of these properties has not been previously published.
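The relationship between the two models can be explored numerically with a small sketch. It rests on assumptions the abstract does not state: that for a unary proportional quantifier the FA model is the expected value of the quantifier when each element joins a crisp set independently with probability equal to its membership degree, and that Zadeh's model applies the quantifier to the relative sigma-count. The function names and the `most` quantifier are hypothetical, and the dynamic program below is only one possible exact evaluation, not necessarily the algorithm from the paper.

```python
def cardinality_distribution(memberships):
    """P(|Y| = j) when each element joins Y independently with its membership degree."""
    dist = [1.0]  # with no elements, the crisp set is empty with probability 1
    for mu in memberships:
        new = [0.0] * (len(dist) + 1)
        for j, p in enumerate(dist):
            new[j] += p * (1.0 - mu)   # element left out of Y
            new[j + 1] += p * mu       # element included in Y
        dist = new
    return dist


def fa_quantifier(q, memberships):
    """Assumed FA evaluation: expectation of q over the random crisp cardinality."""
    n = len(memberships)
    dist = cardinality_distribution(memberships)
    return sum(p * q(j / n) for j, p in enumerate(dist))


def zadeh_quantifier(q, memberships):
    """Zadeh-style evaluation: q applied to the relative sigma-count."""
    return q(sum(memberships) / len(memberships))


def most(proportion):
    """Hypothetical 'most': 0 up to 0.5, 1 from 0.9, linear in between."""
    return min(1.0, max(0.0, (proportion - 0.5) / 0.4))


fuzzy_set = [0.9, 0.8, 0.7, 0.6, 0.95, 0.4]
fa_value = fa_quantifier(most, fuzzy_set)
zadeh_value = zadeh_quantifier(most, fuzzy_set)
```

Under these assumptions the two evaluations coincide on crisp inputs and for linear quantifiers, and evaluating both on progressively larger fuzzy sets is a simple way to observe the kind of convergence the abstract refers to.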
AI · May 11, 2016
Characterizing Quantifier Fuzzification Mechanisms: a behavioral guide for practical applications
F. Diaz-Hermida, M. Pereira-Fariña, Juan C. Vidal et al.
Important advances have been made in the fuzzy quantification field. Nevertheless, some problems remain when we face the decision of selecting the most convenient model for a specific application. In the literature, several desirable adequacy properties have been proposed, but theoretical limits prevent quantification models from simultaneously fulfilling every adequacy property that has been defined. Besides, the complexity of model definitions and adequacy properties makes it very difficult for real users to understand the particularities of the different models that have been presented. In this work we present several criteria conceived to help in the process of selecting the most adequate Quantifier Fuzzification Mechanisms for specific practical applications. In addition, some of the best-known well-behaved models are compared against this list of criteria. Based on this analysis, some guidance on choosing fuzzy quantification models for practical applications is provided.
AI · Nov 26, 2014
A Fuzzy Syllogistic Reasoning Schema for Generalized Quantifiers
M. Pereira-Fariña, Juan C. Vidal, F. Díaz-Hermida et al.
In this paper, a new approximate syllogistic reasoning schema is described that expands some of the approaches expounded in the literature in two ways: (i) a number of different types of quantifiers (logical, absolute, proportional, comparative and exception), taken from the Theory of Generalized Quantifiers, and similarity quantifiers, taken from statistics, are considered; and (ii) any number of premises can be taken into account within the reasoning process. Furthermore, a systematic reasoning procedure to solve the syllogism is also proposed, interpreting it as an equivalent mathematical optimization problem, where the premises constitute the constraints of the search space for the quantifier in the conclusion.