Marwan Hassani

h-index: 18

Papers: 6

Citations: 32

Novelty: 45%

AI Score: 44

Ranked #48,264 of 194,257 authors (top 25%); #11,086 in LG (top 28%)

6 Papers

5.9 | LG | Apr 1 | Code

Chameleons do not Forget: Prompt-Based Online Continual Learning for Next Activity Prediction

Marwan Hassani, Tamara Verbeek, Sjoerd van Straten

Predictive process monitoring (PPM) focuses on predicting future process trajectories, including next activity predictions. This is crucial in dynamic environments where processes change or face uncertainty. However, current frameworks often assume a static environment, overlooking dynamic characteristics and concept drifts. This results in catastrophic forgetting, where training only on the new data distribution degrades performance on previously learned data distributions. Continual learning addresses, among other things, the challenge of mitigating catastrophic forgetting. This paper proposes a novel approach called Continual Next Activity Prediction with Prompts (CNAPwP), which adapts the DualPrompt algorithm for next activity prediction to improve accuracy and adaptability while mitigating catastrophic forgetting. We introduce new datasets with recurring concept drifts, alongside a task-specific forgetting metric that measures the prediction accuracy gap between the initial occurrence of a task and its subsequent occurrences. Extensive testing on three synthetic and two real-world datasets representing several setups of recurrent drifts shows that CNAPwP achieves SOTA or competitive results compared to five baselines, demonstrating its potential applicability in real-world scenarios. An open-source implementation of our method, together with the datasets and results, is available at: https://github.com/SvStraten/CNAPwP.
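The task-specific forgetting metric mentioned in this abstract can be illustrated with a small sketch. This is not the authors' definition: it assumes that forgetting for a task is the mean of (accuracy at first occurrence minus accuracy at each later occurrence), and the function name `task_forgetting` and the `history` format are hypothetical.

```python
from collections import defaultdict


def task_forgetting(history):
    """Compute an assumed per-task forgetting score.

    history: chronological list of (task_id, accuracy) pairs,
             one entry per occurrence of a task (hypothetical format).
    Returns {task_id: mean(first_accuracy - later_accuracy)} for tasks
    that occur at least twice. Positive values mean accuracy dropped
    when the task recurred.
    """
    occurrences = defaultdict(list)
    for task_id, accuracy in history:
        occurrences[task_id].append(accuracy)

    forgetting = {}
    for task_id, accs in occurrences.items():
        if len(accs) < 2:
            continue  # a task seen once has no recurrence to compare against
        first, later = accs[0], accs[1:]
        forgetting[task_id] = sum(first - a for a in later) / len(later)
    return forgetting


# Illustrative numbers only, not results from the paper.
history = [("A", 0.90), ("B", 0.80), ("A", 0.70),
           ("B", 0.85), ("A", 0.80), ("C", 0.60)]
result = task_forgetting(history)
```

In this toy run, task A loses accuracy on recurrence (positive score), task B improves slightly (negative score), and task C is skipped because it never recurs.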

12.3 | LG | Sep 24, 2023

Topology-Agnostic Detection of Temporal Money Laundering Flows in Billion-Scale Transactions

Haseeb Tariq, Marwan Hassani

Money launderers exploit the weaknesses in detection systems by purposefully placing their ill-gotten money into multiple accounts at different banks. That money is then layered and moved around among mule accounts to obscure the origin and the flow of transactions. Consequently, the money is integrated into the financial system without raising suspicion. Path-finding algorithms that aim at tracking suspicious flows of money usually struggle with scale and complexity. Existing community detection techniques also fail to properly capture the time-dependent relationships. This is particularly evident when performing analytics over massive transaction graphs. We propose a framework (called FaSTMAN), adapted for domain-specific constraints, to efficiently construct a temporal graph of sequential transactions. The framework includes a weighting method, using a second-order graph representation, to quantify the significance of the edges. This method enables us to distribute complex queries on smaller and densely connected networks of flows. Finally, based on those queries, we can effectively identify networks of suspicious flows. We extensively evaluate the scalability and the effectiveness of our framework against two state-of-the-art solutions for detecting suspicious flows of transactions. For a dataset of over 1 billion transactions from multiple large European banks, the results show a clear superiority of our framework in both efficiency and usefulness.
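One way to read "a temporal graph of sequential transactions" is as a second-order graph whose nodes are transactions and whose edges link a transaction into an account with a later transaction out of that same account. The sketch below illustrates that reading only. It is not FaSTMAN: the function `build_flow_edges`, the tuple layout, and the `max_delay` window are assumptions, and the edge weighting described in the abstract is omitted.

```python
from collections import defaultdict


def build_flow_edges(transactions, max_delay):
    """Link each transaction to the transactions that may continue its flow.

    transactions: list of (tx_id, sender, receiver, timestamp) tuples
                  (hypothetical layout).
    max_delay:    assumed maximum time gap between an incoming and an
                  outgoing transaction for them to count as sequential.
    Returns a sorted list of (tx_in, tx_out) pairs where tx_out leaves the
    account that tx_in entered, strictly later and within max_delay.
    """
    # Index outgoing transactions by the account that sends them.
    outgoing = defaultdict(list)
    for tx_id, sender, _receiver, ts in transactions:
        outgoing[sender].append((ts, tx_id))

    edges = []
    for tx_id, _sender, receiver, ts in transactions:
        for out_ts, out_id in outgoing.get(receiver, []):
            if ts < out_ts <= ts + max_delay:
                edges.append((tx_id, out_id))
    return sorted(edges)


# Toy data: A -> B -> C -> D in quick succession, plus a late B -> E transfer.
txs = [
    ("t1", "A", "B", 1),
    ("t2", "B", "C", 3),
    ("t3", "C", "D", 4),
    ("t4", "B", "E", 20),
]
flow_edges = build_flow_edges(txs, max_delay=5)
```

Here `t4` is not linked to `t1` because it leaves account B well outside the assumed time window, so only the chain t1, t2, t3 forms a sequential flow.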

1.8 | LG | Oct 18, 2022

Clustering-based Aggregations for Prediction in Event Streams

Yorick Spenrath, Marwan Hassani, Boudewijn F. Van Dongen

Predicting the behaviour of shoppers provides valuable information for retailers, such as the expected spend of a shopper or the total turnover of a supermarket. The ability to make predictions on an individual level is useful, as it allows supermarkets to accurately perform targeted marketing. However, given the expected number of shoppers and their diverse behaviours, making accurate predictions on an individual level is difficult. This problem does not only arise in shopper behaviour, but also in various business processes, such as predicting when an invoice will be paid. In this paper we present CAPiES, a framework that focuses on this trade-off in an online setting. By making predictions on a larger number of entities at a time, we improve the predictive accuracy but at the potential cost of usefulness since we can say less about the individual entities. CAPiES is developed in an online setting, where we continuously update the prediction model and make new predictions over time. We show the existence of the trade-off in an experimental evaluation in two real-world scenarios: a supermarket with over 160 000 shoppers and a paint factory with over 171 000 invoices.

5.3 | LG | Dec 27, 2023 | Code

Enhancing Traffic Flow Prediction using Outlier-Weighted AutoEncoders: Handling Real-Time Changes

Himanshu Choudhary, Marwan Hassani

In today's urban landscape, traffic congestion poses a critical challenge, especially during outlier scenarios. These outliers can indicate abrupt traffic peaks, drops, or irregular trends, often arising from factors such as accidents, events, or roadwork. Moreover, given the dynamic nature of traffic, real-time traffic modeling becomes crucial to ensure accurate and up-to-date traffic predictions. To address these challenges, we introduce the Outlier Weighted Autoencoder Modeling (OWAM) framework. OWAM employs autoencoders for local outlier detection and generates correlation scores to assess neighboring traffic's influence. These scores serve as weighting factors for neighboring sensors before their data is fused into the model. This information enhances the traffic model's performance and supports effective real-time updates, a crucial aspect for capturing dynamic traffic patterns. OWAM demonstrates a favorable trade-off between accuracy and efficiency, rendering it highly suitable for real-world applications. The research findings contribute significantly to the development of more efficient and adaptive traffic prediction models, advancing the field of transportation management for the future. The code and datasets of our framework are publicly available at https://github.com/himanshudce/OWAM.

6.4 | SE | Dec 23, 2021 | Code

A Framework for Efficient Memory Utilization in Online Conformance Checking

Rashid Zaman, Marwan Hassani, Boudewijn F. van Dongen

Conformance checking (CC) techniques of the process mining field gauge the conformance of the sequence of events in a case with respect to a business process model, which, simply put, is an amalgam of certain behavioral relations or rules. Online conformance checking (OCC) techniques are tailored for assessing such conformance on streaming events. The realistic assumption of having a finite memory for storing the streaming events has largely not been considered by the OCC techniques. We propose three incremental approaches to reduce the memory consumption in prefix-alignment-based OCC techniques while ensuring that we incur a minimal loss of conformance insights. Our first proposed approach bounds the maximum number of states constituting a prefix-alignment that any case may retain in memory. The second proposed approach bounds the number of cases that are allowed to retain more than a single state, referred to as multi-state cases. Building on top of the two proposed approaches, our third approach further bounds the maximum number of states that the multi-state cases can retain. All these approaches forget the states in excess of their defined limits and retain a meaningful summary of them. Computing prefix-alignments in the future is then resumed for such cases from the current position contained in the summary. We highlight the superiority of all proposed approaches compared to a state-of-the-art prefix-alignment-based OCC technique through experiments using real-life event data under a streaming setting. Our approaches substantially reduce memory consumption by up to 80% on average, while introducing a minor accuracy drop.
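The first approach in this abstract (bounding the states a case may retain, and keeping a summary of what was forgotten) can be sketched as a bounded buffer. This is an illustration under stated assumptions, not the authors' implementation: the class `CaseMemory`, the `(label, cost)` state representation, and the choice of summary fields (a count and an accumulated cost) are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CaseMemory:
    """Keeps at most `max_states` recent states for one case (assumed design)."""
    max_states: int
    states: deque = field(default_factory=deque)
    forgotten_count: int = 0      # summary: number of states dropped so far
    forgotten_cost: float = 0.0   # summary: accumulated cost of dropped states

    def add_state(self, label, cost):
        """Append a state; forget the oldest ones beyond the bound."""
        self.states.append((label, cost))
        while len(self.states) > self.max_states:
            _, old_cost = self.states.popleft()
            self.forgotten_count += 1
            self.forgotten_cost += old_cost

    def total_cost(self):
        """Cost over the whole case: summarized part plus retained states."""
        return self.forgotten_cost + sum(cost for _, cost in self.states)


# Toy case with four observed states and room for only two in memory.
memory = CaseMemory(max_states=2)
for label, cost in [("a", 0.0), ("b", 1.0), ("c", 0.0), ("d", 1.0)]:
    memory.add_state(label, cost)
```

Only the two most recent states stay in memory, while the summary still lets the total cost of the case be reported. The second and third approaches (limiting how many cases may be multi-state) would sit on top of a per-case structure like this one.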

7.5 | LG | Oct 19, 2021 | Code

What Averages Do Not Tell -- Predicting Real Life Processes with Sequential Deep Learning

István Ketykó, Felix Mannhardt, Marwan Hassani et al.

Deep Learning has proven to be an effective tool for modeling sequential data, as shown by its success in Natural Language, Computer Vision and Signal Processing. Process Mining concerns discovering insights on business processes from their execution data that are logged by supporting information systems. The logged data (event log) is formed of event sequences (traces) that correspond to executions of a process. Many Deep Learning techniques have been successfully adapted for predictive Process Mining that aims to predict process outcomes, remaining time, the next event, or even the suffix of running traces. Traces in Process Mining are multimodal sequences and are structured very differently from natural language sentences or images. This may require a different approach to processing. So far, there has been little focus on these differences and the challenges introduced. Looking at suffix prediction as the most challenging of these tasks, the performance of Deep Learning models was evaluated only on average measures and for a small number of real-life event logs. Comparing the results between papers is difficult due to different pre-processing and evaluation strategies. Challenges that may be relevant are the skewness of the trace-length distribution and the skewness of the activity distribution in real-life event logs. We provide an end-to-end framework that enables comparing the performance of seven state-of-the-art sequential architectures in common settings. Results show that sequence modeling still has a lot of room for improvement for the majority of the more complex datasets. Further research and insights are required to get consistent performance not just in average measures but additionally over all the prefixes.