Amin Jalali

ML
h-index20
17papers
131citations
Novelty46%
AI Score52

17 Papers

3.8DBApr 7Code
Advancing Object-Centric Process Mining with Multi-Dimensional Data Operations

Shahrzad Khayatbashi, Najmeh Miri, Amin Jalali

Analyzing process data at varying levels of granularity is important to derive actionable insights and support informed decision-making. Object-Centric Event Data (OCED) enhances process mining by capturing interactions among events and multiple objects, leading to the discovery of more detailed and realistic yet complex process models. The lack of methods to adjust the granularity of the analysis prevents users from leveraging the full potential of Object-Centric Process Mining (OCPM). To address this gap, we propose four operations: drill-down, roll-up, unfold, and fold, which enable analysts to change the granularity of analysis when working with Object-Centric Event Logs (OCEL). These operations allow analysts to seamlessly transition between detailed and aggregated process models, facilitating the discovery of insights that require varying levels of abstraction. We formally define these operations and implement them in an open-source Python library. To validate their utility, we applied the approach to real-world OCEL data extracted from a learning management system, covering a four-year period and approximately 400 students, as a case of object-centric educational process mining. This case study shows significant improvements in the precision and fitness of the discovered models after applying the operations. In addition, we evaluate the scalability of the operators on large, publicly available OCELs derived from the Business Process Intelligence Challenge datasets, demonstrating that the operations remain computationally feasible on industrial-scale event logs. This approach can empower analysts to perform more flexible and comprehensive process exploration, unlocking actionable insights through flexible granularity adjustments.

AIJun 22, 2022
Object Type Clustering using Markov Directly-Follow Multigraph in Object-Centric Process Mining

Amin Jalali

Object-centric process mining is a new paradigm with more realistic assumptions about underlying data by considering several case notions, e.g., an order handling process can be analyzed based on order, item, package, and route case notions. Including many case notions can result in a very complex model. To cope with such complexity, this paper introduces a new approach to cluster similar case notions based on Markov Directly-Follow Multigraph, which is an extended version of the well-known Directly-Follow Graph supported by many industrial and academic process mining tools. This graph is used to calculate a similarity matrix for discovering clusters of similar case notions based on a threshold. A threshold tuning algorithm is also defined to identify sets of different clusters that can be discovered based on different levels of similarity. Thus, the cluster discovery will not rely on merely analysts' assumptions. The approach is implemented and released as a part of a python library, called processmining, and it is evaluated through a Purchase to Pay (P2P) object-centric event log file. Some discovered clusters are evaluated by discovering Directly Follow-Multigraph by flattening the log based on the clusters. The similarity between identified clusters is also evaluated by calculating the similarity between the behavior of the process models discovered for each case notion using inductive miner based on footprints conformance checking.

CVOct 6, 2022
Adversarial Lagrangian Integrated Contrastive Embedding for Limited Size Datasets

Amin Jalali, Minho Lee

Certain datasets contain a limited number of samples with highly various styles and complex structures. This study presents a novel adversarial Lagrangian integrated contrastive embedding (ALICE) method for small-sized datasets. First, the accuracy improvement and training convergence of the proposed pre-trained adversarial transfer are shown on various subsets of datasets with few samples. Second, a novel adversarial integrated contrastive model using various augmentation techniques is investigated. The proposed structure considers the input samples with different appearances and generates a superior representation with adversarial transfer contrastive training. Finally, multi-objective augmented Lagrangian multipliers encourage the low-rank and sparsity of the presented adversarial contrastive embedding to adaptively estimate the coefficients of the regularizers automatically to the optimum weights. The sparsity constraint suppresses less representative elements in the feature space. The low-rank constraint eliminates trivial and redundant components and enables superior generalization. The performance of the proposed model is verified by conducting ablation studies by using benchmark datasets for scenarios with small data samples.

HCDec 8, 2025
Graph-Based Learning of Spectro-Topographical EEG Representations with Gradient Alignment for Brain-Computer Interfaces

Prithila Angkan, Amin Jalali, Paul Hungler et al.

We present a novel graph-based learning of EEG representations with gradient alignment (GEEGA) that leverages multi-domain information to learn EEG representations for brain-computer interfaces. Our model leverages graph convolutional networks to fuse embeddings from frequency-based topographical maps and time-frequency spectrograms, capturing inter-domain relationships. GEEGA addresses the challenge of achieving high inter-class separability, which arises from the temporally dynamic and subject-sensitive nature of EEG signals by incorporating the center loss and pairwise difference loss. Additionally, GEEGA incorporates a gradient alignment strategy to resolve conflicts between gradients from different domains and the fused embeddings, ensuring that discrepancies, where gradients point in conflicting directions, are aligned toward a unified optimization direction. We validate the efficacy of our method through extensive experiments on three publicly available EEG datasets: BCI-2a, CL-Drive and CLARE. Comprehensive ablation studies further highlight the impact of various components of our model.

3.3CVMar 20
Evaluating Test-Time Adaptation For Facial Expression Recognition Under Natural Cross-Dataset Distribution Shifts

John Turnbull, Shivam Grover, Amin Jalali et al.

Deep learning models often struggle under natural distribution shifts, a common challenge in real-world deployments. Test-Time Adaptation (TTA) addresses this by adapting models during inference without labeled source data. We present the first evaluation of TTA methods for FER under natural domain shifts, performing cross-dataset experiments with widely used FER datasets. This moves beyond synthetic corruptions to examine real-world shifts caused by differing collection protocols, annotation standards, and demographics. Results show TTA can boost FER performance under natural shifts by up to 11.34\%. Entropy minimization methods such as TENT and SAR perform best when the target distribution is clean. In contrast, prototype adjustment methods like T3A excel under larger distributional distance scenarios. Finally, feature alignment methods such as SHOT deliver the largest gains when the target distribution is noisier than our source. Our cross-dataset analysis shows that TTA effectiveness is governed by the distributional distance and the severity of the natural shift across domains.

AIApr 24, 2025
AI-Enhanced Business Process Automation: A Case Study in the Insurance Domain Using Object-Centric Process Mining

Shahrzad Khayatbashi, Viktor Sjölind, Anders Granåker et al.

Recent advancements in Artificial Intelligence (AI), particularly Large Language Models (LLMs), have enhanced organizations' ability to reengineer business processes by automating knowledge-intensive tasks. This automation drives digital transformation, often through gradual transitions that improve process efficiency and effectiveness. To fully assess the impact of such automation, a data-driven analysis approach is needed - one that examines how traditional and AI-enhanced process variants coexist during this transition. Object-Centric Process Mining (OCPM) has emerged as a valuable method that enables such analysis, yet real-world case studies are still needed to demonstrate its applicability. This paper presents a case study from the insurance sector, where an LLM was deployed in production to automate the identification of claim parts, a task previously performed manually and identified as a bottleneck for scalability. To evaluate this transformation, we apply OCPM to assess the impact of AI-driven automation on process scalability. Our findings indicate that while LLMs significantly enhance operational capacity, they also introduce new process dynamics that require further refinement. This study also demonstrates the practical application of OCPM in a real-world setting, highlighting its advantages and limitations.

DBMar 13, 2025
OCPM$^2$: Extending the Process Mining Methodology for Object-Centric Event Data Extraction

Najmeh Miri, Shahrzad Khayatbashi, Jelena Zdravkovic et al.

Object-Centric Process Mining (OCPM) enables business process analysis from multiple perspectives. For example, an educational path can be examined from the viewpoints of students, teachers, and groups. This analysis depends on Object-Centric Event Data (OCED), which captures relationships between events and object types, representing different perspectives. Unlike traditional process mining techniques, extracting OCED minimizes the need for repeated log extractions when shifting the analytical focus. However, recording these complex relationships increases the complexity of the log extraction process. To address this challenge, this paper proposes a methodology for extracting OCED based on PM\inst{2}, a well-established process mining framework. Our approach introduces a structured framework that guides data analysts and engineers in extracting OCED for process analysis. We validate this framework by applying it in a real-world educational setting, demonstrating its effectiveness in extracting an Object-Centric Event Log (OCEL), which serves as the standard format for recording OCED, from a learning management system and an administrative grading system.

HCNov 16, 2025
Multi-Domain EEG Representation Learning with Orthogonal Mapping and Attention-based Fusion for Cognitive Load Classification

Prithila Angkan, Amin Jalali, Paul Hungler et al.

We propose a new representation learning solution for the classification of cognitive load based on Electroencephalogram (EEG). Our method integrates both time and frequency domains by first passing the raw EEG signals through the convolutional encoder to obtain the time domain representations. Next, we measure the Power Spectral Density (PSD) for all five EEG frequency bands and generate the channel power values as 2D images referred to as multi-spectral topography maps. These multi-spectral topography maps are then fed to a separate encoder to obtain the representations in frequency domain. Our solution employs a multi-domain attention module that maps these domain-specific embeddings onto a shared embedding space to emphasize more on important inter-domain relationships to enhance the representations for cognitive load classification. Additionally, we incorporate an orthogonal projection constraint during the training of our method to effectively increase the inter-class distances while improving intra-class clustering. This enhancement allows efficient discrimination between different cognitive states and aids in better grouping of similar states within the feature space. We validate the effectiveness of our model through extensive experiments on two public EEG datasets, CL-Drive and CLARE for cognitive load classification. Our results demonstrate the superiority of our multi-domain approach over the traditional single-domain techniques. Moreover, we conduct ablation and sensitivity analyses to assess the impact of various components of our method. Finally, robustness experiments on different amounts of added noise demonstrate the stability of our method compared to other state-of-the-art solutions.

LGOct 2, 2025
Learning Time-Series Representations by Hierarchical Uniformity-Tolerance Latent Balancing

Amin Jalali, Milad Soltany, Michael Greenspan et al.

We propose TimeHUT, a novel method for learning time-series representations by hierarchical uniformity-tolerance balancing of contrastive representations. Our method uses two distinct losses to learn strong representations with the aim of striking an effective balance between uniformity and tolerance in the embedding space. First, TimeHUT uses a hierarchical setup to learn both instance-wise and temporal information from input time-series. Next, we integrate a temperature scheduler within the vanilla contrastive loss to balance the uniformity and tolerance characteristics of the embeddings. Additionally, a hierarchical angular margin loss enforces instance-wise and temporal contrast losses, creating geometric margins between positive and negative pairs of temporal sequences. This approach improves the coherence of positive pairs and their separation from the negatives, enhancing the capture of temporal dependencies within a time-series sample. We evaluate our approach on a wide range of tasks, namely 128 UCR and 30 UAE datasets for univariate and multivariate classification, as well as Yahoo and KPI datasets for anomaly detection. The results demonstrate that TimeHUT outperforms prior methods by considerable margins on classification, while obtaining competitive results for anomaly detection. Finally, detailed sensitivity and ablation studies are performed to evaluate different components and hyperparameters of our method.

AIMar 20, 2021
Evaluating Perceived Usefulness and Ease of Use of CMMN and DCR

Amin Jalali

Case Management has been gradually evolving to support Knowledge-intensive business process management, which resulted in developing different modeling languages, e.g., Declare, Dynamic Condition Response (DCR), and Case Management Model and Notation (CMMN). A language will die if users do not accept and use it in practice - similar to extinct human languages. Thus, it is important to evaluate how users perceive languages to determine if there is a need for improvement. Although some studies have investigated how the process designers perceived Declare and DCR, there is a lack of research on how they perceive CMMN. Therefore, this study investigates how the process designers perceive the usefulness and ease of use of CMMN and DCR based on the Technology Acceptance Model. DCR is included to enable comparing the study result with previous ones. The study is performed by educating master level students with these languages over eight weeks by giving feedback on their assignments to reduce perceptions biases. The students' perceptions are collected through questionnaires before and after sending feedback on their final practice in the exam. Thus, the result shows how the perception of participants can change by receiving feedback - despite being well trained. The reliability of responses is tested using Cronbach's alpha, and the result indicates that both languages have an acceptable level for both perceived usefulness and ease of use.

MLNov 30, 2020
Persistent Reductions in Regularized Loss Minimization for Variable Selection

Amin Jalali

In the context of regularized loss minimization with polyhedral gauges, we show that for a broad class of loss functions (possibly non-smooth and non-convex) and under a simple geometric condition on the input data it is possible to efficiently identify a subset of features which are guaranteed to have zero coefficients in all optimal solutions in all problems with loss functions from said class, before any iterative optimization has been performed for the original problem. This procedure is standalone, takes only the data as input, and does not require any calls to the loss function. Therefore, we term this procedure as a persistent reduction for the aforementioned class of regularized loss minimization problems. This reduction can be efficiently implemented via an extreme ray identification subroutine applied to a polyhedral cone formed from the datapoints. We employ an existing output-sensitive algorithm for extreme ray identification which makes our guarantee and algorithm applicable in ultra-high dimensional problems.

MLApr 10, 2019
New Computational and Statistical Aspects of Regularized Regression with Application to Rare Feature Selection and Aggregation

Amin Jalali, Adel Javanmard, Maryam Fazel

Prior knowledge on properties of a target model often come as discrete or combinatorial descriptions. This work provides a unified computational framework for defining norms that promote such structures. More specifically, we develop associated tools for optimization involving such norms given only the orthogonal projection oracle onto the non-convex set of desired models. As an example, we study a norm, which we term the doubly-sparse norm, for promoting vectors with few nonzero entries taking only a few distinct values. We further discuss how the K-means algorithm can serve as the underlying projection oracle in this case and how it can be efficiently represented as a quadratically constrained quadratic program. Our motivation for the study of this norm is regularized regression in the presence of rare features which poses a challenge to various methods within high-dimensional statistics, and in machine learning in general. The proposed estimation procedure is designed to perform automatic feature selection and aggregation for which we develop statistical bounds. The bounds are general and offer a statistical framework for norm-based regularization. The bounds rely on novel geometric quantities on which we attempt to elaborate as well.

OCFeb 19, 2019
2-Wasserstein Approximation via Restricted Convex Potentials with Application to Improved Training for GANs

Amirhossein Taghvaei, Amin Jalali

We provide a framework to approximate the 2-Wasserstein distance and the optimal transport map, amenable to efficient training as well as statistical and geometric analysis. With the quadratic cost and considering the Kantorovich dual form of the optimal transportation problem, the Brenier theorem states that the optimal potential function is convex and the optimal transport map is the gradient of the optimal potential function. Using this geometric structure, we restrict the optimization problem to different parametrized classes of convex functions and pay special attention to the class of input-convex neural networks. We analyze the statistical generalization and the discriminative power of the resulting approximate metric, and we prove a restricted moment-matching property for the approximate optimal map. Finally, we discuss a numerical algorithm to solve the restricted optimization problem and provide numerical experiments to illustrate and compare the proposed approach with the established regularization-based approaches. We further discuss practical implications of our proposal in a modular and interpretable design for GANs which connects the generator training with discriminator computations to allow for learning an overall composite generator.

MLFeb 26, 2018
Missing Data in Sparse Transition Matrix Estimation for Sub-Gaussian Vector Autoregressive Processes

Amin Jalali, Rebecca Willett

High-dimensional time series data exist in numerous areas such as finance, genomics, healthcare, and neuroscience. An unavoidable aspect of all such datasets is missing data, and dealing with this issue has been an important focus in statistics, control, and machine learning. In this work, we consider a high-dimensional estimation problem where a dynamical system, governed by a stable vector autoregressive model, is randomly and only partially observed at each time point. Our task amounts to estimating the transition matrix, which is assumed to be sparse. In such a scenario, where covariates are highly interdependent and partially missing, new theoretical challenges arise. While transition matrix estimation in vector autoregressive models has been studied previously, the missing data scenario requires separate efforts. Moreover, while transition matrix estimation can be studied from a high-dimensional sparse linear regression perspective, the covariates are highly dependent and existing results on regularized estimation with missing data from i.i.d.~covariates are not applicable. At the heart of our analysis lies 1) a novel concentration result when the innovation noise satisfies the convex concentration property, as well as 2) a new quantity for characterizing the interactions of the time-varying observation process with the underlying dynamical system.

MLJul 8, 2017
Subspace Clustering with Missing and Corrupted Data

Zachary Charles, Amin Jalali, Rebecca Willett

Given full or partial information about a collection of points that lie close to a union of several subspaces, subspace clustering refers to the process of clustering the points according to their subspace and identifying the subspaces. One popular approach, sparse subspace clustering (SSC), represents each sample as a weighted combination of the other samples, with weights of minimal $\ell_1$ norm, and then uses those learned weights to cluster the samples. SSC is stable in settings where each sample is contaminated by a relatively small amount of noise. However, when there is a significant amount of additive noise, or a considerable number of entries are missing, theoretical guarantees are scarce. In this paper, we study a robust variant of SSC and establish clustering guarantees in the presence of corrupted or missing data. We give explicit bounds on amount of noise and missing data that the algorithm can tolerate, both in deterministic settings and in a random generative model. Notably, our approach provides guarantees for higher tolerance to noise and missing data than existing analyses for this method. By design, the results hold even when we do not know the locations of the missing data; e.g., as in presence-only data.

MLDec 15, 2015
Relative Density and Exact Recovery in Heterogeneous Stochastic Block Models

Amin Jalali, Qiyang Han, Ioana Dumitriu et al.

The Stochastic Block Model (SBM) is a widely used random graph model for networks with communities. Despite the recent burst of interest in recovering communities in the SBM from statistical and computational points of view, there are still gaps in understanding the fundamental information theoretic and computational limits of recovery. In this paper, we consider the SBM in its full generality, where there is no restriction on the number and sizes of communities or how they grow with the number of nodes, as well as on the connection probabilities inside or across communities. This generality allows us to move past the artifacts of homogenous SBM, and understand the right parameters (such as the relative densities of communities) that define the various recovery thresholds. We outline the implications of our generalizations via a set of illustrative examples. For instance, $\log n$ is considered to be the standard lower bound on the cluster size for exact recovery via convex methods, for homogenous SBM. We show that it is possible, in the right circumstances (when sizes are spread and the smaller the cluster, the denser), to recover very small clusters (up to $\sqrt{\log n}$ size), if there are just a few of them (at most polylogarithmic in $n$).

OCJul 16, 2015
Variational Gram Functions: Convex Analysis and Optimization

Amin Jalali, Maryam Fazel, Lin Xiao

We propose a new class of convex penalty functions, called \emph{variational Gram functions} (VGFs), that can promote pairwise relations, such as orthogonality, among a set of vectors in a vector space. These functions can serve as regularizers in convex optimization problems arising from hierarchical classification, multitask learning, and estimating vectors with disjoint supports, among other applications. We study convexity for VGFs, and give efficient characterizations for their convex conjugates, subdifferentials, and proximal operators. We discuss efficient optimization algorithms for regularized loss minimization problems where the loss admits a common, yet simple, variational representation and the regularizer is a VGF. These algorithms enjoy a simple kernel trick, an efficient line search, as well as computational advantages over first order methods based on the subdifferential or proximal maps. We also establish a general representer theorem for such learning problems. Lastly, numerical experiments on a hierarchical classification problem are presented to demonstrate the effectiveness of VGFs and the associated optimization algorithms.