CRJun 1
Ghost Tool Calls: Issue-Time Privacy for Speculative Agent ToolsBardia Mohammadi, Lars Klein, Akhil Arora et al.
Tool-augmented language agents speculatively issue likely future tool calls to hide latency, but those calls leak inferred user intent to external services before the agent commits to the branch. Every external observer that received the call retains the disclosure after the agent abandons the branch. Timing is the issue, not authorization: no commit-time cleanup, read-only restriction, or access-control allow-list unsends what an observer already holds. We call these invocations ghost tool calls and propose Speculative Tool Privacy Contracts, a runtime abstraction that treats observation before commitment as a first-class effect, distinct from state mutation. We implement the contracts in a prototype runtime and evaluate twelve policies across three corpora. Speculative dispatch increases what an observer can infer about user intent; post-hoc filters, read-only restrictions, and access-control allow-lists leave that inference intact; only issue-time policies that change or suppress the speculative call's argument or destination projection before dispatch reduce it.
IRJul 25, 2023
ClusterSeq: Enhancing Sequential Recommender Systems with Clustering based Meta-LearningMohammmadmahdi Maheri, Reza Abdollahzadeh, Bardia Mohammadi et al.
In practical scenarios, the effectiveness of sequential recommendation systems is hindered by the user cold-start problem, which arises due to limited interactions for accurately determining user preferences. Previous studies have attempted to address this issue by combining meta-learning with user and item-side information. However, these approaches face inherent challenges in modeling user preference dynamics, particularly for "minor users" who exhibit distinct preferences compared to more common or "major users." To overcome these limitations, we present a novel approach called ClusterSeq, a Meta-Learning Clustering-Based Sequential Recommender System. ClusterSeq leverages dynamic information in the user sequence to enhance item prediction accuracy, even in the absence of side information. This model preserves the preferences of minor users without being overshadowed by major users, and it capitalizes on the collective knowledge of users within the same cluster. Extensive experiments conducted on various benchmark datasets validate the effectiveness of ClusterSeq. Empirical results consistently demonstrate that ClusterSeq outperforms several state-of-the-art meta-learning recommenders. Notably, compared to existing meta-learning methods, our proposed approach achieves a substantial improvement of 16-39% in Mean Reciprocal Rank (MRR).
CLFeb 12, 2025Code
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented GenerationMohammad Mahdi Abootorabi, Amirhosein Zobeiri, Mahdi Dehghani et al.
Large Language Models (LLMs) suffer from hallucinations and outdated knowledge due to their reliance on static training data. Retrieval-Augmented Generation (RAG) mitigates these issues by integrating external dynamic information for improved factual grounding. With advances in multimodal learning, Multimodal RAG extends this approach by incorporating multiple modalities such as text, images, audio, and video to enhance the generated outputs. However, cross-modal alignment and reasoning introduce unique challenges beyond those in unimodal RAG. This survey offers a structured and comprehensive analysis of Multimodal RAG systems, covering datasets, benchmarks, metrics, evaluation, methodologies, and innovations in retrieval, fusion, augmentation, and generation. We review training strategies, robustness enhancements, loss functions, and agent-based approaches, while also exploring the diverse Multimodal RAG scenarios. In addition, we outline open challenges and future directions to guide research in this evolving field. This survey lays the foundation for developing more capable and reliable AI systems that effectively leverage multimodal dynamic external knowledge bases. All resources are publicly available at https://github.com/llm-lab-org/Multimodal-RAG-Survey.
LGFeb 16
Atomix: Timely, Transactional Tool Use for Reliable Agentic WorkflowsBardia Mohammadi, Nearchos Potamitis, Lars Klein et al.
LLM agents increasingly act on external systems, yet tool effects are immediate. Under failures, speculation, or contention, losing branches can leak unintended side effects with no safe rollback. We introduce Atomix, a runtime that provides progress-aware transactional semantics for agent tool calls. Atomix tags each call with an epoch, tracks per-resource frontiers, and commits only when progress predicates indicate safety; bufferable effects can be delayed, while externalized effects are tracked and compensated on abort. Across real workloads with fault injection, transactional retry improves task success, while frontier-gated commit strengthens isolation under speculation and contention.
DBJul 7, 2025
The Case for Instance-Optimized LLMs in OLAP DatabasesBardia Mohammadi, Laurent Bindschaedler
Large Language Models (LLMs) can enhance analytics systems with powerful data summarization, cleaning, and semantic transformation capabilities. However, deploying LLMs at scale -- processing millions to billions of rows -- remains prohibitively expensive in computation and memory. We present IOLM-DB, a novel system that makes LLM-enhanced database queries practical through query-specific model optimization. Instead of using general-purpose LLMs, IOLM-DB generates lightweight, specialized models tailored to each query's specific needs using representative data samples. IOLM-DB reduces model footprints by up to 76% and increases throughput by up to 3.31$\times$ while maintaining accuracy through aggressive compression techniques, including quantization, sparsification, and structural pruning. We further show how our approach enables higher parallelism on existing hardware and seamlessly supports caching and batching strategies to reduce overheads. Our prototype demonstrates that leveraging LLM queries inside analytics systems is feasible at scale, opening new possibilities for future OLAP applications.
SEJun 16, 2024
Predicting the Understandability of Computational Notebooks through Code Metrics AnalysisMojtaba Mostafavi Ghahfarokhi, Alireza Asadi, Arash Asgari et al.
Computational notebooks are the primary coding tools for data scientists, but their code quality remains understudied and often poor. Given the importance of maintainability and reusability, enhancing code understandability is essential. Traditional methods for assessing understandability typically rely on limited questionnaires or metadata like likes and votes, which may not reflect actual code clarity. To address this, we propose a novel approach that leverages user opinions from software repositories to assess the understandability of Jupyter notebooks. We conducted a case study using 542,051 Kaggle Jupyter notebooks compiled in the DistilKaggle dataset. To identify user comments related to code understandability, we used a fine-tuned DistilBERT transformer. We then introduced a new metric, i.e., User Opinion Code Understandability (UOCU), based on the number of relevant comments, their upvotes, and notebook views. UOCU proved significantly more effective than prior methods. We further enhanced it by combining UOCU with total upvotes in a hybrid approach. Using this improved metric, we collected 34 notebook-level metrics from 132,723 final notebooks and trained machine learning models to predict understandability. Our best model, a Random Forest classifier, achieved 89% accuracy in classifying the understandability level of notebook code. This work demonstrates the value of user opinion signals and notebook metrics in building scalable, accurate measures of code understandability.