71.3 AI Jun 1
TriAlign: Towards Universal Truth Consistency in Personalized LLM Alignment
Thi-Nhung Nguyen, Linhao Luo, Rollin Omari et al.
Personalized large language models adapt responses to users' preferences and social attributes, but can introduce substantial universal truth inconsistencies across social groups, where some groups systematically receive less accurate responses on objective tasks. Existing alignment methods either ignore personalization or mainly focus on subjective preference alignment, largely overlooking fairness and consistency in universal truths. To address this gap, we study Truth-Invariant Alignment (TIA), an alignment problem for personalized LLMs that aims to ensure universal truths remain consistent across social groups while preserving personalization. We propose TriAlign, the first offline multi-agent reinforcement learning (MARL) framework for TIA, where each social group is modeled as an interacting agent. TriAlign jointly optimizes universal truth accuracy, cross-group truth consistency, and personalization through a fairness-aware objective and an explicit inconsistency penalty. Experiments across diverse benchmarks demonstrate that TriAlign achieves a better balance among these three objectives than strong baselines, reducing universal truth disparities across social groups while improving both objective task performance and personalization quality.
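As a rough illustration of the idea of combining accuracy with an explicit inconsistency penalty, the sketch below scores a set of per-group correctness flags. It is not the TriAlign objective: the function names, the use of the max-min accuracy gap as the inconsistency measure, and the `penalty_weight` parameter are all assumptions made for this example.

```python
def group_accuracies(correct_by_group):
    """Map each group name to its accuracy on objective questions.

    `correct_by_group` maps a group name to a list of 0/1 correctness flags
    (a hypothetical input format chosen for this sketch).
    """
    return {g: sum(flags) / len(flags) for g, flags in correct_by_group.items()}


def penalized_score(correct_by_group, penalty_weight=1.0):
    """Return (score, gap): mean accuracy minus a weighted cross-group gap.

    The gap (best group accuracy minus worst group accuracy) stands in for
    an inconsistency penalty; this choice is an assumption, not the paper's.
    """
    acc = group_accuracies(correct_by_group)
    mean_acc = sum(acc.values()) / len(acc)
    gap = max(acc.values()) - min(acc.values())
    return mean_acc - penalty_weight * gap, gap


if __name__ == "__main__":
    flags = {"group_a": [1, 1, 1, 0], "group_b": [1, 0, 0, 0]}
    print(penalized_score(flags))
```

A larger `penalty_weight` makes a model that serves one group well and another poorly score lower than a model with the same mean accuracy spread evenly across groups.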
LG Jun 27, 2025 Code
Mitigating Semantic Collapse in Generative Personalization with Test-Time Embedding Adjustment
Anh Bui, Trang Vu, Trung Le et al.
In this paper, we investigate the semantic collapsing problem in generative personalization, an under-explored topic where the learned visual concept ($V$) gradually shifts from its original textual meaning and comes to dominate other concepts in multi-concept input prompts. This issue not only reduces the semantic richness of complex input prompts like "a photo of $V$ wearing glasses and playing guitar" into simpler, less contextually rich forms such as "a photo of $V$" but also leads to simplified output images that fail to capture the intended concept. We identify the root cause as unconstrained optimisation, which allows the learned embedding $V$ to drift arbitrarily in the embedding space, both in direction and magnitude. To address this, we propose a simple yet effective training-free method that adjusts the magnitude and direction of the pre-trained embedding at inference time, effectively mitigating the semantic collapsing problem. Our method is broadly applicable across different personalization methods and demonstrates significant improvements in text-image alignment in diverse use cases. Our code is anonymously published at https://github.com/tuananhbui89/Embedding-Adjustment
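To make the idea of adjusting an embedding's magnitude and direction at inference time concrete, here is a minimal sketch. It is not the authors' implementation (see the linked repository for that): the function `adjust_embedding`, the use of a reference embedding as the target, the linear blend of unit directions, and the `alpha` parameter are assumptions chosen for illustration.

```python
import math


def norm(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(a * a for a in x))


def adjust_embedding(learned, reference, alpha=0.5):
    """Pull `learned` toward `reference` in direction and match its norm.

    alpha=0 keeps the learned direction, alpha=1 uses the reference
    direction. The result always has the reference embedding's norm.
    Hypothetical sketch; the blending rule is an assumption.
    """
    learned_norm, ref_norm = norm(learned), norm(reference)
    if learned_norm == 0 or ref_norm == 0:
        raise ValueError("embeddings must be non-zero")
    u = [a / learned_norm for a in learned]   # unit learned direction
    r = [a / ref_norm for a in reference]     # unit reference direction
    mixed = [(1 - alpha) * a + alpha * b for a, b in zip(u, r)]
    mixed_norm = norm(mixed)
    if mixed_norm == 0:
        raise ValueError("directions cancel out for this alpha")
    # Rescale the blended direction to the reference magnitude.
    return [ref_norm * a / mixed_norm for a in mixed]


if __name__ == "__main__":
    print(adjust_embedding([6.0, 0.0], [0.0, 2.0], alpha=0.5))
```

Because the adjustment touches only the embedding vector, it needs no gradient steps, which is consistent with the training-free framing in the abstract.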
62.3 CV May 8
Adaptive Subspace Projection for Generative Personalization
Van-Anh Nguyen, Anh Tuan Bui, Tamas Abraham et al.
Generative personalization often suffers from the semantic collapsing problem (SCP), where a learned personalized concept overpowers the rest of the text prompt, causing the model to ignore important contextual details. To address this, we first analyze the underlying cause, revealing that the semantic drift responsible for SCP is not random but is concentrated within a specific low-dimensional subspace. We also discover that the personalization process perturbs the embedding of the original base concept, making it an unstable reference point. Based on these insights, we introduce Test-time Embedding Adjustment with Adaptive Subspace Projection (AdaptSP), a training-free method that uses the stable, pre-trained embedding as an anchor. AdaptSP isolates the semantic drift and projects it onto the identified subspace, performing a precise adjustment that mitigates SCP while maintaining the subject identity. Our experiments show that this targeted approach significantly improves prompt fidelity and contextual alignment.
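The projection step described above can be illustrated with a small sketch. It is not AdaptSP itself: how the subspace is identified and how the projected drift is used are not specified here, so the orthonormal `basis`, the choice to subtract the in-subspace drift, and the `strength` parameter are assumptions for this example.

```python
def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def project_onto_subspace(vec, basis):
    """Orthogonal projection of `vec` onto span(basis).

    `basis` is assumed to be a list of orthonormal vectors.
    """
    proj = [0.0] * len(vec)
    for b in basis:
        coeff = dot(vec, b)
        proj = [p + coeff * bi for p, bi in zip(proj, b)]
    return proj


def adjust_with_projection(learned, anchor, basis, strength=1.0):
    """Remove the part of the drift (learned - anchor) lying in the subspace.

    `anchor` plays the role of the stable pre-trained embedding; `strength`
    is a hypothetical knob (1.0 removes the in-subspace drift entirely).
    """
    drift = [l - a for l, a in zip(learned, anchor)]
    in_subspace = project_onto_subspace(drift, basis)
    return [l - strength * s for l, s in zip(learned, in_subspace)]


if __name__ == "__main__":
    learned = [3.0, 2.0, 1.0]
    anchor = [1.0, 1.0, 1.0]
    basis = [[1.0, 0.0, 0.0]]  # toy one-dimensional drift subspace
    print(adjust_with_projection(learned, anchor, basis))
```

In this toy case only the first coordinate of the drift lies in the subspace, so the adjustment resets that coordinate to the anchor's value and leaves the rest of the learned embedding unchanged.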
23.2 CL Apr 17
MCBench: A Multicontext Safety Assessment Benchmark for Omni Large Language Models
Manh Luong, Tamas Abraham, Junae Kim et al.
Existing multimodal safety benchmarks focus solely on visual inputs and cannot assess Omni Large Language Models (LLMs) that process vision, audio, and text. We introduce MCBench, a benchmark with 1196 scenarios spanning four safety categories that require integrating multiple modalities for accurate safety assessment. Each unsafe scenario is paired with a minimally different safe counterpart to assess model sensitivity. Our evaluations of state-of-the-art models reveal significant challenges. Omni LLMs struggle with subtle or non-physical risks but perform better when salient visual or acoustic cues are present. Analysis of reasoning traces shows that, although models can extract modality-specific information, they often fail to integrate these cues effectively for safety judgments. Our findings reveal that current Omni LLMs lack robust cross-modal reasoning in safety-critical settings, underscoring the need for improved architectures and training strategies for multimodal safety.
LG Dec 15, 2023
Adversarial Robustness on Image Classification with $k$-means
Rollin Omari, Junae Kim, Paul Montague
In this paper we explore the challenges and strategies for enhancing the robustness of $k$-means clustering algorithms against adversarial manipulations. We evaluate the vulnerability of clustering algorithms to adversarial attacks, emphasising the associated security risks. Our study investigates the impact of incremental attack strength on training, introduces the concept of transferability between supervised and unsupervised models, and highlights the sensitivity of unsupervised models to sample distributions. We additionally introduce and evaluate an adversarial training method that improves testing performance in adversarial scenarios, and we highlight the importance of various parameters in the proposed training method, such as continuous learning, centroid initialisation, and adversarial step-count.
NE Oct 14, 2020
Analogical and Relational Reasoning with Spiking Neural Networks
Rollin Omari, R. I. McKay, Tom Gedeon
Raven's Progressive Matrices have been widely used for measuring abstract reasoning and intelligence in humans. However, for artificial learning systems, abstract reasoning remains a challenging problem. In this paper we investigate how neural networks augmented with biologically inspired spiking modules gain a significant advantage in solving this problem. To illustrate this, we first investigate the performance of our networks with supervised learning, then with unsupervised learning. Experiments on the RAVEN dataset show that the overall accuracy of our supervised networks surpasses human-level performance, while our unsupervised networks significantly outperform existing unsupervised methods. Finally, our results from both supervised and unsupervised learning illustrate that, unlike their non-augmented counterparts, networks with spiking modules are able to extract and encode temporal features without any explicit instruction, do not heavily rely on training data, and generalise more readily to new problems. In summary, the results reported here indicate that artificial neural networks with spiking modules are well suited to solving abstract reasoning.