LG · Apr 17

UniCon: Unified Framework for Efficient Contrastive Alignment via Kernels

arXiv:2604.16678 · 70.42 citations · h-index: 2

AI Analysis

This work addresses the training speed bottleneck in contrastive learning for multimodal models, offering a more efficient alternative that maintains generality.

UniCon introduces a unified framework for contrastive alignment that replaces minibatch back-propagation with closed-form global solutions, achieving substantial efficiency gains while preserving strong empirical performance across synthetic, unimodal, multimodal, and zero-shot tasks.

Contrastive objectives power state-of-the-art multimodal models, but their training remains slow, relying on long stochastic optimization. We propose a Unified Framework for Efficient Contrastive Alignment via Kernels (UniCon), which spans linear and nonlinear encoders as well as one-to-one and many-to-many alignments. At its core, UniCon introduces the contrastive similarity weight matrix $S(γ)$, which enables closed-form global solutions that provably replace minibatch back-propagation with exact updates. Through the lens of reproducing kernel Hilbert spaces (RKHS), UniCon provides a kernelized perspective that unifies contrastive alignment and reveals its connection to spectral methods. To validate the theory, we conduct experiments on synthetic, unimodal, multimodal, and zero-shot tasks, demonstrating that UniCon achieves substantial efficiency gains while preserving generality and strong empirical performance.
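
To make the idea of a closed-form, spectral solution concrete, here is a minimal sketch for the simplest case of linear encoders with one-to-one pairs. It is not the paper's algorithm: the abstract does not define $S(γ)$, so `similarity_weights` below is a hypothetical stand-in (weight 1 on matched pairs, `-gamma / (n - 1)` on mismatched pairs), and `spectral_alignment`, the synthetic data, and all parameter names are likewise assumptions made for illustration. Under those assumptions, maximizing the weighted alignment score over orthonormal projections reduces to a single SVD, with no minibatch gradient steps.

```python
import numpy as np


def similarity_weights(n, gamma):
    """Hypothetical stand-in for a contrastive weight matrix (an assumption,
    not the paper's S(gamma)): +1 for matched pairs (diagonal),
    -gamma / (n - 1) for mismatched pairs (off-diagonal)."""
    S = np.full((n, n), -gamma / (n - 1))
    np.fill_diagonal(S, 1.0)
    return S


def spectral_alignment(X, Y, S, k):
    """Maximize trace(U^T C V) over U, V with orthonormal columns,
    where C = X^T S Y / n. The maximizer is given by the top-k
    singular vectors of C, so one SVD suffices."""
    C = X.T @ S @ Y / X.shape[0]
    U, sing, Vt = np.linalg.svd(C, full_matrices=False)
    return U[:, :k], Vt[:k].T, sing[:k]


# Synthetic paired data: two views generated from a shared latent Z.
rng = np.random.default_rng(0)
n, d_x, d_y, k = 200, 8, 6, 2
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, d_x)) + 0.05 * rng.normal(size=(n, d_x))
Y = Z @ rng.normal(size=(k, d_y)) + 0.05 * rng.normal(size=(n, d_y))

# With gamma = 1 this S equals n / (n - 1) times the centering matrix,
# so C is proportional to the sample cross-covariance of X and Y.
S = similarity_weights(n, gamma=1.0)
U, V, sing = spectral_alignment(X, Y, S, k)

# Linear "encoders": project each view into the shared k-dim space.
X_emb, Y_emb = X @ U, Y @ V
```

The projected embeddings `X_emb` and `Y_emb` can then be compared by inner product, as in ordinary contrastive retrieval. The choice `gamma = 1` is only one convenient setting; it happens to recover a cross-covariance SVD, which illustrates the kind of link to spectral methods the abstract mentions.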
