EGA: Adapting Frozen Encoders for Vector Search with Bounded Out-of-Distribution Degradation

arXiv:2605.05674

AI Analysis

This work addresses out-of-distribution degradation in vector search systems built on frozen encoders. It offers practitioners a principled adapter that refines seen classes while leaving unseen-class accuracy largely intact.

EGA introduces a residual adapter for frozen vision encoders that prevents performance collapse on unseen classes during vector search. At convergence, 96.5% of triplets are gradient-free, and EGA achieves the highest worst-case Label Precision on four of five OOD benchmarks, whereas existing high-capacity adapters trained with global contrastive losses can fall more than 40 points below the frozen baseline.

Vector search systems built on frozen vision encoders face queries from unseen classes at deployment, yet existing adapter training collapses under this shift: high-capacity adapters with global contrastive losses silently reassign unseen-class samples to wrong seen-class clusters, dropping worst-case Label Precision by over 40 points below the frozen baseline in our tests. We propose Euclidean Geodesic Alignment (EGA), a residual adapter that couples three principles: zero initialization, local triplet loss, and hypersphere projection. These collectively induce a self-limiting dynamic: triplets that already satisfy a small margin stop producing gradients, so the adapter automatically stops updating where the local geometry is already correct. Our experiments show that at convergence $96.5\%$ of triplets are gradient-free, leaving unseen-class regions largely untouched while still enabling full-capacity refinement of seen classes. Across five diverse out-of-distribution (OOD) benchmarks, EGA achieves the highest worst-case Label Precision on the four primary splits and a consistent improvement on the fifth. The design also transfers to stronger backbones in addition to CLIP, and we provide an analytical justification linking gradient sparsity to bounded OOD perturbation.
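
The three principles named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the two-layer adapter shape, the hidden width, the margin value of 0.05, the use of Euclidean distance between unit-norm embeddings, and all function and class names are assumptions made for illustration. The sketch shows that a zero-initialized residual branch followed by hypersphere projection leaves unit-norm frozen embeddings unchanged at the start of training, and that a triplet whose hinge loss is zero contributes no gradient.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Project each row onto the unit hypersphere."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

class ResidualAdapter:
    """Hypothetical adapter: z = normalize(x + relu(x @ W1) @ W2)."""

    def __init__(self, dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(scale=dim ** -0.5, size=(dim, hidden))
        # Zero initialization: the residual branch outputs 0 before training.
        self.w2 = np.zeros((hidden, dim))

    def __call__(self, x):
        residual = np.maximum(x @ self.w1, 0.0) @ self.w2
        return l2_normalize(x + residual)

def triplet_hinge(anchor, positive, negative, margin=0.05):
    """Per-triplet hinge loss; a zero entry means that triplet has zero gradient."""
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0)

def gradient_free_fraction(losses):
    """Share of triplets that already satisfy the margin."""
    return float(np.mean(losses == 0.0))

# At initialization the adapter returns the (unit-norm) frozen embeddings as-is.
frozen = l2_normalize(np.random.default_rng(1).normal(size=(8, 16)))
adapter = ResidualAdapter(dim=16, hidden=32)
unchanged_at_init = bool(np.allclose(adapter(frozen), frozen))

# Two toy triplets: the first satisfies the margin, the second violates it.
anchor = np.array([[1.0, 0.0], [1.0, 0.0]])
positive = np.array([[1.0, 0.0], [0.0, 1.0]])
negative = np.array([[0.0, 1.0], [1.0, 0.0]])
losses = triplet_hinge(anchor, positive, negative)
inactive = gradient_free_fraction(losses)
```

In this toy setup only the second triplet has a nonzero loss, so only it would drive an update to the adapter weights. The abstract's 96.5% figure refers to the same kind of fraction, measured at convergence on the authors' training triplets.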
