CV · AI · LG · Feb 22

US-JEPA: A Joint Embedding Predictive Architecture for Medical Ultrasound

arXiv:2602.19322v1 · 2 citations · h-index: 32
Originality: Incremental advance
AI Analysis

This work addresses representation learning for noisy medical ultrasound imaging and presents an incremental improvement over existing methods.

The authors tackle representation learning for noisy ultrasound imaging with US-JEPA, a self-supervised framework that uses a frozen teacher to supply stable latent targets. Under linear probing, it achieves competitive or superior performance on classification tasks from the UltraBench benchmark.

Ultrasound (US) imaging poses unique challenges for representation learning due to its inherently noisy acquisition process. The low signal-to-noise ratio and stochastic speckle patterns hinder standard self-supervised learning methods relying on a pixel-level reconstruction objective. Joint-Embedding Predictive Architectures (JEPAs) address this drawback by predicting masked latent representations rather than raw pixels. However, standard approaches depend on hyperparameter-brittle and computationally expensive online teachers updated via exponential moving average. We propose US-JEPA, a self-supervised framework that adopts the Static-teacher Asymmetric Latent Training (SALT) objective. By using a frozen, domain-specific teacher to provide stable latent targets, US-JEPA decouples student-teacher optimization and pushes the student to expand upon the semantic priors of the teacher. In addition, we provide the first rigorous comparison of all publicly available state-of-the-art ultrasound foundation models on UltraBench, a public dataset benchmark spanning multiple organs and pathological conditions. Under linear probing for diverse classification tasks, US-JEPA achieves performance competitive with or superior to domain-specific and universal vision foundation model baselines. Our results demonstrate that masked latent prediction provides a stable and efficient path toward robust ultrasound representations.
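The core idea in the abstract, predicting the latents of masked regions against targets from a frozen teacher, can be illustrated with a toy sketch. This is not the authors' implementation: the linear "encoders", the mean-pooled context, the plain gradient step, and all names (`salt_style_step`, `ema_update`, `run_demo`) are assumptions made purely for illustration. The sketch shows that only the student is updated, and includes an exponential-moving-average update for contrast with the online-teacher setup the abstract describes as brittle.

```python
import random


def matvec(weights, x):
    """Apply a toy linear encoder: weights is a list of rows, x is a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def salt_style_step(student_w, teacher_w, patches, masked_idx, lr=0.1):
    """One toy update with a frozen teacher (hypothetical helper).

    The teacher encodes the masked patches to produce latent targets.
    The student sees only the visible patches and predicts those latents.
    Returns the loss measured before the update; mutates student_w only.
    """
    visible = [p for i, p in enumerate(patches) if i not in masked_idx]
    context = mean_vector(visible)                # crude stand-in for a context encoder
    pred = matvec(student_w, context)             # student prediction in latent space
    targets = [matvec(teacher_w, patches[i]) for i in masked_idx]  # teacher is read-only

    dim = len(pred)
    errors = [[p - t for p, t in zip(pred, target)] for target in targets]
    loss = sum(e * e for err in errors for e in err) / (len(targets) * dim)

    # Gradient of the mean squared error with respect to the student weights.
    mean_err = mean_vector(errors)
    for r in range(dim):
        for c in range(len(context)):
            student_w[r][c] -= lr * 2.0 * mean_err[r] * context[c] / dim
    return loss


def ema_update(teacher_w, student_w, momentum=0.99):
    """Online-teacher alternative, shown for contrast; unused with a frozen teacher."""
    for t_row, s_row in zip(teacher_w, student_w):
        for j, s in enumerate(s_row):
            t_row[j] = momentum * t_row[j] + (1.0 - momentum) * s


def run_demo(steps=50, seed=0):
    """Train the toy student and report whether the teacher stayed untouched."""
    rng = random.Random(seed)
    dim = 4
    patches = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]
    teacher_w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    student_w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    teacher_before = [row[:] for row in teacher_w]
    masked_idx = (1, 4)                           # positions hidden from the student
    losses = [salt_style_step(student_w, teacher_w, patches, masked_idx)
              for _ in range(steps)]
    return losses, teacher_w == teacher_before


if __name__ == "__main__":
    losses, teacher_unchanged = run_demo()
    print("first loss:", losses[0])
    print("last loss:", losses[-1])
    print("teacher unchanged:", teacher_unchanged)
```

Because the targets never move, the student's objective in this toy is a fixed regression problem. With `ema_update` in the loop, the targets would drift with the student and the momentum would become one more hyperparameter to tune.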

