LG | ML | Feb 20, 2025

Challenges of Multi-Modal Coreset Selection for Depth Prediction

arXiv:2502.15834v1 | h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for specialized coreset selection methods in multimodal applications, but it is incremental: it adapts an existing technique and does not achieve new state-of-the-art results.

The paper tackles the problem of adapting coreset selection methods to multimodal data for depth prediction. It finds that existing unimodal techniques struggle to capture inter-modal relationships, and its experiments show no concrete performance improvements.

Coreset selection methods are effective in accelerating training and reducing memory requirements but remain largely unexplored in applied multimodal settings. We adapt a state-of-the-art (SoTA) coreset selection technique for multimodal data, focusing on the depth prediction task. Our experiments with embedding aggregation and dimensionality reduction approaches reveal the challenges of extending unimodal algorithms to multimodal scenarios, highlighting the need for specialized methods to better capture inter-modal relationships.
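
To make the pipeline described in the abstract more concrete, here is a minimal sketch of one way to combine embedding aggregation, dimensionality reduction, and coreset selection. It is an illustration under stated assumptions, not the paper's method: the aggregation is plain concatenation of L2-normalised embeddings, the reduction is PCA via SVD, and the selector is greedy k-center, a generic stand-in for the SoTA technique the authors adapt. The function names and the two embedding arrays `emb_a` and `emb_b` are hypothetical.

```python
import numpy as np

def aggregate_embeddings(emb_a, emb_b):
    """Concatenate L2-normalised embeddings from two modalities.

    Assumption: concatenation is only one possible aggregation choice.
    """
    def l2norm(x):
        norms = np.linalg.norm(x, axis=1, keepdims=True)
        return x / np.maximum(norms, 1e-12)
    return np.concatenate([l2norm(emb_a), l2norm(emb_b)], axis=1)

def pca_reduce(x, n_components):
    """Project centred data onto its top principal components (via SVD)."""
    centred = x - x.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

def k_center_greedy(x, budget, seed=0):
    """Greedy k-center selection: repeatedly pick the point farthest
    from the current selection. Stand-in selector, not the paper's."""
    rng = np.random.default_rng(seed)
    first = int(rng.integers(x.shape[0]))
    selected = [first]
    # Distance from every point to its nearest selected point.
    min_dist = np.linalg.norm(x - x[first], axis=1)
    while len(selected) < budget:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(x - x[nxt], axis=1))
    return selected

# Usage with synthetic embeddings (hypothetical shapes).
rng = np.random.default_rng(1)
emb_a = rng.normal(size=(200, 32))  # embeddings for modality A
emb_b = rng.normal(size=(200, 16))  # embeddings for modality B
joint = pca_reduce(aggregate_embeddings(emb_a, emb_b), n_components=8)
coreset_idx = k_center_greedy(joint, budget=20)
```

A selector of this kind only sees distances in the joint space, so any relationship between the modalities that concatenation or PCA discards is invisible to it. That is consistent with the abstract's point that unimodal algorithms do not extend cleanly to multimodal data.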

Code Implementations: 1 repo
