Harshil Kotamreddy

49.7 · IR · May 1

Multimodal Data Curation Through Ranked Retrieval

Pratyush Muthukumar, Harshil Kotamreddy, Sarah Amiraslani et al.

Shared embedding spaces are widely used for multimodal search and data curation. In practice, two problems often limit how well this works. First, embeddings can reflect modality more than meaning, so examples cluster by input type even when the underlying content matches. Second, the paired supervision used to train these spaces is often noisy. When we blend many heterogeneous, human-labeled datasets, these issues reinforce each other and degrade cross-modal retrieval. We present a framework that improves alignment by acting on both the training pairs and the embedding model. Symmetric Nucleus Subsampling (SNS) refines training pairs by trimming raw inputs and annotations to the portions that best support each other. Expert Embedding Engine (EEE) combines complementary embedding experts using a learned projection network, together with a bias-aware objective that reduces modality-driven separation in the embedding space. We demonstrate that this approach reduces the modality gap by over 90% on average relative to the base embedding experts, and that it is a strong data curator: data blends produced by our method outperform stratified sampling and traditional curation baselines in downstream model performance.
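To make the "modality gap" figure concrete, here is a minimal sketch of one common way to measure it: the Euclidean distance between the centroids of the L2-normalized embeddings of each modality. The abstract does not state which definition the authors use, so this metric, the function names (`modality_gap`, `relative_reduction`), and the toy vectors are all assumptions for illustration, not the paper's method.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def modality_gap(emb_a, emb_b):
    """Distance between the centroids of two sets of normalized embeddings.

    Assumed definition (centroid distance); the paper may use another one.
    """
    ca = centroid([l2_normalize(v) for v in emb_a])
    cb = centroid([l2_normalize(v) for v in emb_b])
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))


def relative_reduction(gap_before, gap_after):
    """Fraction of the original gap that was removed (0.9 means 90%)."""
    return 1.0 - gap_after / gap_before


# Hypothetical toy embeddings: two text items and their paired images.
text_emb = [[1.0, 0.0], [0.8, 0.6]]
image_emb_base = [[0.0, 1.0], [-0.6, 0.8]]    # separated by modality
image_emb_aligned = [[0.6, 0.8], [1.0, 0.0]]  # after a hypothetical alignment step

gap_before = modality_gap(text_emb, image_emb_base)
gap_after = modality_gap(text_emb, image_emb_aligned)
reduction = relative_reduction(gap_before, gap_after)
```

Under this definition, a reduction above 0.9 would correspond to the "over 90%" claim, with the gap of each base expert playing the role of `gap_before`.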

LG · Jul 12, 2025

A Study of Value-Aware Eigenoptions

Harshil Kotamreddy, Marlos C. Machado

Options, which impose an inductive bias toward temporal and hierarchical structure, offer a powerful framework for reinforcement learning (RL). While effective in sequential decision-making, they are often handcrafted rather than learned. Among approaches for discovering options, eigenoptions have shown strong performance in exploration, but their role in credit assignment remains underexplored. In this paper, we investigate whether eigenoptions can accelerate credit assignment in model-free RL, evaluating them in tabular and pixel-based gridworlds. We find that pre-specified eigenoptions aid not only exploration but also credit assignment, whereas online discovery can bias the agent's experience too strongly and hinder learning. In the context of deep RL, we also propose a method for learning option-values under non-linear function approximation, highlighting the impact of termination conditions on performance. Our findings reveal both the promise and complexity of using eigenoptions, and options more broadly, to simultaneously support credit assignment and exploration in reinforcement learning.

2 Papers