AI · Mar 1 · Code
DIVA-GRPO: Enhancing Multimodal Reasoning through Difficulty-Adaptive Variant Advantage
Haowen Gao, Zhenyu Zhang, Liang Pang et al.
Reinforcement learning (RL) with group relative policy optimization (GRPO) has become a widely adopted approach for enhancing the reasoning capabilities of multimodal large language models (MLLMs). While GRPO enables long-chain reasoning without a critic, it often suffers from sparse rewards on difficult problems and from vanishing advantages when the rewards within a group are nearly identical, as happens on overly easy or overly hard problems. Existing solutions (sample expansion, selective utilization, and indirect reward design) often fail to maintain enough variance in within-group reward distributions to yield clear optimization signals. To address this, we propose DIVA-GRPO, a difficulty-adaptive variant advantage method that adjusts variant difficulty distributions from a global perspective. DIVA-GRPO dynamically assesses problem difficulty, samples variants with appropriate difficulty levels, and calculates advantages across local and global groups using difficulty-weighted and normalized scaling. This alleviates reward sparsity and advantage vanishing while improving training stability. Extensive experiments on six reasoning benchmarks demonstrate that DIVA-GRPO outperforms existing approaches in both training efficiency and reasoning performance. Code: https://github.com/Siaaaaaa1/DIVA-GRPO
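The advantage-vanishing problem described above can be illustrated with the standard group-relative advantage, which z-scores each reward within its sampled group. This is a minimal sketch of plain GRPO-style normalization, not of DIVA-GRPO's difficulty-weighted scaling (which the abstract does not specify); the function name and the `eps` value are assumptions made for illustration.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Z-score each reward within its group (plain GRPO-style advantage).

    `eps` is an assumed stabilizer that avoids division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]

# A group with mixed outcomes gives a usable learning signal.
mixed = group_relative_advantages([1.0, 0.0, 1.0, 0.0])

# A problem that is too hard (all wrong) or too easy (all right)
# gives identical rewards, so every advantage collapses to zero.
all_fail = group_relative_advantages([0.0, 0.0, 0.0, 0.0])
all_pass = group_relative_advantages([1.0, 1.0, 1.0, 1.0])

print(mixed)
print(all_fail)
print(all_pass)
```

When every response in a group receives the same reward, the numerator is zero for all of them, so the group contributes no gradient signal regardless of how many samples it contains.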
SI · Apr 27
Skyline Community Search over Edge-Attributed Bipartite Graphs
Fangda Guo, Xuanpu Luo, Shiyuan Xu et al.
Bipartite graphs, which model relationships between two types of entities, are widely used in practical applications. Community search, a fundamental problem on bipartite graphs, has gained significant attention. However, existing studies focus on measuring structural cohesiveness between vertex sets while either ignoring edge attributes or considering only one-dimensional importance. In this paper, we introduce a novel community model, named edge-attributed skyline community (ESC), which preserves structural cohesiveness and captures the inherent dominance of multi-dimensional edge attributes in bipartite graphs. To search for ESCs, we develop an efficient peeling algorithm that iteratively deletes edges with the minimum attribute in each dimension. Additionally, we devise an expanding algorithm that reduces the search space and speeds up the filtering of unpromising vertices using a proven upper bound. Extensive experiments on large-scale real-world datasets demonstrate the efficiency, effectiveness, and scalability of our approach. A case study comparing against prior work shows that our design improves the precision and diversity of results.
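The notion of dominance over multi-dimensional attributes can be illustrated with the classic skyline operator on plain attribute vectors. This is a sketch of the general skyline idea only, assuming larger values are better in every dimension; it is not the ESC model or the paper's peeling algorithm, and the sample vectors are made up for illustration.

```python
def dominates(a, b):
    """True if vector `a` is at least as good as `b` in every dimension
    and strictly better in at least one (larger is assumed better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def skyline(vectors):
    """Return the vectors that no other vector dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]

# Hypothetical two-dimensional attribute vectors of candidate results.
candidates = [(5, 1), (3, 3), (1, 5), (2, 2), (3, 1)]
result = skyline(candidates)
print(result)  # [(5, 1), (3, 3), (1, 5)]
```

Here `(2, 2)` is dominated by `(3, 3)` and `(3, 1)` is dominated by `(5, 1)`, while the remaining three vectors are mutually incomparable. This is why a one-dimensional importance score cannot capture the same set.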
IR · Feb 11, 2025 · Code
Generative Ghost: Investigating Ranking Bias Hidden in AI-Generated Videos
Haowen Gao, Liang Pang, Shicheng Xu et al.
With the rapid development of AI-generated content (AIGC), creating high-quality AI-generated videos has become faster and easier, and the Internet is being flooded with all kinds of video content. However, the impact of these videos on the content ecosystem remains largely unexplored. Video information retrieval remains a fundamental approach for accessing video content. Building on the observation that retrieval models often favor AI-generated content in ad-hoc and image retrieval tasks, we investigate whether similar biases emerge in the more challenging setting of video retrieval, where temporal and visual factors may further influence model behavior. To explore this, we first construct a comprehensive benchmark dataset containing both real and AI-generated videos. This benchmark includes 13,000 videos generated by two state-of-the-art open-source video generation models. We also design a suite of fair and rigorous metrics to measure this preference accurately, accounting for potential biases arising from the limited frame rate and suboptimal quality of AIGC videos. We then apply three off-the-shelf video retrieval models to perform retrieval tasks on this hybrid dataset. Our findings reveal a clear preference for AI-generated videos in retrieval. Further investigation shows that incorporating AI-generated videos into the training set of retrieval models exacerbates this bias. Unlike the preference observed in the image modality, video retrieval bias arises from both unseen visual and temporal information, making its root causes a complex interplay of these two factors. To mitigate this bias, we fine-tune the retrieval models using a contrastive learning approach. The results of this study highlight the potential implications of AI-generated videos for retrieval systems.
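One simple way to picture a ranking preference on a hybrid corpus is to compare the average rank position of real and AI-generated items in a single result list. The sketch below is purely illustrative: the `mean_rank` and `rank_gap` helpers, the labels, and the sample ranking are all hypothetical, and this is not one of the metrics defined in the paper.

```python
def mean_rank(ranking, source):
    """Average 1-based rank of the items carrying the given source label."""
    ranks = [i + 1 for i, (_, label) in enumerate(ranking) if label == source]
    return sum(ranks) / len(ranks)

def rank_gap(ranking):
    """Mean rank of real items minus mean rank of AI-generated items.

    A positive value means AI-generated items sit higher in the list.
    """
    return mean_rank(ranking, "real") - mean_rank(ranking, "ai")

# Hypothetical ranked list for one query: (video_id, source label).
ranking = [("v1", "ai"), ("v2", "real"), ("v3", "ai"), ("v4", "real")]
gap = rank_gap(ranking)
print(gap)  # 1.0
```

A raw gap like this does not control for quality or frame-rate differences between the two sources, which is the kind of confound the abstract says its metrics are designed to account for.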