100.0 · QUANT-PH · Apr 8
Exponential quantum advantage in processing massive classical data
Haimeng Zhao, Alexander Zlokapa, Hartmut Neven et al.
This work establishes machine learning on classical data as a broad domain of quantum advantage, potentially impacting fields like bioinformatics and natural language processing. The contribution is foundational rather than incremental.
100.0 · CV · Apr 22
Image Generators are Generalist Vision Learners
Valentin Gabeur, Shangbang Long, Songyou Peng et al.
This work suggests a potential paradigm shift in computer vision by positioning generative pretraining as a foundational approach for building generalist vision models that unify generation and understanding tasks.
100.0 · CR · Mar 16 · Code
How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition
Mateusz Dziemian, Maxwell Lin, Xiaohan Fu et al. · eth-zurich
This addresses a critical security threat for users of AI agents in high-stakes settings, revealing fundamental weaknesses in current models.
100.0 · CV · Mar 31 · Code
ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning
Huanxuan Liao, Zhongtao Jiang, Yupu Hao et al.
This addresses efficiency bottlenecks in multimodal reasoning for AI researchers and practitioners, offering a novel method to improve performance under aggressive compression.
100.0 · LG · Mar 10 · Code
KernelSkill: A Multi-Agent Framework for GPU Kernel Optimization
Qitong Sun, Jun Han, Tianlin Li et al.
This addresses GPU kernel optimization for AI systems, offering a more interpretable and efficient approach compared to prior LLM-based methods.
99.9 · CV · Mar 17
Demystifing Video Reasoning
Ruisi Wang, Zhongang Cai, Fanyi Pu et al.
This provides a systematic account of how reasoning emerges in video generation models, which could guide future research that exploits these dynamics.
100.0 · CL · Mar 15 · Code
Inference-time Alignment in Continuous Space
Yige Yuan, Teng Xiao, Li Yunfan et al.
This addresses the limited effectiveness of inference-time alignment when base policies are weak or candidate sets are small, offering a novel approach that improves performance on tasks such as safety and mathematical reasoning.
100.0 · AI · Apr 14 · Code
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
Haozhe Wang, Cong Wei, Weiming Ren et al.
For practitioners of text-to-image and image-editing generation, this work provides a more interpretable and effective reward model that enhances generator performance without requiring additional parameter updates at test time.
100.0 · AI · Mar 16 · Code
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data
Yuwen Du, Rui Ye, Shuo Tang et al.
This work democratizes frontier search agent research for the broader AI community by providing open-source data and models, addressing a bottleneck previously dominated by industrial giants.
99.9 · LG · Mar 25
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience
Zichuan Lin, Feiyu Liu, Yijun Yang et al.
For developers and users of autonomous agents, this addresses the challenge of efficient, high-performance mobile GUI automation without manual annotation.
100.0 · HC · May 16
Human-LLM Compound System for Scientific Ideation through Facet Recombination and Novelty Evaluation
Marissa Radensky, Simra Shahid, Raymond Fok et al. · allen-ai, uw
For computer science researchers, Scideator offers a novel interactive system for generating and evaluating scientific ideas, but the evaluation is limited to a user study with no quantitative SOTA claims.
99.9 · GEO-PH · Mar 24
TRACE: A Multi-Agent System for Autonomous Physical Reasoning in Seismological
Feng Liu, Jian Xu, Xin Cui et al.
This addresses the challenge of expert-dependent, non-reproducible analysis in seismology by enabling autonomous, physically grounded inference across different tectonic environments.
100.0 · AI · Mar 16
Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty
Jeonghye Kim, Xufang Luo, Minbeom Kim et al.
This provides insights for future reasoning model design, addressing a foundational issue in AI for researchers and developers.
100.0 · CR · Mar 16 · Code
ClawWorm: Self-Propagating Attacks Across LLM Agent Ecosystems
Yihao Zhang, Zeming Wei, Xiaokun Luan et al.
This addresses critical security risks for users of interconnected multi-agent systems, exposing vulnerabilities that could lead to autonomous attacks without attacker intervention.
100.0 · CL · Apr 17
AgentV-RL: Scaling Reward Modeling with Agentic Verifier
Jiazheng Zhang, Ziche Fu, Zhiheng Xi et al.
For LLM reasoning in complex domains, this framework addresses error propagation and lack of external grounding in verifiers, offering a more reliable and interpretable assessment method.
99.9 · AI · Apr 2 · Code
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery
Ao Qu, Han Zheng, Zijian Zhou et al.
This addresses the need for more autonomous and efficient open-ended discovery in AI research, representing a novel method rather than an incremental improvement.
99.9 · CL · Apr 30 · Code
TiMem: Temporal-Hierarchical Memory Consolidation for Long-Horizon Conversational Agents
Kai Li, Xuanqing Yu, Ziyi Ni et al.
For developers of conversational agents, TiMem addresses the problem of managing long interaction histories with a novel memory organization that improves accuracy and efficiency.
100.0 · SE · Apr 16
Scaling Test-Time Compute for Agentic Coding
Joongwon Kim, Wannan Yang, Kelvin Niu et al.
For developers of coding agents, this work addresses the bottleneck of scaling test-time compute for long-horizon tasks by focusing on representation and reuse of prior experience.
99.9 · SE · Mar 26 · Code
WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing
Fanheng Kong, Jingyuan Zhang, Yang Yue et al.
This addresses the need for reliable automated web testing in software development, particularly for end-to-end verification. The contribution is incremental: it adds a new benchmark on top of existing testing paradigms.
100.0 · SD · Apr 13 · Code
Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music
Sreyan Ghosh, Arushi Goel, Kaousheik Jayakumar et al.
This work advances open-source audio-language models for researchers and practitioners needing robust understanding of speech, sound, and music, with strong real-world generalization.