cs.LG · Computer Science

Machine Learning

Statistical learning, deep learning, optimization

31.3 · CV · Mar 10 · Code 77

MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data

Zongxia Li, Hongyang Du, Chengsong Huang et al.

For AI researchers, this work tackles self-evolution in multimodal models, offering a scalable approach that goes beyond existing two-model paradigms.

37.6 · LG · Apr 16

$π_{0.7}$: a Steerable Generalist Robotic Foundation Model with Emergent Capabilities

Physical Intelligence, Bo Ai, Ali Amin et al. · mit

For roboticists, $π_{0.7}$ provides a generalist model that reduces the need for task-specific fine-tuning, enabling broad applicability across platforms and tasks.

29.6 · LG · Mar 17

The Finetuner's Fallacy: When to Pretrain with Your Finetuning Data

Christina Baek, Ricardo Pio Monti, David Schwab et al.

For practitioners, this work addresses how to adapt models efficiently to narrow domains with scarce data, offering an incremental improvement over standard finetuning methods.

31.9 · LG · Mar 16 · Code 24

Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning

Shubham Parashar, Shurui Gui, Xiner Li et al.

This work addresses the inefficiency of improving reasoning in small LLMs on mathematical and coding tasks, representing an incremental advance in RL-based training methods.

29.3 · LG · May 2 · Code

NoiseRater: Meta-Learned Noise Valuation for Diffusion Model Training

Fang Wu, Haokai Zhao, Da Xing et al.

This work addresses the underexplored problem of noise valuation for diffusion model training, offering a method to improve training efficiency and generation quality.

30.5 · SE · Mar 26

Composer 2 Technical Report

Cursor Research, Aaron Chan, Ahmed Shalaby et al. · berkeley, microsoft-research

This addresses the need for efficient coding models in software engineering, though it appears incremental as it builds on previous Composer models.

26.1 · LG · Mar 11 · Code 33

Meta-Reinforcement Learning with Self-Reflection for Agentic Search

Teng Xiao, Yige Yuan, Hamish Ivison et al.

This addresses the challenge of inefficient exploration in search agents, offering a domain-specific incremental improvement.

31.6 · SE · Apr 16

Scaling Test-Time Compute for Agentic Coding

Joongwon Kim, Wannan Yang, Kelvin Niu et al.

For developers of coding agents, this work addresses the bottleneck of scaling test-time compute for long-horizon tasks by focusing on representation and reuse of prior experience.

23.2 · LG · Mar 16 · Code 19k

Mamba-3: Improved Sequence Modeling using State Space Principles

Aakash Lahoti, Kevin Y. Li, Berlin Chen et al.

This addresses the inference efficiency bottleneck for LLM deployment, representing a significant but incremental advance over prior sub-quadratic models.

46.9 · CV · Jun 1 · Code 11k

Cosmos 3: Omnimodal World Models for Physical AI

Aditi, Niket Agarwal, Arslan Ali et al.

This work provides a scalable, general-purpose backbone for embodied agents by unifying multiple modalities into a single framework, which is a significant step for Physical AI research.

26.8 · LG · Mar 18 · Code 48

Procedural Generation of Algorithm Discovery Tasks in Machine Learning

Alexander D. Goldie, Zilin Wang, Adrian Hayler et al.

This work addresses the need for better evaluation tools for researchers developing automated machine learning algorithms, though it is incremental as it builds on procedural generation concepts from reinforcement learning.

21.8 · LG · Mar 16

From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence

Marc Finzi, Shikai Qiu, Yiding Jiang et al. · openai

This work addresses foundational issues in information theory for machine learning practitioners, offering a new framework for data selection and transformation, though it is incremental in building on existing concepts.

22.3 · LG · Mar 20 · Code

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Chiyu Ma, Shuo Yang, Kexin Huang et al.

This addresses the challenge of enhancing deep reasoning in AI models, offering a significant but incremental improvement over existing methods.

18.2 · LG · May 28

Revisiting Padded Transformer Expressivity: Which Architectural Choices Matter and Which Don't

Anej Svete, William Merrill, Ryan Cotterell et al.

This work provides a more robust and exact characterization of transformer expressivity for researchers and practitioners interested in the theoretical capabilities of these models, clarifying which architectural choices are critical.

26.0 · AI · Mar 12 · Code 333

Efficient Reasoning with Balanced Thinking

Yulin Li, Tengyao Tu, Li Ding et al.

This work addresses inefficiencies in large reasoning models (LRMs) for resource-constrained deployment, offering a plug-and-play solution, though it is incremental as it builds on existing methods to balance reasoning dynamics.

24.1 · AI · Mar 16

Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty

Jeonghye Kim, Xufang Luo, Minbeom Kim et al.

This provides insights for future reasoning model design, addressing a foundational issue in AI for researchers and developers.

23.5 · LG · Mar 10

ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning

Ruizhong Qiu, Hanqing Zeng, Yinglong Xia et al.

This addresses a critical bottleneck in parameter-efficient finetuning for large language models, offering a novel solution to improve model expressiveness.

28.3 · DC · Mar 13

ARL-Tangram: Unleash the Resource Efficiency in Agentic Reinforcement Learning

Bangjun Xiao, Yihao Zhao, Xiangwei Deng et al.

This work addresses resource management inefficiencies in cloud-based agentic RL systems, offering significant performance gains and cost savings. It is incremental in that it adds a novel orchestration approach on top of existing frameworks.

14.1 · CL · Mar 19

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Zhuolin Yang, Zihan Liu, Yang Chen et al. · nvidia

This provides a more parameter-efficient solution for AI systems requiring high-level reasoning, though it builds incrementally on previous cascade RL approaches.

37.5 · LG · Mar 26 · Code

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Yicheng Zou, Dongsheng Zhu, Lin Zhu et al.

This work addresses the need for large-scale, specialized AI models in scientific fields like chemistry and life sciences, representing a significant scaling effort rather than an incremental improvement.