CV · Jan 23
A Step to Decouple Optimization in 3DGS
Renjie Ding, Yaonan Wang, Min Liu et al.
3D Gaussian Splatting (3DGS) has emerged as a powerful technique for real-time novel view synthesis. Because it is an explicit representation optimized through gradient propagation among primitives, 3DGS adopts optimization practices that are standard for deep neural networks (DNNs), such as synchronous weight updating and Adam with its adaptive gradient. However, given the physical significance and specific design of 3DGS, two details of its optimization have been overlooked: (i) update step coupling, which induces optimizer state rescaling and costly attribute updates outside the viewpoints, and (ii) gradient coupling in the moment, which may lead to under- or over-effective regularization. Such complex coupling remains under-explored. After revisiting the optimization of 3DGS, we take a step to decouple it and recompose the process into Sparse Adam, Re-State Regularization, and Decoupled Attribute Regularization. Through extensive experiments under the 3DGS and 3DGS-MCMC frameworks, our work provides a deeper understanding of these components. Finally, based on the empirical analysis, we re-design the optimization and propose AdamW-GS by re-coupling the beneficial components, achieving better optimization efficiency and representation effectiveness simultaneously.
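The "gradient coupling in the moment" issue can be illustrated with a generic scalar Adam step. This is a minimal sketch and not the paper's AdamW-GS: the function `adam_step` and its parameters are hypothetical names, and the decoupled branch follows the standard AdamW idea of applying weight decay directly to the parameter instead of adding it to the gradient.

```python
import math

def adam_step(p, grad, m, v, t, lr=0.01, wd=0.1, decoupled=False,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One scalar Adam step (illustrative); returns (new_p, new_m, new_v)."""
    if decoupled:
        # AdamW style: shrink the parameter directly, outside the moments.
        p = p * (1.0 - lr * wd)
    else:
        # Coupled L2: the penalty gradient enters both moment estimates.
        grad = grad + wd * p
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)  # bias-corrected second moment
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, m, v

# First step with a zero data gradient, so only the regularizer acts.
coupled_strong, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, wd=0.1)
coupled_weak, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, wd=0.001)
decoupled_strong, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, wd=0.1, decoupled=True)
```

In this toy setting the two coupled runs shrink the parameter by roughly `lr` regardless of the decay strength, because Adam normalizes the penalty gradient by its own magnitude. The decoupled run shrinks it by `lr * wd`. This is one way to read the abstract's "under- or over-effective regularization".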
NA · Mar 27
Average block nonlinear Kaczmarz methods with adaptive momentum for nonlinear systems of equations
Renjie Ding, Dongling Wang, Jun Zou
The Kaczmarz method is widely recognized as an efficient iterative algorithm for solving large-scale linear systems, owing to its simplicity and low memory requirements. However, the development of its nonlinear extensions for solving large-scale nonlinear systems has seen limited progress. In this work, we introduce a new family of momentum-accelerated averaging block nonlinear Kaczmarz methods tailored for large-scale nonlinear systems and ill-posed problems. Our contributions are twofold: (1) We develop an adaptive strategy for selecting step sizes and momentum coefficients, leading to a new average block nonlinear Kaczmarz method with adaptive momentum (ABNKAm). This algorithm achieves high computational efficiency by requiring only minimal inner-product computations per iteration, which significantly reduces both arithmetic complexity and memory usage. (2) We establish rigorous convergence of the ABNKAm under mild assumptions, proving that the method converges exponentially to the unique solution nearest to the initial point. Moreover, under suitable conditions, we provide a theoretical justification of acceleration of the proposed ABNKAm with momentum. Extensive numerical experiments demonstrate that ABNKAm outperforms existing nonlinear Kaczmarz variants in terms of both iteration count and computational time, with particularly notable gains in large-scale problems.
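For orientation, the basic cyclic nonlinear Kaczmarz iteration that such methods build on can be sketched as follows. This is not ABNKAm: it has no block averaging, adaptive step sizes, or momentum. The function name `nonlinear_kaczmarz` and the two-equation test system are assumptions made for illustration.

```python
import math

def nonlinear_kaczmarz(fs, grads, x0, sweeps=50):
    """Cyclic nonlinear Kaczmarz: for each equation f_i(x) = 0 in turn,
    move x along grad f_i by f_i(x) / ||grad f_i(x)||^2."""
    x = list(x0)
    for _ in range(sweeps):
        for f, grad in zip(fs, grads):
            g = grad(x)
            norm_sq = sum(gi * gi for gi in g)
            if norm_sq == 0.0:
                continue  # no usable direction for this equation
            step = f(x) / norm_sq
            x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# Illustrative system: x^2 + y^2 = 4 and x = y.
fs = [lambda v: v[0] ** 2 + v[1] ** 2 - 4.0,
      lambda v: v[0] - v[1]]
grads = [lambda v: [2.0 * v[0], 2.0 * v[1]],
         lambda v: [1.0, -1.0]]
solution = nonlinear_kaczmarz(fs, grads, [1.0, 0.5])
```

Each update touches one equation and needs only a gradient and an inner product, which is where the low per-iteration cost of Kaczmarz-type methods comes from.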
AI · Oct 16, 2025
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling
Jianghao Lin, Yuanyuan Shi, Xin Peng et al.
Large language models (LLMs) are increasingly demonstrating strong capabilities as autonomous agents, with function calling serving as a core mechanism for interaction with the environment. Meanwhile, inference scaling has become a cutting-edge technique to enhance LLM performance by allocating more computational resources during the inference process. However, current research on inference scaling primarily focuses on unstructured output generation tasks, leaving its application to structured outputs, like function calling, largely underexplored. To bridge this gap, we propose an inference scaling framework that combines fine-grained beam search with a process reward model, ToolPRM, which scores the internal steps of each single function call. To train ToolPRM, we construct the first fine-grained intra-call process supervision dataset, automatically annotated with function-masking techniques to provide step-level rewards for structured tool-use reasoning. Extensive experiments demonstrate that ToolPRM outperforms coarse-grained and outcome reward models in terms of predictive accuracy, indicating its stronger capability in supervising the function calling inference process. Inference scaling equipped with ToolPRM also significantly improves the backbone model's performance across various function calling tasks and benchmarks. More importantly, we reveal a key principle for applying inference scaling techniques to structured outputs: "explore more but retain less", owing to the unrecoverable nature of structured function calling generation.
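The "explore more but retain less" principle can be sketched as a step-level beam search in which many candidates are expanded at each step but only a few prefixes survive. This is a toy illustration and not the paper's implementation: `prm_beam_search`, `expand`, and `score` are hypothetical names, and `score` is a hand-written stand-in for a learned process reward model such as ToolPRM.

```python
def prm_beam_search(expand, score, start, steps, explore, retain):
    """Step-level beam search: try up to `explore` continuations per prefix,
    then keep only the `retain` best-scoring prefixes."""
    beams = [list(start)]
    for _ in range(steps):
        candidates = []
        for prefix in beams:
            for nxt in expand(prefix)[:explore]:
                candidates.append(prefix + [nxt])
        # Rank partial calls with the step-level scorer.
        candidates.sort(key=score, reverse=True)
        beams = candidates[:retain]
    return beams[0]

# Toy setup (assumed): a call is built as [function name, argument name, value].
TARGET = ["get_weather", "city", "Paris"]
OPTIONS = [["get_time", "get_weather", "get_news"],
           ["date", "city"],
           ["Rome", "Paris", "Oslo"]]

def expand(prefix):
    # Candidate continuations for the current step.
    return OPTIONS[len(prefix)]

def score(prefix):
    # Stand-in scorer: number of steps that match the intended call.
    return sum(a == b for a, b in zip(prefix, TARGET))

best_call = prm_beam_search(expand, score, [], steps=3, explore=3, retain=1)
```

A wide `explore` with a narrow `retain` prunes a wrong function name or argument early, which matters when a structured call cannot be repaired by later steps.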
CL · Sep 27, 2025
PARL-MT: Learning to Call Functions in Multi-Turn Conversation with Progress Awareness
Huacan Chai, Zijie Cao, Maolin Ran et al.
Large language models (LLMs) have achieved impressive success in single-turn function calling, yet real-world applications such as travel planning or multi-stage data analysis typically unfold across multi-turn conversations. In these settings, LLMs must not only issue accurate function calls at each step but also maintain progress awareness, the ability to summarize past interactions and plan future actions to ensure coherent, long-horizon task execution. Existing approaches, however, either reduce multi-turn training to isolated single-turn samples, which neglects task-level planning, or employ end-to-end reinforcement learning (RL) that struggles with redundancy and lacks explicit integration of progress awareness. To overcome these limitations, we introduce PARL-MT, a framework that explicitly incorporates progress awareness into LLM training for multi-turn function calling. PARL-MT combines (i) a Progress Awareness Generation (PAG) pipeline, which automatically constructs datasets coupling conversation summaries with future task planning, and (ii) a Progress Awareness-Guided Reinforcement Learning (PAG-RL) algorithm, which integrates progress awareness into RL training to reduce contextual redundancy and improve alignment between local actions and global task completion. Empirical results on two public benchmarks demonstrate that PARL-MT significantly outperforms existing methods, highlighting the effectiveness of progress awareness in enabling robust and efficient multi-turn function calling.