CVJul 6, 2022
Gaze-Vergence-Controlled See-Through Vision in Augmented RealityZhimin Wang, Yuxin Zhao, Feng Lu
Augmented Reality (AR) see-through vision is an interesting research topic since it enables users to see through a wall and see the occluded objects. Most existing research focuses on the visual effects of see-through vision, while the interaction method is less studied. However, we argue that using common interaction modalities, e.g., midair click and speech, may not be the optimal way to control see-through vision. This is because when we want to see through something, it is physically related to our gaze depth/vergence and thus should be naturally controlled by the eyes. Following this idea, this paper proposes a novel gaze-vergence-controlled (GVC) see-through vision technique in AR. Since gaze depth is needed, we build a gaze tracking module with two infrared cameras and the corresponding algorithm and assemble it into the Microsoft HoloLens 2 to achieve gaze depth estimation. We then propose two different GVC modes for see-through vision to fit different scenarios. Extensive experimental results demonstrate that our gaze depth estimation is efficient and accurate. By comparing with conventional interaction modalities, our GVC techniques are also shown to be superior in terms of efficiency and more preferred by users. Finally, we present four example applications of gaze-vergence-controlled see-through vision.
SEJan 8
AdaptEval: A Benchmark for Evaluating Large Language Models on Code Snippet AdaptationTanghaoran Zhang, Xinjun Mao, Shangwen Wang et al.
Recent advancements in large language models (LLMs) have automated various software engineering tasks, with benchmarks emerging to evaluate their capabilities. However, for adaptation, a critical activity during code reuse, there is no benchmark to assess LLMs' performance, leaving their practical utility in this area unclear. To fill this gap, we propose AdaptEval, a benchmark designed to evaluate LLMs on code snippet adaptation. Unlike existing benchmarks, AdaptEval incorporates the following three distinctive features: First, Practical Context. Tasks in AdaptEval are derived from developers' practices, preserving rich contextual information from Stack Overflow and GitHub communities. Second, Multi-granularity Annotation. Each task is annotated with requirements at both task and adaptation levels, supporting the evaluation of LLMs across diverse adaptation scenarios. Third, Fine-grained Evaluation. AdaptEval includes a two-tier testing framework combining adaptation-level and function-level tests, which enables evaluating LLMs' performance across various individual adaptations. Based on AdaptEval, we conduct the first empirical study to evaluate six instruction-tuned LLMs and especially three reasoning LLMs on code snippet adaptation. Experimental results demonstrate that AdaptEval enables the assessment of LLMs' adaptation capabilities from various perspectives. It also provides critical insights into their current limitations, particularly their struggle to follow explicit instructions. We hope AdaptEval can facilitate further investigation and enhancement of LLMs' capabilities in code snippet adaptation, supporting their real-world applications.
49.4LOApr 13
Generalizing Unit Commitment Problem Solving via SAT-based DecouplingYuxin Zhao, Han Huang, Fangji Fu et al.
As the cornerstone of modern power systems, the Unit Commitment Problem (UC) is critical for ensuring operational security and economic efficiency in the ongoing global energy transition. However, existing UC studies typically propose specialized algorithms for specific variants and operational requirements, tightly coupling the algorithms to their target models and limiting their applicability to other variants. To address this issue, this paper proposes a method that uses SAT-based reduction to decouple the algorithm from the problem, which allows a single algorithm to solve multiple UC variants. By uniformly reducing all UC variants to SAT instances solvable by standard SAT solvers, this method makes the solving algorithm independent of the original UC variant, thus granting it broad applicability across diverse variants. Experimental results show that our method achieves better solution quality than specialized algorithms and demonstrates stronger generalizability. This work offers a fast and flexible framework for addressing newly emerging UC formulations in evolving power systems.
SENov 23, 2024
Instruct or Interact? Exploring and Eliciting LLMs' Capability in Code Snippet Adaptation Through Prompt EngineeringTanghaoran Zhang, Yue Yu, Xinjun Mao et al.
Code snippet adaptation is a fundamental activity in the software development process. Unlike code generation, code snippet adaptation is not a "free creation", which requires developers to tailor a given code snippet in order to fit specific requirements and the code context. Recently, large language models (LLMs) have confirmed their effectiveness in the code generation task with promising results. However, their performance on adaptation, a reuse-oriented and context-dependent code change prediction task, is still unclear. To bridge this gap, we conduct an empirical study to investigate the performance and issues of LLMs on the adaptation task. We first evaluate the adaptation performances of three popular LLMs and compare them to the code generation task. Our result indicates that their adaptation ability is weaker than generation, with a nearly 15% decrease on pass@1 and more context-related errors. By manually inspecting 200 cases, we further investigate the causes of LLMs' sub-optimal performance, which can be classified into three categories, i.e., Unclear Requirement, Requirement Misalignment and Context Misapplication. Based on the above empirical research, we propose an interactive prompting approach to eliciting LLMs' adaptation ability. Experimental result reveals that our approach greatly improve LLMs' adaptation performance. The best-performing Human-LLM interaction successfully solves 159 out of the 202 identified defects and improves the pass@1 and pass@5 by over 40% compared to the initial instruction-based prompt. Considering human efforts, we suggest multi-agent interaction as a trade-off, which can achieve comparable performance with excellent generalization ability. We deem that our approach could provide methodological assistance for autonomous code snippet reuse and adaptation with LLMs.
94.7LOApr 23
A general optimization solver based on OP-to-MaxSAT reductionYuxin Zhao, Han Huang, Zhifeng Hao
Optimization problems are fundamental in diverse fields, such as engineering, economics, and scientific computing. However, current algorithms are mostly designed for specific problem types and exhibit limited generality in solving multiple types of optimization problems. To enhance generality, we propose an automated reduction method named OP-to-MaxSAT reduction and a general optimization solver based on OP-to-MaxSAT reduction (GORED). GORED unifies the solving of multiple types of optimization problems by reducing the problems from optimization problems to MaxSAT instances in polynomial time and solving them using the state-of-the-art MaxSAT solver. The generality and solution quality of GORED are validated through experiments on 136 instances across 11 types of optimization problems. Experimental results demonstrate that GORED not only successfully solves a wide range of optimization problems but also yields solutions comparable in quality to those from existing methods, with no statistically significant differences observed. By introducing automated reduction, this work shifts the paradigm of optimization solvers from designing specialized algorithms for each problem type to employing a single algorithm for diverse problems. As a result, advances in this single algorithm can now drive progress in a wide range of optimization problems across various domains.
CVNov 28, 2025
MathSight: A Benchmark Exploring Have Vision-Language Models Really Seen in University-Level Mathematical Reasoning?Yuandong Wang, Yao Cui, Yuxin Zhao et al.
Recent advances in Vision-Language Models (VLMs) have achieved impressive progress in multimodal mathematical reasoning. Yet, how much visual information truly contributes to reasoning remains unclear. Existing benchmarks report strong overall performance but seldom isolate the role of the image modality, leaving open whether VLMs genuinely leverage visual understanding or merely depend on linguistic priors. To address this, we present MathSight, a university-level multimodal mathematical reasoning benchmark designed to disentangle and quantify the effect of visual input. Each problem includes multiple visual variants -- original, hand-drawn, photo-captured -- and a text-only condition for controlled comparison. Experiments on state-of-the-art VLMs reveal a consistent trend: the contribution of visual information diminishes with increasing problem difficulty. Remarkably, Qwen3-VL without any image input surpasses both its multimodal variants and GPT-5, underscoring the need for benchmarks like MathSight to advance genuine vision-grounded reasoning in future models.
AO-PHMar 27, 2025
Interpretable Cross-Sphere Multiscale Deep Learning Predicts ENSO Skilfully Beyond 2 YearsRixu Hao, Yuxin Zhao, Shaoqing Zhang et al.
El Niño-Southern Oscillation (ENSO) exerts global climate and societal impacts, but real-time prediction with lead times beyond one year remains challenging. Dynamical models suffer from large biases and uncertainties, while deep learning struggles with interpretability and multi-scale dynamics. Here, we introduce PTSTnet, an interpretable model that unifies dynamical processes and cross-scale spatiotemporal learning in an innovative neural-network framework with physics-encoding learning. PTSTnet produces interpretable predictions significantly outperforming state-of-the-art benchmarks with lead times beyond 24 months, providing physical insights into error propagation in ocean-atmosphere interactions. PTSTnet learns feature representations with physical consistency from sparse data to tackle inherent multi-scale and multi-physics challenges underlying ocean-atmosphere processes, thereby inherently enhancing long-term prediction skill. Our successful realizations mark substantial steps forward in interpretable insights into innovative neural ocean modelling.
LGFeb 15, 2020
Federated Neural Architecture SearchJinliang Yuan, Mengwei Xu, Yuxin Zhao et al.
To preserve user privacy while enabling mobile intelligence, techniques have been proposed to train deep neural networks on decentralized data. However, training over decentralized data makes the design of neural architecture quite difficult as it already was. Such difficulty is further amplified when designing and deploying different neural architectures for heterogeneous mobile platforms. In this work, we propose an automatic neural architecture search into the decentralized training, as a new DNN training paradigm called Federated Neural Architecture Search, namely federated NAS. To deal with the primary challenge of limited on-client computational and communication resources, we present FedNAS, a highly optimized framework for efficient federated NAS. FedNAS fully exploits the key opportunity of insufficient model candidate re-training during the architecture search process, and incorporates three key optimizations: parallel candidates training on partial clients, early dropping candidates with inferior performance, and dynamic round numbers. Tested on large-scale datasets and typical CNN architectures, FedNAS achieves comparable model accuracy as state-of-the-art NAS algorithm that trains models with centralized data, and also reduces the client cost by up to two orders of magnitude compared to a straightforward design of federated NAS.
LGJul 5, 2019
Gaussian Processes for Analyzing Positioned Trajectories in SportsYuxin Zhao, Feng Yin, Fredrik Gunnarsson et al.
Kernel-based machine learning approaches are gaining increasing interest for exploring and modeling large dataset in recent years. Gaussian process (GP) is one example of such kernel-based approaches, which can provide very good performance for nonlinear modeling problems. In this work, we first propose a grey-box modeling approach to analyze the forces in cross country skiing races. To be more precise, a disciplined set of kinetic motion model formulae is combined with data-driven Gaussian process regression model, which accounts for everything unknown in the system. Then, a modeling approach is proposed to analyze the kinetic flow of both individual and clusters of skiers. The proposed approaches can be generally applied to use cases where positioned trajectories and kinetic measurements are available. The proposed approaches are evaluated using data collected from the Falun Nordic World Ski Championships 2015, in particular the Men's cross country $4\times10$ km relay. Forces during the cross country skiing races are analyzed and compared. Velocity models for skiers at different competition stages are also evaluated. Finally, the comparisons between the grey-box and black-box approach are carried out, where the grey-box approach can reduce the predictive uncertainty by $30\%$ to $40\%$.
NEApr 21, 2018
Swarm Intelligence: Past, Present and FutureXin-She Yang, Suash Deb, Yuxin Zhao et al.
Many optimization problems in science and engineering are challenging to solve, and the current trend is to use swarm intelligence (SI) and SI-based algorithms to tackle such challenging problems. Some significant developments have been made in recent years, though there are still many open problems in this area. This paper provides a short but timely analysis about SI-based algorithms and their links with self-organization. Different characteristics and properties are analyzed here from both mathematical and qualitative perspectives. Future research directions are outlined and open questions are also highlighted.
OCApr 21, 2018
Global Convergence Analysis of the Flower Pollination Algorithm: A Discrete-Time Markov Chain ApproachXingshi He, Xin-She Yang, Mehmet Karamanoglu et al.
Flower pollination algorithm is a recent metaheuristic algorithm for solving nonlinear global optimization problems. The algorithm has also been extended to solve multiobjective optimization with promising results. In this work, we analyze this algorithm mathematically and prove its convergence properties by using Markov chain theory. By constructing the appropriate transition probability for a population of flower pollen and using the homogeneity property, it can be shown that the constructed stochastic sequences can converge to the optimal set. Under the two proper conditions for convergence, it is proved that the simplified flower pollination algorithm can indeed satisfy these convergence conditions and thus the global convergence of this algorithm can be guaranteed. Numerical experiments are used to demonstrate that the flower pollination algorithm can converge quickly in practice and can thus achieve global optimality efficiently.