Yangyang Zhao

h-index20

9papers

45citations

Novelty51%

AI Score53

Ranked #29,668 of 205,806 authors (top 14%)#6,205 in CL (top 19%)

9 Papers

LGNov 2, 2025

Motion-Robust Multimodal Fusion of PPG and Accelerometer Signals for Three-Class Heart Rhythm Classification

Yangyang Zhao, Matti Kaisti, Olli Lahdenoja et al.

Atrial fibrillation (AF) is a leading cause of stroke and mortality, particularly in elderly patients. Wrist-worn photoplethysmography (PPG) enables non-invasive, continuous rhythm monitoring, yet suffers from significant vulnerability to motion artifacts and physiological noise. Many existing approaches rely solely on single-channel PPG and are limited to binary AF detection, often failing to capture the broader range of arrhythmias encountered in clinical settings. We introduce RhythmiNet, a residual neural network enhanced with temporal and channel attention modules that jointly leverage PPG and accelerometer (ACC) signals. The model performs three-class rhythm classification: AF, sinus rhythm (SR), and Other. To assess robustness across varying movement conditions, test data are stratified by accelerometer-based motion intensity percentiles without excluding any segments. RhythmiNet achieved a 4.3% improvement in macro-AUC over the PPG-only baseline. In addition, performance surpassed a logistic regression model based on handcrafted HRV features by 12%, highlighting the benefit of multimodal fusion and attention-based learning in noisy, real-world clinical data.

6.0SEMar 17

SoK: Systematizing Software Artifacts Traceability via Associations, Techniques, and Applications

Zhifei Chen, Lata Yi, Liming Nie et al.

Software development relies heavily on traceability links between various software artifacts to ensure quality and facilitate maintenance. While automated traceability recovery techniques have advanced for different artifact pairs, the field remains fragmented with an incomplete overview of artifact associations, ambiguous linking techniques, and fragmented knowledge of application scenarios. To bridge these gaps, we conducted a systematic literature review on software traceability recovery to synthesize the linked artifacts, recovery tools, and usage scenarios across the traceability ecosystem. First, we constructed the first global artifacts traceability graph of 23 associations among 22 artifact types, exposing a severe research imbalance that heavily favors code-related links. Second, while recovery techniques are shifting toward deep semantic models, a reproducibility crisis persists (e.g., only 37% of studies released code); to address this, we provided a comprehensive evaluation framework including a technical decision map and standardized benchmarks. Finally, we quantified an industrial adoption gap (i.e., 95% of tools remain confined to academia) and proposed a role-centric framework to dynamically align artifact paths with concrete engineering activities. This review contributes a coherent knowledge framework for artifacts traceability research, identifies current trends, and provides directions for future work.

76.3LGMar 16

3DTCR: A Physics-Based Generative Framework for Vortex-Following 3D Reconstruction to Improve Tropical Cyclone Intensity Forecasting

Jun Liu, Xiaohui Zhong, Kai Zheng et al.

Tropical cyclone (TC) intensity forecasting remains challenging as current numerical and AI-based weather models fail to satisfactorily represent extreme TC structure and intensity. Although intensity time-series forecasting has achieved significant advances, it outputs intensity sequences rather than the three-dimensional inner-core fine-scale structure and physical mechanisms governing TC evolution. High-resolution numerical simulations can capture these features but remain computationally expensive and inefficient for large-scale operational applications. Here we present 3DTCR, a physics-based generative framework combining physical constraints with generative AI efficiency for 3D TC structure reconstruction. Trained on a six-year, 3-km-resolution moving-domain WRF dataset, 3DTCR enables region-adaptive vortex-following reconstruction using conditional Flow Matching(CFM), optimized via latent domain adaptation and two-stage transfer learning. The framework mitigates limitations imposed by low-resolution targets and over-smoothed forecasts, improving the representation of TC inner-core structure and intensity while maintaining track stability. Results demonstrate that 3DTCR outperforms the ECMWF high-resolution forecasting system (ECMWF-HRES) in TC intensity prediction at nearly all lead times up to 5 days and reduces the RMSE of maximum WS10M by 36.5% relative to its FuXi inputs. These findings highlight 3DTCR as a physics-based generative framework that efficiently resolves fine-scale structures at lower computational cost, which may offer a promising avenue for improving TC intensity forecasting.

67.9CVMay 11

Not Blind but Silenced: Rebalancing Vision and Language via Adversarial Counter-Commonsense Equilibrium

Qingxin Xiao, Peilin Zhao, Yangyang Zhao et al.

During MLLM decoding, attention often abnormally concentrates on irrelevant image tokens. While existing research dismisses this as invalid noise and forcibly redirects attention to compel focusing on key image information, we argue these tokens are critical carriers of visual and narrative logic, and such coercive corrections exacerbate visual-language imbalance. Adopting a "decoding-as-game" perspective, we reveal that hallucinations stem from an equilibrium imbalance between linguistic priors and visual information. We propose Adversarial Counter-Commonsense Equilibrium (ACE), a training-free framework that perturbs visual context via counter-commonsense patches. Leveraging the fact that authentic visual features remain stable under perturbation while hallucinations fluctuate, ACE implements a dynamic game decoding strategy. This approach precisely suppresses perturbation-sensitive priors while compensating for stable visual signals to restore balance. Extensive experiments demonstrate that ACE, as a plug-and-play strategy, enhances model trustworthiness with negligible inference overhead.

77.8CLApr 25

Bridging Reasoning and Action: Hybrid LLM-RL Framework for Efficient Cross-Domain Task-Oriented Dialogue

Yangyang Zhao, Linfan Dai, Li Cai et al.

Cross-domain task-oriented dialogue requires reasoning over implicit and explicit feasibility constraints while planning long-horizon, multi-turn actions. Large language models (LLMs) can infer such constraints but are unreliable over long horizons, while Reinforcement learning (RL) optimizes long-horizon behavior yet cannot recover constraints from raw dialogue. Naively coupling LLMs with RL is therefore brittle: unverified or unstructured LLM outputs can corrupt state representations and misguide policy learning. Motivated by this, we propose Verified LLM-Knowledge empowered RL (VLK-RL), a hybrid framework that makes LLM-derived constraint reasoning usable for RL. VLK-RL first elicits candidate constraints with an LLM and then verifies them via a dual-role cross-examination procedure to suppress hallucinations and cross-turn inconsistencies. The verified constraints are mapped into ontology-aligned slot-value representations, yielding a structured, constraint-aware state for RL policy optimization. Experiments across multiple benchmarks demonstrate that VLK-RL significantly improves generalization and robustness, outperforming strong single-model baselines on long-horizon tasks.

CLJun 4, 2025

An Efficient Task-Oriented Dialogue Policy: Evolutionary Reinforcement Learning Injected by Elite Individuals

Yangyang Zhao, Ben Niu, Libo Qin et al.

Deep Reinforcement Learning (DRL) is widely used in task-oriented dialogue systems to optimize dialogue policy, but it struggles to balance exploration and exploitation due to the high dimensionality of state and action spaces. This challenge often results in local optima or poor convergence. Evolutionary Algorithms (EAs) have been proven to effectively explore the solution space of neural networks by maintaining population diversity. Inspired by this, we innovatively combine the global search capabilities of EA with the local optimization of DRL to achieve a balance between exploration and exploitation. Nevertheless, the inherent flexibility of natural language in dialogue tasks complicates this direct integration, leading to prolonged evolutionary times. Thus, we further propose an elite individual injection mechanism to enhance EA's search efficiency by adaptively introducing best-performing individuals into the population. Experiments across four datasets show that our approach significantly improves the balance between exploration and exploitation, boosting performance. Moreover, the effectiveness of the EII mechanism in reducing exploration time has been demonstrated, achieving an efficient integration of EA and DRL on task-oriented dialogue policy tasks.

HCMay 5, 2023

Rescue Conversations from Dead-ends: Efficient Exploration for Task-oriented Dialogue Policy Optimization

Yangyang Zhao, Zhenyu Wang, Mehdi Dastani et al.

Training a dialogue policy using deep reinforcement learning requires a lot of exploration of the environment. The amount of wasted invalid exploration makes their learning inefficient. In this paper, we find and define an important reason for the invalid exploration: dead-ends. When a conversation enters a dead-end state, regardless of the actions taken afterward, it will continue in a dead-end trajectory until the agent reaches a termination state or maximum turn. We propose a dead-end resurrection (DDR) algorithm that detects the initial dead-end state in a timely and efficient manner and provides a rescue action to guide and correct the exploration direction. To prevent dialogue policies from repeatedly making the same mistake, DDR also performs dialogue data augmentation by adding relevant experiences containing dead-end states. We first validate the dead-end detection reliability and then demonstrate the effectiveness and generality of the method by reporting experimental results on several dialogue datasets from different domains.

CLDec 28, 2020

Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning

Yangyang Zhao, Zhenyu Wang, Zhenhua Huang

Dialogue policy learning based on reinforcement learning is difficult to be applied to real users to train dialogue agents from scratch because of the high cost. User simulators, which choose random user goals for the dialogue agent to train on, have been considered as an affordable substitute for real users. However, this random sampling method ignores the law of human learning, making the learned dialogue policy inefficient and unstable. We propose a novel framework, Automatic Curriculum Learning-based Deep Q-Network (ACL-DQN), which replaces the traditional random sampling method with a teacher policy model to realize the dialogue policy for automatic curriculum learning. The teacher model arranges a meaningful ordered curriculum and automatically adjusts it by monitoring the learning progress of the dialogue agent and the over-repetition penalty without any requirement of prior knowledge. The learning progress of the dialogue agent reflects the relationship between the dialogue agent's ability and the sampled goals' difficulty for sample efficiency. The over-repetition penalty guarantees the sampled diversity. Experiments show that the ACL-DQN significantly improves the effectiveness and stability of dialogue tasks with a statistically significant margin. Furthermore, the framework can be further improved by equipping with different curriculum schedules, which demonstrates that the framework has strong generalizability.

CRApr 10, 2018

PULP: Inner-process Isolation based on the Program Counter and Data Memory Address

Xiaojing Zhu, Mingyu Chen, Yangyang Zhao et al.

Plenty of in-process vulnerabilities are blamed on various out of bound memory accesses. Previous prevention methods are mainly based on software checking associated with performance overhead, while traditional hardware protection mechanisms only work for inter-process memory accesses. In this paper we propose a novel hardware based in-process isolation system called PULP (Protection by User Level Partition). PULP modifies processor core by associating program counter and virtual memory address to achieve in-process data isolation. PULP partitions the program into two distinct parts, one is reliable, called primary functions, and the other is unreliable, called secondary functions, the accessible memory range of which can be configured via APIs. PULP automatically checks the memory bound when executing load/store operations in secondary functions. A RISC-V based FPGA prototype is implementated and functional test shows that PULP can effectively prevent in-process bug, including the Heartbleed and other buffer overflow vulnerabilities, etc. The total runtime overhead of PULP is negligible, as there is no extra runtime overhead besides configuring the API. We run SPEC2006 to evaluate the average performance, considering the LIBC functions as secondary functions. Experimental timing results show that, running bzip2, mcf, and libquantum, PULP bears low runtime overhead (less than 0.1%). Analysis also shows that PULP can be used effectively to prevent the newest "Spectre" bug which threats nearly all out-of-order processors.