Yi Gong

CV
h-index: 14
14 papers
146 citations
Novelty: 56%
AI Score: 56

14 Papers

29.5 LG Mar 14
Fronto-parietal and fronto-temporal EEG coherence as predictive neuromarkers of transcutaneous auricular vagus nerve stimulation response in treatment-resistant schizophrenia: A machine learning study

Yapeng Cui, Ruoxi Yun, Shumin Zhang et al.

Response variability limits the clinical utility of transcutaneous auricular vagus nerve stimulation (taVNS) for negative symptoms in treatment-resistant schizophrenia (TRS). This study aimed to develop an electroencephalography (EEG)-based machine learning (ML) model to predict individual response and explore associated neurophysiological mechanisms. We used ML to develop and validate predictive models based on pre-treatment EEG data features (power, coherence, and dynamic functional connectivity) from 50 TRS patients enrolled in the taVNS trial, within a nested cross-validation framework. Participants received 20 sessions of active or sham taVNS (n = 25 each) over two weeks, followed by a two-week follow-up. The prediction target was the percentage change in the positive and negative syndrome scale-factor score for negative symptoms (PANSS-FSNS) from baseline to post-treatment, with further evaluation of model specificity and neurophysiological relevance. The optimal model accurately predicted taVNS response in the active group, with predicted PANSS-FSNS changes strongly correlated with observed changes (r = 0.87, p < .001); permutation testing confirmed performance above chance (p < .001). Nine consistently retained features were identified, predominantly fronto-parietal and fronto-temporal coherence features. Negligible predictive performance in the sham group and failure to predict positive symptom change support the predictive specificity of this oscillatory signature for taVNS-related negative symptom improvement. Two coherence features within fronto-parietal-temporal networks showed post-taVNS changes significantly associated with symptom improvement, suggesting dual roles as predictors and potential therapeutic targets. EEG oscillatory neuromarkers enable accurate prediction of individual taVNS response in TRS, supporting mechanism-informed precision neuromodulation strategies.

CV Nov 21, 2023
Hyb-NeRF: A Multiresolution Hybrid Encoding for Neural Radiance Fields

Yifan Wang, Yi Gong, Yuan Zeng

Recent advances in neural radiance fields (NeRF) have enabled high-fidelity scene reconstruction for novel view synthesis. However, NeRF requires hundreds of network evaluations per pixel to approximate a volume rendering integral, making it slow to train. Caching NeRFs into explicit data structures can effectively enhance rendering speed but at the cost of higher memory usage. To address these issues, we present Hyb-NeRF, a novel neural radiance field with a multi-resolution hybrid encoding that achieves efficient neural modeling and fast rendering, which also allows for high-quality novel view synthesis. The key idea of Hyb-NeRF is to represent the scene using different encoding strategies from coarse-to-fine resolution levels. Hyb-NeRF exploits memory-efficient learnable positional features at coarse resolutions and the fast optimization speed and local details of hash-based feature grids at fine resolutions. In addition, to further boost performance, we embed cone tracing-based features in our learnable positional encoding, which eliminates encoding ambiguity and reduces aliasing artifacts. Extensive experiments on both synthetic and real-world datasets show that Hyb-NeRF achieves faster rendering speed with better rendering quality and even a lower memory footprint in comparison to previous state-of-the-art methods.

CV May 24, 2022
Collaborative 3D Object Detection for Automatic Vehicle Systems via Learnable Communications

Junyong Wang, Yuan Zeng, Yi Gong

Accurate detection of objects in 3D point clouds is a key problem in autonomous driving systems. Collaborative perception can incorporate information from spatially diverse sensors and provide significant benefits for improving the perception accuracy of autonomous driving systems. In this work, we consider that the autonomous vehicle uses local point cloud data and combines information from neighboring infrastructures through wireless links for cooperative 3D object detection. However, information sharing between the vehicle and infrastructures in predefined communication schemes may result in communication congestion and/or bring limited performance improvement. To this end, we propose a novel collaborative 3D object detection framework that consists of three components: feature learning networks that map point clouds into feature maps; an efficient communication block that propagates compact and fine-grained query feature maps from the vehicle to support infrastructures and optimizes attention weights between query and key to refine support feature maps; and a region proposal network that fuses local feature maps and weighted support feature maps for 3D object detection. We evaluate the performance of the proposed framework using a synthetic cooperative dataset created in two complex driving scenarios: a roundabout and a T-junction. Experimental results and bandwidth usage analysis demonstrate that our approach can save communication and computation costs and significantly improve detection performance under different detection difficulties in all scenarios.

CV Feb 13, 2023
Learning to Scale Temperature in Masked Self-Attention for Image Inpainting

Xiang Zhou, Yuan Zeng, Yi Gong

Recent advances in deep generative adversarial networks (GANs) and self-attention mechanisms have led to significant improvements in the challenging task of inpainting large missing regions in an image. These methods integrate self-attention mechanisms in neural networks to utilize surrounding neural elements based on their correlation and help the networks capture long-range dependencies. Temperature is a parameter in the Softmax function used in self-attention, and it enables biasing the distribution of attention scores towards a handful of similar patches. Most existing self-attention mechanisms in image inpainting are convolution-based and set the temperature as a constant, performing patch matching in a limited feature space. In this work, we analyze the artifacts and training problems in previous self-attention mechanisms, and redesign the temperature learning network as well as the self-attention mechanism to address them. We present an image inpainting framework with a multi-head temperature masked self-attention mechanism, which provides stable and efficient temperature learning and uses multiple distant contextual information for high-quality image inpainting. In addition to improving the image quality of inpainting results, we generalize the proposed model to user-guided image editing by introducing a new sketch generation method. Extensive experiments on various datasets such as Paris StreetView, CelebA-HQ and Places2 clearly demonstrate that our method not only generates more natural inpainting results than previous works, both in terms of perceptual image quality and quantitative metrics, but also helps users generate more flexible results that are related to their sketch guidance.
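The role of temperature described in this abstract can be illustrated with a minimal sketch. It assumes a plain Softmax over a handful of made-up similarity scores; the function name, the scores, and the two temperature values are hypothetical, and the masking and the learned, multi-head temperature of the paper are not modelled.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Softmax of scores / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical similarity scores between one query patch and four context patches.
scores = [2.0, 1.0, 0.5, 0.1]

sharp = softmax_with_temperature(scores, 0.25)  # low temperature
flat = softmax_with_temperature(scores, 4.0)    # high temperature

print([round(w, 3) for w in sharp])
print([round(w, 3) for w in flat])
```

With a low temperature nearly all of the attention weight lands on the best-matching patch, while a high temperature spreads it almost evenly across patches.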

CL Mar 2
ALTER: Asymmetric LoRA for Token-Entropy-Guided Unlearning of LLMs

Xunlei Chen, Jinyu Guo, Yuang Li et al.

Large language models (LLMs) have advanced to encompass extensive knowledge across diverse domains. Yet controlling what an LLM should not know is important for ensuring alignment and thus safe use. However, effective unlearning in LLMs is difficult due to the fuzzy boundary between knowledge retention and forgetting. This challenge is exacerbated by entangled parameter spaces from continuous multi-domain training, often resulting in collateral damage, especially under aggressive unlearning strategies. Furthermore, the computational overhead required to optimize state-of-the-art (SOTA) models with billions of parameters poses an additional barrier. In this work, we present ALTER, a lightweight unlearning framework for LLMs to address both the challenges of knowledge entanglement and unlearning efficiency. ALTER operates through two phases: (I) high-entropy tokens are captured and learned via the shared A matrix in LoRA, followed by (II) an asymmetric LoRA architecture that achieves a specified forgetting objective by parameter isolation and unlearning tokens within the target subdomains. This serves as a new research direction for achieving unlearning via token-level isolation in the asymmetric framework. ALTER achieves SOTA performance on TOFU, WMDP, and MUSE benchmarks with over 95% forget quality and shows minimal side effects through preserving foundational tokens. By decoupling unlearning from LLMs' billion-scale parameters, this framework delivers excellent efficiency while preserving over 90% of model utility, exceeding baseline preservation rates of 47.8-83.6%.

28.7 CL Mar 31
MemRerank: Preference Memory for Personalized Product Reranking

Zhiyuan Peng, Xuyang Wu, Huaixiao Tou et al.

LLM-based shopping agents increasingly rely on long purchase histories and multi-turn interactions for personalization, yet naively appending raw history to prompts is often ineffective due to noise, length, and relevance mismatch. We propose MemRerank, a preference memory framework that distills user purchase history into concise, query-independent signals for personalized product reranking. To study this problem, we build an end-to-end benchmark and evaluation framework centered on an LLM-based 1-in-5 selection task, which measures both memory quality and downstream reranking utility. We further train the memory extractor with reinforcement learning (RL), using downstream reranking performance as supervision. Experiments with two LLM-based rerankers show that MemRerank consistently outperforms no-memory, raw-history, and off-the-shelf memory baselines, yielding up to +10.61 absolute points in 1-in-5 accuracy. These results suggest that explicit preference memory is a practical and effective building block for personalization in agentic e-commerce systems.

SP Nov 5, 2021 Code
Learning of Time-Frequency Attention Mechanism for Automatic Modulation Recognition

Shangao Lin, Yuan Zeng, Yi Gong

Recent learning-based image classification and speech recognition approaches make extensive use of attention mechanisms to achieve state-of-the-art recognition power, which demonstrates the effectiveness of attention mechanisms. Motivated by the fact that the frequency and time information of modulated radio signals are crucial for modulation mode recognition, this paper proposes a time-frequency attention mechanism for a convolutional neural network (CNN)-based modulation recognition framework. The proposed time-frequency attention module is designed to learn which channel, frequency and time information is more meaningful in CNN for modulation recognition. We analyze the effectiveness of the proposed time-frequency attention mechanism and compare the proposed method with two existing learning-based methods. Experiments on an open-source modulation recognition dataset show that the recognition performance of the proposed framework is better than those of the framework without time-frequency attention and existing learning-based methods.

CV Nov 5, 2025
A Plug-and-Play Framework for Volumetric Light-Sheet Image Reconstruction

Yi Gong, Xinyuan Zhang, Jichen Chai et al.

Cardiac contraction is a rapid, coordinated process that unfolds across three-dimensional tissue on millisecond timescales. Traditional optical imaging is often inadequate for capturing dynamic cellular structure in the beating heart because of a fundamental trade-off between spatial and temporal resolution. To overcome these limitations, we propose a high-performance computational imaging framework that integrates Compressive Sensing (CS) with Light-Sheet Microscopy (LSM) for efficient, low-phototoxic cardiac imaging. The system performs compressed acquisition of fluorescence signals via random binary mask coding using a Digital Micromirror Device (DMD). We propose a Plug-and-Play (PnP) framework, solved using the alternating direction method of multipliers (ADMM), which flexibly incorporates advanced denoisers, including Tikhonov, Total Variation (TV), and BM3D. To preserve structural continuity in dynamic imaging, we further introduce temporal regularization enforcing smoothness between adjacent z-slices. Experimental results on zebrafish heart imaging under high compression ratios demonstrate that the proposed method successfully reconstructs cellular structures with excellent denoising performance and image clarity, validating the effectiveness and robustness of our algorithm in real-world high-speed, low-light biological imaging scenarios.

7.0 SP Apr 30
Sensing-Assisted Channel Estimation for Flexible-Antenna Systems: A Unified Framework

Ruoxiao Cao, Wentao Yu, Zixin Wang et al.

Flexible-antenna systems, which use a small number of radio frequency (RF) chains to dynamically access a large set of candidate antenna locations, have emerged as a hardware-efficient architecture for 6G networks. Acquiring accurate channel state information (CSI) is critical for these systems, but it typically incurs a prohibitive pilot overhead that scales with the massive number of candidate locations. To address this bottleneck, we propose a unified sensing-assisted channel estimation framework tailored for flexible-antenna systems. It reduces the full CSI reconstruction problem to a consistent two-stage process: it first resolves the dominant DOAs from the uplink data symbols by exploiting the spatial geometry, requiring no dedicated sensing pilot, and then calibrates the associated path gains using a minimal number of calibration pilots. Building on this pipeline, we develop two Newton-MUSIC algorithms tailored to different propagation environments. For line-of-sight (LOS)-dominant environments with uncorrelated sources, we propose SOC-Newton-MUSIC, which leverages second-order covariance (SOC) for low-complexity DOA sensing. For non-line-of-sight (NLOS) environments with coherent multipath, where the number of sources may exceed the number of activated RF chains, we propose FOC-Newton-MUSIC, which exploits fourth-order cumulants (FOC) to restore source identifiability and structurally expand the available spatial degrees of freedom (DOFs) through a continuous difference co-array. In both cases, by reformulating the spatial spectrum search as a continuous optimization problem, we replace exhaustive dense grid searches with parallelized Newton refinements.

LG Dec 14, 2023
Learning a Low-Rank Feature Representation: Achieving Better Trade-Off between Stability and Plasticity in Continual Learning

Zhenrong Liu, Yang Li, Yi Gong et al.

In continual learning, networks confront a trade-off between stability and plasticity when trained on a sequence of tasks. To bolster plasticity without sacrificing stability, we propose a novel training algorithm called LRFR. This approach optimizes network parameters in the null space of the past tasks' feature representation matrix to guarantee stability. Concurrently, we judiciously select only a subset of neurons in each layer of the network while training individual tasks, so that the past tasks' feature representation matrix is learned in low-rank form. This increases the null space dimension when designing network parameters for subsequent tasks, thereby enhancing plasticity. Using CIFAR-100 and TinyImageNet as benchmark datasets for continual learning, the proposed approach consistently outperforms state-of-the-art methods.
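The link between a low-rank feature matrix and a larger null space can be sketched for a single linear unit. This is a generic illustration of null-space projection, not the LRFR algorithm itself: the helper names, the feature vectors, and the update are hypothetical, and the neuron selection step is not modelled.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors already inside the current span
            basis.append([wi / norm for wi in w])
    return basis

def project_to_null_space(update, past_features):
    """Remove from `update` every component lying in the span of past features.

    Returns the projected update and the dimension of the null space.
    """
    basis = orthonormal_basis(past_features)
    projected = list(update)
    for b in basis:
        coeff = dot(projected, b)
        projected = [p - coeff * bi for p, bi in zip(projected, b)]
    return projected, len(update) - len(basis)

# Hypothetical past-task features: three vectors in R^4, but only rank 2.
past_features = [
    [1.0, 0.0, 1.0, 0.0],
    [2.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
update = [0.5, -1.0, 0.25, 2.0]  # hypothetical weight update for a new task

projected, null_dim = project_to_null_space(update, past_features)
print(null_dim)
print([round(dot(projected, f), 12) for f in past_features])
```

Because the projected update is orthogonal to every past feature vector, the unit's response to those features is unchanged (stability). The lower the rank of the past features, the more directions remain free for new tasks (plasticity).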

RO Dec 18, 2024
Energy-Efficient SLAM via Joint Design of Sensing, Communication, and Exploration Speed

Zidong Han, Ruibo Jin, Xiaoyang Li et al.

To support future spatial machine intelligence applications, lifelong simultaneous localization and mapping (SLAM) has drawn significant attention. SLAM is usually realized based on various types of mobile robots performing simultaneous and continuous sensing and communication. This paper focuses on analyzing the energy efficiency of robot operation for lifelong SLAM by jointly considering sensing, communication and mechanical factors. The system model is built based on a robot equipped with a 2D light detection and ranging (LiDAR) sensor and odometry. The raw point cloud data as well as the odometry data are wirelessly transmitted to a data center, where real-time map reconstruction is realized using an unsupervised deep-learning-based method. The sensing duration, transmit power, transmit duration and exploration speed are jointly optimized to minimize the energy consumption. Simulations and experiments demonstrate the performance of our proposed method.

AI Dec 1, 2020
A Multi-intersection Vehicular Cooperative Control based on End-Edge-Cloud Computing

Mingzhi Jiang, Tianhao Wu, Zhe Wang et al.

Cooperative Intelligent Transportation Systems (C-ITS) will change the modes of road safety and traffic management, especially at intersections without traffic lights, namely unsignalized intersections. Existing research focuses on vehicle control within a small area around an unsignalized intersection. In this paper, we expand the control domain to a large area with multiple intersections. In particular, we propose a Multi-intersection Vehicular Cooperative Control (MiVeCC) scheme to enable cooperation among vehicles in a large area with multiple unsignalized intersections. First, a vehicular end-edge-cloud computing framework is proposed to facilitate end-edge-cloud vertical cooperation and horizontal cooperation among vehicles. Then, the vehicular cooperative control problems in the cloud and edge layers are formulated as Markov Decision Processes (MDPs) and solved by two-stage reinforcement learning. Furthermore, to deal with high-density traffic, vehicle selection methods are proposed to reduce the state space and accelerate algorithm convergence without performance degradation. A multi-intersection simulation platform is developed to evaluate the proposed scheme. Simulation results show that the proposed MiVeCC can improve travel efficiency at multiple intersections by up to 4.59 times without collision compared with existing methods.

SP Aug 14, 2019
Monthly electricity consumption forecasting by the fruit fly optimization algorithm enhanced Holt-Winters smoothing method

Weiheng Jiang, Xiaogang Wu, Yi Gong et al.

Electricity consumption forecasting is a critical component of the intelligent power system. Accurate monthly electricity consumption forecasting, as one of the medium- and long-term electricity consumption forecasting problems, plays an important role in dispatching and management for electric power systems. Although there are many studies of this problem, a large sample data set is generally required to obtain higher prediction accuracy, and prediction performance becomes worse when only a little data is available. In practice, however, we mostly face the problem of insufficient sample data, and accurately forecasting monthly electricity consumption with limited sample data is a challenging task. The Holt-Winters exponential smoothing method is often used to forecast periodic series because of its low demand for training data and high forecasting accuracy. In this paper, based on the Holt-Winters exponential smoothing method, we propose a hybrid forecasting model named FOA-MHW. The main idea is to use the fruit fly optimization algorithm to select smoothing parameters for the Holt-Winters exponential smoothing method. Electricity consumption data of a city in China are used to comprehensively evaluate the forecasting performance of the proposed model. The results indicate that our model can significantly improve the accuracy of monthly electricity consumption forecasting even when only a small amount of training data is available.
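The idea of tuning Holt-Winters smoothing parameters with a search procedure can be sketched as follows. This is an illustrative sketch only: it uses the textbook additive Holt-Winters recursions and a plain random local search as a stand-in for the fruit fly optimization algorithm, which is not reproduced here. The synthetic monthly series, the function names, and all settings are hypothetical and do not come from the paper.

```python
import math
import random

def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters; returns (one-step-ahead SSE, list of forecasts)."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) / period - level) / period
    seasonal = [y - level for y in first]  # one seasonal term per time step
    sse = 0.0
    for t in range(period, len(series)):
        y, s_prev = series[t], seasonal[t - period]
        sse += (y - (level + trend + s_prev)) ** 2
        new_level = alpha * (y - s_prev) + (1 - alpha) * (level + trend)
        seasonal.append(gamma * (y - level - trend) + (1 - gamma) * s_prev)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    n = len(series)
    forecasts = [level + (h + 1) * trend + seasonal[n - period + h % period]
                 for h in range(horizon)]
    return sse, forecasts

def search_parameters(series, period, trials=300, step=0.1, seed=0):
    """Random local search over (alpha, beta, gamma); NOT the fruit fly algorithm."""
    rng = random.Random(seed)
    best = (0.5, 0.5, 0.5)
    best_sse, _ = holt_winters_additive(series, period, *best, horizon=0)
    for _ in range(trials):
        # Perturb the current best point and clip each parameter to [0, 1].
        cand = tuple(min(1.0, max(0.0, p + rng.uniform(-step, step))) for p in best)
        sse, _ = holt_winters_additive(series, period, *cand, horizon=0)
        if sse < best_sse:
            best, best_sse = cand, sse
    return best, best_sse

# Hypothetical 36-month series: linear trend plus a yearly cycle.
series = [100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(36)]

base_sse, _ = holt_winters_additive(series, 12, 0.5, 0.5, 0.5, horizon=0)
best_params, best_sse = search_parameters(series, 12)
_, forecasts = holt_winters_additive(series, 12, *best_params, horizon=12)
print(best_params, best_sse <= base_sse, len(forecasts))
```

The fitness here is the in-sample one-step-ahead squared error. A swarm-based optimizer such as the one named in the abstract would replace `search_parameters` while leaving the smoothing recursions unchanged.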

CR Nov 13, 2017
Multilayer Nonlinear Processing for Information Privacy in Sensor Networks

Xin He, Meng Sun, Wee Peng Tay et al.

A sensor network wishes to transmit information to a fusion center to allow it to detect a public hypothesis, but at the same time prevent it from inferring a private hypothesis. We propose a multilayer nonlinear processing procedure at each sensor to distort the sensor's data before it is sent to the fusion center. In our proposed framework, sensors are grouped into clusters, and each sensor first applies a nonlinear fusion function on the information it receives from sensors in the same cluster and in a previous layer. A linear weighting matrix is then used to distort the information it sends to sensors in the next layer. We adopt a nonparametric approach and develop a modified mirror descent algorithm to optimize the weighting matrices so as to ensure that the regularized empirical risk of detecting the private hypothesis is above a given privacy threshold, while minimizing the regularized empirical risk of detecting the public hypothesis. Experiments on empirical datasets demonstrate that our approach is able to achieve a good trade-off between the error rates of the public and private hypotheses.