Ye Xue

LG
h-index5
12papers
56citations
Novelty57%
AI Score48

12 Papers

CVJul 1, 2023
S-Omninet: Structured Data Enhanced Universal Multimodal Learning Architecture

Ye Xue, Diego Klabjan, Jean Utke

Multimodal multitask learning has attracted an increasing interest in recent years. Singlemodal models have been advancing rapidly and have achieved astonishing results on various tasks across multiple domains. Multimodal learning offers opportunities for further improvements by integrating data from multiple modalities. Many methods are proposed to learn on a specific type of multimodal data, such as vision and language data. A few of them are designed to handle several modalities and tasks at a time. In this work, we extend and improve Omninet, an architecture that is capable of handling multiple modalities and tasks at a time, by introducing cross-cache attention, integrating patch embeddings for vision inputs, and supporting structured data. The proposed Structured-data-enhanced Omninet (S-Omninet) is a universal model that is capable of learning from structured data of various dimensions effectively with unstructured data through cross-cache attention, which enables interactions among spatial, temporal, and structured features. We also enhance spatial representations in a spatial cache with patch embeddings. We evaluate the proposed model on several multimodal datasets and demonstrate a significant improvement over the baseline, Omninet.

SPJun 4, 2023
Riemannian Low-Rank Model Compression for Federated Learning with Over-the-Air Aggregation

Ye Xue, Vincent Lau

Low-rank model compression is a widely used technique for reducing the computational load when training machine learning models. However, existing methods often rely on relaxing the low-rank constraint of the model weights using a regularized nuclear norm penalty, which requires an appropriate hyperparameter that can be difficult to determine in practice. Furthermore, existing compression techniques are not directly applicable to efficient over-the-air (OTA) aggregation in federated learning (FL) systems for distributed Internet-of-Things (IoT) scenarios. In this paper, we propose a novel manifold optimization formulation for low-rank model compression in FL that does not relax the low-rank constraint. Our optimization is conducted directly over the low-rank manifold, guaranteeing that the model is exactly low-rank. We also introduce a consensus penalty in the optimization formulation to support OTA aggregation. Based on our optimization formulation, we propose an alternating Riemannian optimization algorithm with a precoder that enables efficient OTA aggregation of low-rank local models without sacrificing training performance. Additionally, we provide convergence analysis in terms of key system parameters and conduct extensive experiments with real-world datasets to demonstrate the effectiveness of our proposed Riemannian low-rank model compression scheme compared to various state-of-the-art baselines.

NIMar 24
Digital Twin Enabled Simultaneous Learning and Modeling for UAV-assisted Secure Communications with Eavesdropping Attacks

Jieting Yuan, Songhan Zhao, Ye Xue et al.

This paper focuses on secure communications in UAV-assisted wireless networks, which comprise multiple legitimate UAVs (LE-UAVs) and an intelligent eavesdropping UAV (EA-UAV). The intelligent EA-UAV can observe the LE-UAVs'transmission strategies and adaptively adjust its trajectory to maximize information interception. To counter this threat, we propose a mode-switching scheme that enables LE-UAVs to dynamically switch between the data transmission and jamming modes, thereby balancing data collection efficiency and communication security. However, acquiring full global network state information for LE-UAVs' decision-making incurs significant overhead, as the network state is highly dynamic and time-varying. To address this challenge, we propose a digital twin-enabled simultaneous learning and modeling (DT-SLAM) framework that allows LE-UAVs to learn policies efficiently within the DT, thereby avoiding frequent interactions with the real environment. To capture the competitive relationship between the EA-UAV and the LE-UAVs, we model their interactions as a multi-stage Stackelberg game and jointly optimize the GUs' transmission control, UAVs' trajectory planning, mode selection, and network formation to maximize overall secure throughput. Considering potential model mismatch between the DT and the real environment, we propose a robust proximal policy optimization (RPPO) algorithm that encourages LE-UAVs to explore service regions with higher uncertainty. Numerical results demonstrate that the proposed DT-SLAM framework effectively supports the learning process. Meanwhile, the RPPO algorithm converges about 12% faster and the secure throughput can be increased by 8.6% compared to benchmark methods.

LGAug 8, 2024
Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit

Duanyi Yao, Songze Li, Ye Xue et al.

Vertical federated learning (VFL), where each participating client holds a subset of data features, has found numerous applications in finance, healthcare, and IoT systems. However, adversarial attacks, particularly through the injection of adversarial examples (AEs), pose serious challenges to the security of VFL models. In this paper, we investigate such vulnerabilities through developing a novel attack to disrupt the VFL inference process, under a practical scenario where the adversary is able to adaptively corrupt a subset of clients. We formulate the problem of finding optimal attack strategies as an online optimization problem, which is decomposed into an inner problem of adversarial example generation (AEG) and an outer problem of corruption pattern selection (CPS). Specifically, we establish the equivalence between the formulated CPS problem and a multi-armed bandit (MAB) problem, and propose the Thompson sampling with Empirical maximum reward (E-TS) algorithm for the adversary to efficiently identify the optimal subset of clients for corruption. The key idea of E-TS is to introduce an estimation of the expected maximum reward for each arm, which helps to specify a small set of competitive arms, on which the exploration for the optimal arm is performed. This significantly reduces the exploration space, which otherwise can quickly become prohibitively large as the number of clients increases. We analytically characterize the regret bound of E-TS, and empirically demonstrate its capability of efficiently revealing the optimal corruption pattern with the highest attack success rate, under various datasets of popular VFL tasks.

LGNov 14, 2024Code
Beyond the Heatmap: A Rigorous Evaluation of Component Impact in MCTS-Based TSP Solvers

Xuanhao Pan, Chenguang Wang, Chaolong Ying et al.

The ``Heatmap + Monte Carlo Tree Search (MCTS)'' paradigm has recently emerged as a prominent framework for solving the Travelling Salesman Problem (TSP). While considerable effort has been devoted to enhancing heatmap sophistication through advanced learning models, this paper rigorously examines whether this emphasis is justified, critically assessing the relative impact of heatmap complexity versus MCTS configuration. Our extensive empirical analysis across diverse TSP scales, distributions, and benchmarks reveals two pivotal insights: 1) The configuration of MCTS strategies significantly influences solution quality, underscoring the importance of meticulous tuning to achieve optimal results and enabling valid comparisons among different heatmap methodologies. 2) A rudimentary, parameter-free heatmap based on the intrinsic $k$-nearest neighbor structure of TSP instances, when coupled with an optimally tuned MCTS, can match or surpass the performance of more sophisticated, learned heatmaps, demonstrating robust generalizability on problem scale and distribution shift. To facilitate rigorous and fair evaluations in future research, we introduce a streamlined pipeline for standardized MCTS hyperparameter tuning. Collectively, these findings challenge the prevalent assumption that heatmap complexity is the primary determinant of performance, advocating instead for a balanced integration and comprehensive evaluation of both learning and search components within this paradigm. Our code is available at: https://github.com/LOGO-CUHKSZ/rethink_mcts_tsp.

LGSep 17, 2025
RF-LSCM: Pushing Radiance Fields to Multi-Domain Localized Statistical Channel Modeling for Cellular Network Optimization

Bingsheng Peng, Shutao Zhang, Xi Zheng et al.

Accurate localized wireless channel modeling is a cornerstone of cellular network optimization, enabling reliable prediction of network performance during parameter tuning. Localized statistical channel modeling (LSCM) is the state-of-the-art channel modeling framework tailored for cellular network optimization. However, traditional LSCM methods, which infer the channel's Angular Power Spectrum (APS) from Reference Signal Received Power (RSRP) measurements, suffer from critical limitations: they are typically confined to single-cell, single-grid and single-carrier frequency analysis and fail to capture complex cross-domain interactions. To overcome these challenges, we propose RF-LSCM, a novel framework that models the channel APS by jointly representing large-scale signal attenuation and multipath components within a radiance field. RF-LSCM introduces a multi-domain LSCM formulation with a physics-informed frequency-dependent Attenuation Model (FDAM) to facilitate the cross frequency generalization as well as a point-cloud-aided environment enhanced method to enable multi-cell and multi-grid channel modeling. Furthermore, to address the computational inefficiency of typical neural radiance fields, RF-LSCM leverages a low-rank tensor representation, complemented by a novel Hierarchical Tensor Angular Modeling (HiTAM) algorithm. This efficient design significantly reduces GPU memory requirements and training time while preserving fine-grained accuracy. Extensive experiments on real-world multi-cell datasets demonstrate that RF-LSCM significantly outperforms state-of-the-art methods, achieving up to a 30% reduction in mean absolute error (MAE) for coverage prediction and a 22% MAE improvement by effectively fusing multi-frequency data.

SPSep 16, 2025
A Measurement Report Data-Driven Framework for Localized Statistical Channel Modeling

Xinyu Qin, Ye Xue, Qi Yan et al.

Localized statistical channel modeling (LSCM) is crucial for effective performance evaluation in digital twin-assisted network optimization. Solely relying on the multi-beam reference signal receiving power (RSRP), LSCM aims to model the localized statistical propagation environment by estimating the channel angular power spectrum (APS). However, existing methods rely heavily on drive test data with high collection costs and limited spatial coverage. In this paper, we propose a measurement report (MR) data-driven framework for LSCM, exploiting the low-cost and extensive collection of MR data. The framework comprises two novel modules. The MR localization module addresses the issue of missing locations in MR data by introducing a semi-supervised method based on hypergraph neural networks, which exploits multi-modal information via distance-aware hypergraph modeling and hypergraph convolution for location extraction. To enhance the computational efficiency and solution robustness, LSCM operates at the grid level. Compared to independently constructing geographically uniform grids and estimating channel APS, the joint grid construction and channel APS estimation module enhances robustness in complex environments with spatially non-uniform data by exploiting their correlation. This module alternately optimizes grid partitioning and APS estimation using clustering and improved sparse recovery for the ill-conditioned measurement matrix and incomplete observations. Through comprehensive experiments on a real-world MR dataset, we demonstrate the superior performance and robustness of our framework in localization and channel modeling.

LGAug 17, 2021
Aggregation Delayed Federated Learning

Ye Xue, Diego Klabjan, Yuan Luo

Federated learning is a distributed machine learning paradigm where multiple data owners (clients) collaboratively train one machine learning model while keeping data on their own devices. The heterogeneity of client datasets is one of the most important challenges of federated learning algorithms. Studies have found performance reduction with standard federated algorithms, such as FedAvg, on non-IID data. Many existing works on handling non-IID data adopt the same aggregation framework as FedAvg and focus on improving model updates either on the server side or on clients. In this work, we tackle this challenge in a different view by introducing redistribution rounds that delay the aggregation. We perform experiments on multiple tasks and show that the proposed framework significantly improves the performance on non-IID data.

LGApr 21, 2021
Efficient Sparse Coding using Hierarchical Riemannian Pursuit

Ye Xue, Vincent Lau, Songfu Cai

Sparse coding is a class of unsupervised methods for learning a sparse representation of the input data in the form of a linear combination of a dictionary and a sparse code. This learning framework has led to state-of-the-art results in various image and video processing tasks. However, classical methods learn the dictionary and the sparse code based on alternating optimizations, usually without theoretical guarantees for either optimality or convergence due to non-convexity of the problem. Recent works on sparse coding with a complete dictionary provide strong theoretical guarantees thanks to the development of the non-convex optimization. However, initial non-convex approaches learn the dictionary in the sparse coding problem sequentially in an atom-by-atom manner, which leads to a long execution time. More recent works seek to directly learn the entire dictionary at once, which substantially reduces the execution time. However, the associated recovery performance is degraded with a finite number of data samples. In this paper, we propose an efficient sparse coding scheme with a two-stage optimization. The proposed scheme leverages the global and local Riemannian geometry of the two-stage optimization problem and facilitates fast implementation for superb dictionary recovery performance by a finite number of samples without atom-by-atom calculation. We further prove that, with high probability, the proposed scheme can exactly recover any atom in the target dictionary with a finite number of samples if it is adopted to recover one atom of the dictionary. An application on wireless sensor data compression is also proposed. Experiments on both synthetic and real-world data verify the efficiency and effectiveness of the proposed scheme.

LGMar 2, 2021
Online Orthogonal Dictionary Learning Based on Frank-Wolfe Method

Ye Xue, Vincent Lau

Dictionary learning is a widely used unsupervised learning method in signal processing and machine learning. Most existing works of dictionary learning are in an offline manner. There are mainly two offline ways for dictionary learning. One is to do an alternative optimization of both the dictionary and the sparse code; the other way is to optimize the dictionary by restricting it over the orthogonal group. The latter one is called orthogonal dictionary learning which has a lower complexity implementation, hence, it is more favorable for lowcost devices. However, existing schemes on orthogonal dictionary learning only work with batch data and can not be implemented online, which is not applicable for real-time applications. This paper proposes a novel online orthogonal dictionary scheme to dynamically learn the dictionary from streaming data without storing the historical data. The proposed scheme includes a novel problem formulation and an efficient online algorithm design with convergence analysis. In the problem formulation, we relax the orthogonal constraint to enable an efficient online algorithm. In the algorithm design, we propose a new Frank-Wolfe-based online algorithm with a convergence rate of O(ln t/t^(1/4)). The convergence rate in terms of key system parameters is also derived. Experiments with synthetic data and real-world sensor readings demonstrate the effectiveness and efficiency of the proposed online orthogonal dictionary learning scheme.

LGFeb 24, 2020
Complete Dictionary Learning via $\ell_p$-norm Maximization

Yifei Shen, Ye Xue, Jun Zhang et al.

Dictionary learning is a classic representation learning method that has been widely applied in signal processing and data analytics. In this paper, we investigate a family of $\ell_p$-norm ($p>2,p \in \mathbb{N}$) maximization approaches for the complete dictionary learning problem from theoretical and algorithmic aspects. Specifically, we prove that the global maximizers of these formulations are very close to the true dictionary with high probability, even when Gaussian noise is present. Based on the generalized power method (GPM), an efficient algorithm is then developed for the $\ell_p$-based formulations. We further show the efficacy of the developed algorithm: for the population GPM algorithm over the sphere constraint, it first quickly enters the neighborhood of a global maximizer, and then converges linearly in this region. Extensive experiments will demonstrate that the $\ell_p$-based approaches enjoy a higher computational efficiency and better robustness than conventional approaches and $p=3$ performs the best.

LGAug 12, 2019
Mixture-based Multiple Imputation Model for Clinical Data with a Temporal Dimension

Ye Xue, Diego Klabjan, Yuan Luo

The problem of missing values in multivariable time series is a key challenge in many applications such as clinical data mining. Although many imputation methods show their effectiveness in many applications, few of them are designed to accommodate clinical multivariable time series. In this work, we propose a multiple imputation model that capture both cross-sectional information and temporal correlations. We integrate Gaussian processes with mixture models and introduce individualized mixing weights to handle the variance of predictive confidence of Gaussian process models. The proposed model is compared with several state-of-the-art imputation algorithms on both real-world and synthetic datasets. Experiments show that our best model can provide more accurate imputation than the benchmarks on all of our datasets.