h-index8
9papers
40citations
Novelty35%
AI Score40

9 Papers

NAMay 6, 2018
On the Ideal Interpolation Operator in Algebraic Multigrid Methods

Xuefeng Xu, Chen-Song Zhang

Various algebraic multigrid algorithms have been developed for solving problems in scientific and engineering computation over the past decades. They have been shown to be well-suited for solving discretized partial differential equations on unstructured girds in practice. One key ingredient of algebraic multigrid algorithms is a strategy for constructing an effective prolongation operator. Among many questions on constructing a prolongation, an important question is how to evaluate its quality. In this paper, we establish new characterizations (including sufficient condition, necessary condition, and equivalent condition) of the so-called ideal interpolation operator. Our result suggests that, compared with common wisdom, one has more room to construct an ideal interpolation, which can provide new insights for designing algebraic multigrid algorithms. Moreover, we derive a new expression for a class of ideal interpolation operators.

NAApr 19, 2017
Generalization of the Sherman-Morrison-Woodbury formula involving the Schur complement

Xuefeng Xu

Let $X\in\mathbb{C}^{m\times m}$ and $Y\in\mathbb{C}^{n\times n}$ be nonsingular matrices, and let $N\in\mathbb{C}^{m\times n}$. Explicit expressions for the Moore-Penrose inverses of $M=XNY$ and a two-by-two block matrix, under appropriate conditions, have been established by Castro-González et al. [Linear Algebra Appl. 471 (2015) 353-368]. Based on these results, we derive a novel expression for the Moore-Penrose inverse of $A+UV^{\ast}$ under suitable conditions, where $A\in \mathbb{C}^{m\times n}$, $U\in \mathbb{C}^{m\times r}$, and $V\in \mathbb{C}^{n\times r}$. In particular, if both $A$ and $I+V^{\ast}A^{-1}U$ are nonsingular matrices, our expression reduces to the celebrated Sherman-Morrison-Woodbury formula. Moreover, we extend our results to the bounded linear operators case.

NAOct 15, 2016
An improved upper bound for the number of distinct eigenvalues of a matrix after perturbation

Xuefeng Xu

An upper bound for the number of distinct eigenvalues of a perturbed matrix has been recently established by P. E. Farrell [1, Theorem 1.3]. The estimate is the central result in Farrell's work and can be applied to estimate the number of Krylov iterations required for solving a perturbed linear system. In this paper, we present an improved upper bound for the number of distinct eigenvalues of a matrix after perturbation. Furthermore, some results based on the improved estimate are presented.

LGFeb 11
FedPS: Federated data Preprocessing via aggregated Statistics

Xuefeng Xu, Graham Cormode

Federated Learning (FL) enables multiple parties to collaboratively train machine learning models without sharing raw data. However, before training, data must be preprocessed to address missing values, inconsistent formats, and heterogeneous feature scales. This preprocessing stage is critical for model performance but is largely overlooked in FL research. In practical FL systems, privacy constraints prohibit centralizing raw data, while communication efficiency introduces further challenges for distributed preprocessing. We introduce FedPS, a unified framework for federated data preprocessing based on aggregated statistics. FedPS leverages data-sketching techniques to efficiently summarize local datasets while preserving essential statistical information. Building on these summaries, we design federated algorithms for feature scaling, encoding, discretization, and missing-value imputation, and extend preprocessing-related models such as k-Means, k-Nearest Neighbors, and Bayesian Linear Regression to both horizontal and vertical FL settings. FedPS provides flexible, communication-efficient, and consistent preprocessing pipelines for practical FL deployments.

NADec 4, 2016
A new expression for the Moore-Penrose inverse of a class of matrices

Xuefeng Xu

An expression for the Moore-Penrose inverse of a matrix of the form M = XNY , where X and Y are nonsingular, has been recently established by Castro-González et al. [1, Theorem 2.2]. The expression plays an essential role in developing explicit expressions for the Moore-Penrose inverse of a two-by-two block matrix. In this paper, we present a new expression for the Moore-Penrose inverse of this class of matrices, which improves the result in [1].

LGOct 6, 2025
Power Transform Revisited: Numerically Stable, and Federated

Xuefeng Xu, Graham Cormode

Power transforms are popular parametric techniques for making data more Gaussian-like, and are widely used as preprocessing steps in statistical analysis and machine learning. However, we find that direct implementations of power transforms suffer from severe numerical instabilities, which can lead to incorrect results or even crashes. In this paper, we provide a comprehensive analysis of the sources of these instabilities and propose effective remedies. We further extend power transforms to the federated learning setting, addressing both numerical and distributional challenges that arise in this context. Experiments on real-world datasets demonstrate that our methods are both effective and robust, substantially improving stability compared to existing approaches.

LGOct 6, 2025
Federated Computation of ROC and PR Curves

Xuefeng Xu, Graham Cormode

Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves are fundamental tools for evaluating machine learning classifiers, offering detailed insights into the trade-offs between true positive rate vs. false positive rate (ROC) or precision vs. recall (PR). However, in Federated Learning (FL) scenarios, where data is distributed across multiple clients, computing these curves is challenging due to privacy and communication constraints. Specifically, the server cannot access raw prediction scores and class labels, which are used to compute the ROC and PR curves in a centralized setting. In this paper, we propose a novel method for approximating ROC and PR curves in a federated setting by estimating quantiles of the prediction score distribution under distributed differential privacy. We provide theoretical bounds on the Area Error (AE) between the true and estimated curves, demonstrating the trade-offs between approximation accuracy, privacy, and communication cost. Empirical results on real-world datasets demonstrate that our method achieves high approximation accuracy with minimal communication and strong privacy guarantees, making it practical for privacy-preserving model evaluation in federated systems.

NAJul 2, 2017
New perturbation bounds for the spectrum of a normal matrix

Xuefeng Xu, Chen-Song Zhang

Let $A\in\mathbb{C}^{n\times n}$ and $\widetilde{A}\in\mathbb{C}^{n\times n}$ be two normal matrices with spectra $\{λ_{i}\}_{i=1}^{n}$ and $\{\widetildeλ_{i}\}_{i=1}^{n}$, respectively. The celebrated Hoffman--Wielandt theorem states that there exists a permutation $π$ of $\{1,\ldots,n\}$ such that $\left(\sum_{i=1}^{n}\big|\widetildeλ_{π(i)}-λ_{i}\big|^{2}\right)^{1\over 2}$ is no larger than the Frobenius norm of $\widetilde{A}-A$. However, if either $A$ or $\widetilde{A}$ is non-normal, this result does not hold in general. In this paper, we present several novel upper bounds for $\left(\sum_{i=1}^{n}\big|\widetildeλ_{π(i)}-λ_{i}\big|^{2}\right)^{1\over 2}$, provided that $A$ is normal and $\widetilde{A}$ is arbitrary. Some of these estimates involving the "departure from normality" of $\widetilde{A}$ have generalized the Hoffman--Wielandt theorem. Furthermore, we give new perturbation bounds for the spectrum of a Hermitian matrix.