Jiwei Zhao

ML
h-index: 32
Papers: 9
Citations: 79
Novelty: 58%
AI Score: 45

9 Papers

ME Nov 23, 2023
Assumption-Lean and Data-Adaptive Post-Prediction Inference

Jiacheng Miao, Xinran Miao, Yixuan Wu et al.

A primary challenge facing modern scientific research is the limited availability of gold-standard data, which can be costly, labor-intensive, or invasive to obtain. With the rapid development of machine learning (ML), scientists can now employ ML algorithms to predict gold-standard outcomes with variables that are easier to obtain. However, these predicted outcomes are often used directly in subsequent statistical analyses, ignoring imprecision and heterogeneity introduced by the prediction procedure. This will likely result in false positive findings and invalid scientific conclusions. In this work, we introduce PoSt-Prediction Adaptive inference (PSPA), which allows valid and powerful inference based on ML-predicted data. Its "assumption-lean" property guarantees reliable statistical inference without assumptions on the ML prediction. Its "data-adaptive" feature guarantees an efficiency gain over existing methods, regardless of the accuracy of ML prediction. We demonstrate the statistical superiority and broad applicability of our method through simulations and real-data applications.
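The idea of combining a small labeled sample with ML predictions on a larger unlabeled sample can be illustrated for the simplest target, a population mean. The sketch below is a generic control-variate construction with a variance-minimizing weight, not the PSPA estimator itself; the function name `adaptive_mean` and the weight formula `cov / (var_f * (1 + n / N))` are assumptions made for illustration, and the formula assumes the labeled and unlabeled samples are independent.

```python
from statistics import mean


def adaptive_mean(y_lab, f_lab, f_unlab):
    """Estimate E[Y] from labeled outcomes plus ML predictions.

    y_lab   : gold-standard outcomes on the labeled sample (size n >= 2)
    f_lab   : ML predictions on the same labeled sample
    f_unlab : ML predictions on the unlabeled sample (size N)
    Returns (estimate, omega), where omega is the data-driven weight.
    """
    n, N = len(y_lab), len(f_unlab)
    ybar, fbar = mean(y_lab), mean(f_lab)
    # Sample covariance of (Y, f) and sample variance of f on the labeled data.
    cov = sum((y - ybar) * (f - fbar) for y, f in zip(y_lab, f_lab)) / (n - 1)
    var_f = sum((f - fbar) ** 2 for f in f_lab) / (n - 1)
    # Weight that minimizes the variance of ybar + omega * (mean(f_unlab) - fbar).
    # If the predictions carry no variation, fall back to the labeled-only mean.
    omega = 0.0 if var_f == 0 else cov / (var_f * (1 + n / N))
    return ybar + omega * (mean(f_unlab) - fbar), omega
```

With `omega = 0` the estimate reduces to the labeled-only mean, so uninformative predictions do not change the answer in this sketch.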

IT Nov 30, 2023
Channel-Feedback-Free Transmission for Downlink FD-RAN: A Radio Map based Complex-valued Precoding Network Approach

Jiwei Zhao, Jiacheng Chen, Zeyu Sun et al.

As the demand for high-quality services proliferates, an innovative network architecture, the fully-decoupled RAN (FD-RAN), has emerged for more flexible spectrum resource utilization and lower network costs. However, with the decoupling of uplink base stations and downlink base stations in FD-RAN, the traditional transmission mechanism, which relies on real-time channel feedback, is not suitable as the receiver is not able to feed back accurate and timely channel state information to the transmitter. This paper proposes a novel transmission scheme that does not rely on physical layer channel feedback. Specifically, we design a radio map based complex-valued precoding network (RMCPNet) model, which outputs the base station precoding based on user location. RMCPNet comprises multiple subnets, with each subnet responsible for extracting unique modal features from diverse input modalities. Furthermore, the multi-modal embeddings derived from these distinct subnets are integrated within the information fusion layer, culminating in a unified representation. We also develop a specific RMCPNet training algorithm that employs the negative spectral efficiency as the loss function. We evaluate the performance of the proposed scheme on the public DeepMIMO dataset and show that RMCPNet can achieve 16% and 76% performance improvements over the conventional real-valued neural network and statistical codebook approach, respectively.
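To make the "negative spectral efficiency as the loss function" idea concrete, here is a minimal sketch for a single user with one receive antenna. It is not the RMCPNet training code; the function names, the single-user setting, and the default `noise_power` are assumptions chosen for illustration.

```python
import math


def negative_spectral_efficiency(h, w, noise_power=1.0):
    """Loss = -log2(1 + |h^H w|^2 / noise_power) for complex vectors h and w."""
    # Effective channel gain h^H w: conjugate each channel entry.
    gain = sum(hk.conjugate() * wk for hk, wk in zip(h, w))
    snr = abs(gain) ** 2 / noise_power
    return -math.log2(1.0 + snr)


def matched_filter(h):
    """Unit-norm precoder aligned with the channel (a simple reference point)."""
    norm = math.sqrt(sum(abs(hk) ** 2 for hk in h))
    return [hk / norm for hk in h]


h = [1 + 0j, 1j]            # hypothetical two-antenna channel
w = matched_filter(h)       # precoder aligned with the channel
loss = negative_spectral_efficiency(h, w)
```

Minimizing this loss over the precoder is the same as maximizing spectral efficiency, which is why a precoder orthogonal to the channel gives a loss of zero (no rate) while an aligned one gives a strictly negative loss.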

ML Jan 29, 2024
ReTaSA: A Nonparametric Functional Estimation Approach for Addressing Continuous Target Shift

Hwanwoo Kim, Xin Zhang, Jiwei Zhao et al.

The presence of distribution shifts poses a significant challenge for deploying modern machine learning models in real-world applications. This work focuses on the target shift problem in a regression setting (Zhang et al., 2013; Nguyen et al., 2016). More specifically, the target variable y (also known as the response variable), which is continuous, has different marginal distributions in the training source and testing domain, while the conditional distribution of features x given y remains the same. While most literature focuses on classification tasks with finite target space, the regression problem has an infinite dimensional target space, which makes many of the existing methods inapplicable. In this work, we show that the continuous target shift problem can be addressed by estimating the importance weight function from an ill-posed integral equation. We propose a nonparametric regularized approach named ReTaSA to solve the ill-posed integral equation and provide theoretical justification for the estimated importance weight function. The effectiveness of the proposed method has been demonstrated with extensive numerical studies on synthetic and real-world datasets.

ML Sep 26, 2025
SADA: Safe and Adaptive Inference with Multiple Black-Box Predictions

Jiawei Shan, Yiming Dong, Jiwei Zhao

Real-world applications often face scarce labeled data due to the high cost and time requirements of gold-standard experiments, whereas unlabeled data are typically abundant. With the growing adoption of machine learning techniques, it has become increasingly feasible to generate multiple predicted labels using a variety of models and algorithms, including deep learning, large language models, and generative AI. In this paper, we propose a novel approach that safely and adaptively aggregates multiple black-box predictions with unknown quality while preserving valid statistical inference. Our method provides two key guarantees: (i) it never performs worse than using the labeled data alone, regardless of the quality of the predictions; and (ii) if any one of the predictions (without knowing which one) perfectly fits the ground truth, the algorithm adaptively exploits this to achieve either a faster convergence rate or the semiparametric efficiency bound. We demonstrate the effectiveness of the proposed algorithm through experiments on both synthetic and benchmark datasets.

ML Sep 24, 2025
Unsupervised Domain Adaptation with an Unobservable Source Subpopulation

Chao Ying, Jun Jin, Haotian Zhang et al.

We study an unsupervised domain adaptation problem where the source domain consists of subpopulations defined by the binary label $Y$ and a binary background (or environment) $A$. We focus on a challenging setting in which one such subpopulation in the source domain is unobservable. Naively ignoring this unobserved group can result in biased estimates and degraded predictive performance. Despite this structured missingness, we show that the prediction in the target domain can still be recovered. Specifically, we rigorously derive both background-specific and overall prediction models for the target domain. For practical implementation, we propose the distribution matching method to estimate the subpopulation proportions. We provide theoretical guarantees for the asymptotic behavior of our estimator, and establish an upper bound on the prediction error. Experiments on both synthetic and real-world datasets show that our method outperforms the naive benchmark that does not account for this unobservable source subpopulation.

MED-PH Jun 12, 2025
Modality-AGnostic Image Cascade (MAGIC) for Multi-Modality Cardiac Substructure Segmentation

Nicholas Summerfield, Qisheng He, Alex Kuo et al.

Cardiac substructure delineation is emerging in treatment planning to minimize the risk of radiation-induced heart disease. Deep learning offers efficient methods to reduce contouring burden but currently lacks generalizability across different modalities and overlapping structures. This work introduces and validates a Modality-AGnostic Image Cascade (MAGIC) deep-learning pipeline for comprehensive and multi-modal cardiac substructure segmentation. MAGIC is implemented through replicated encoding and decoding branches of an nnU-Net backbone to handle multi-modality inputs and overlapping labels. First benchmarked on the multi-modality whole-heart segmentation (MMWHS) dataset including cardiac CT-angiography (CCTA) and MR modalities, twenty cardiac substructures (heart, chambers, great vessels (GVs), valves, coronary arteries (CAs), and conduction nodes) from clinical simulation CT (Sim-CT), low-field MR-Linac, and cardiac CT-angiography (CCTA) modalities were delineated to train semi-supervised (n=151), validate (n=15), and test (n=30) MAGIC. For comparison, fourteen single-modality comparison models (two MMWHS modalities and four subgroups across three clinical modalities) were trained. Methods were evaluated for efficiency and against reference contours through the Dice similarity coefficient (DSC) and two-tailed Wilcoxon Signed-Rank test (p<0.05). Average MMWHS DSC scores across CCTA and MR inputs were 0.88(0.08) and 0.87(0.04) respectively with significant improvement over unimodal baselines. Average 20-structure DSC scores were 0.75(0.16) for Sim-CT, 0.68(0.21) for MR-Linac, and 0.80(0.16) for CCTA. Furthermore, >80% and >70% reductions in training time and parameters were achieved, respectively. MAGIC offers an efficient, lightweight solution capable of segmenting multiple image modalities and overlapping structures in a single model without compromising segmentation accuracy.
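For reference, the Dice similarity coefficient used in the evaluation above is DSC = 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The sketch below computes it on flattened binary masks; the function name `dice` and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the paper.

```python
def dice(pred, ref):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled foreground in both masks.
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Convention (assumed here): two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


pred_mask = [1, 1, 0, 0]
ref_mask = [1, 0, 0, 0]
score = dice(pred_mask, ref_mask)  # 2 * 1 / (2 + 1)
```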

ML May 30, 2023
ELSA: Efficient Label Shift Adaptation through the Lens of Semiparametric Models

Qinglong Tian, Xin Zhang, Jiwei Zhao

We study the domain adaptation problem with label shift in this work. Under the label shift context, the marginal distribution of the label varies across the training and testing datasets, while the conditional distribution of features given the label is the same. Traditional label shift adaptation methods either suffer from large estimation errors or require cumbersome post-prediction calibrations. To address these issues, we first propose a moment-matching framework for adapting the label shift based on the geometry of the influence function. Under such a framework, we propose a novel method named Efficient Label Shift Adaptation (ELSA), in which the adaptation weights can be estimated by solving linear systems. Theoretically, the ELSA estimator is $\sqrt{n}$-consistent ($n$ is the sample size of the source data) and asymptotically normal. Empirically, we show that ELSA can achieve state-of-the-art estimation performances without post-prediction calibrations, thus gaining computational efficiency.
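To show what "estimating adaptation weights by solving linear systems" can look like, the sketch below uses the well-known confusion-matrix moment-matching idea for two classes. This is a generic illustration and not the ELSA estimator; the function name `label_shift_weights` and all numbers are hypothetical. Under label shift, the joint source matrix C[i][j] = P_source(prediction = i, label = j) and the target prediction frequencies q[i] = P_target(prediction = i) satisfy C w = q, where w[j] = P_target(label = j) / P_source(label = j).

```python
def label_shift_weights(C, q):
    """Solve the 2x2 system C w = q for the class importance weights w."""
    (a, b), (c, d) = C
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("confusion matrix is (near) singular")
    # Cramer's rule for a 2x2 linear system.
    w0 = (q[0] * d - b * q[1]) / det
    w1 = (a * q[1] - c * q[0]) / det
    return [w0, w1]


# Hypothetical source joint distribution of (prediction, label).
C = [[0.45, 0.10],
     [0.05, 0.40]]
# Hypothetical prediction frequencies observed on unlabeled target data.
q = [0.34, 0.66]
weights = label_shift_weights(C, q)
```

In this example the source labels are balanced (each column of C sums to 0.5), so the recovered weights correspond to target label proportions of 0.2 and 0.8.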

ME Nov 28, 2020
Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning

Siyi Deng, Yang Ning, Jiwei Zhao et al.

We consider the estimation problem in high-dimensional semi-supervised learning. Our goal is to investigate when and how the unlabeled data can be exploited to improve the estimation of the regression parameters of a linear model, in light of the fact that such linear models may be misspecified in data analysis. We first establish the minimax lower bound for parameter estimation in the semi-supervised setting, and show that this lower bound cannot be achieved by supervised estimators using the labeled data only. We propose an optimal semi-supervised estimator that can attain this lower bound and therefore improves the supervised estimators, provided that the conditional mean function can be consistently estimated with a proper rate. We further propose a safe semi-supervised estimator. We view it as safe because this estimator is always at least as good as the supervised estimators. We also extend our idea to the aggregation of multiple semi-supervised estimators caused by different misspecifications of the conditional mean function. Extensive numerical simulations and a real data analysis are conducted to illustrate our theoretical results.

ST May 26, 2019
Nonregular and Minimax Estimation of Individualized Thresholds in High Dimension with Binary Responses

Huijie Feng, Yang Ning, Jiwei Zhao

Given a large number of covariates $Z$, we consider the estimation of a high-dimensional parameter $θ$ in an individualized linear threshold $θ^T Z$ for a continuous variable $X$, which minimizes the disagreement between $\text{sign}(X-θ^TZ)$ and a binary response $Y$. While the problem can be formulated into the M-estimation framework, minimizing the corresponding empirical risk function is computationally intractable due to discontinuity of the sign function. Moreover, estimating $θ$ even in the fixed-dimensional setting is known as a nonregular problem leading to nonstandard asymptotic theory. To tackle the computational and theoretical challenges in the estimation of the high-dimensional parameter $θ$, we propose an empirical risk minimization approach based on a regularized smoothed loss function. The statistical and computational trade-off of the algorithm is investigated. Statistically, we show that the finite sample error bound for estimating $θ$ in $\ell_2$ norm is $(s\log d/n)^{β/(2β+1)}$, where $d$ is the dimension of $θ$, $s$ is the sparsity level, $n$ is the sample size and $β$ is the smoothness of the conditional density of $X$ given the response $Y$ and the covariates $Z$. The convergence rate is nonstandard and slower than that in the classical Lasso problems. Furthermore, we prove that the resulting estimator is minimax rate optimal up to a logarithmic factor. Lepski's method is developed to achieve adaptation to the unknown sparsity $s$ and smoothness $β$. Computationally, an efficient path-following algorithm is proposed to compute the solution path. We show that this algorithm achieves a geometric rate of convergence for computing the whole path. Finally, we evaluate the finite sample performance of the proposed estimator in simulation studies and a real data analysis.
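The phrase "regularized smoothed loss" can be illustrated with a small sketch. The disagreement indicator 1{Y (X - θ^T Z) < 0} is discontinuous in θ, so one common workaround is to replace it with a smooth surrogate controlled by a bandwidth. The code below uses a logistic smoother and an L1 penalty; both choices, along with the names `smoothed_risk`, `delta`, and `lam`, are assumptions for illustration and are not taken from the paper.

```python
import math


def _sigmoid(u):
    # Numerically stable logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def smoothed_risk(theta, X, Z, Y, delta, lam):
    """Smoothed disagreement rate plus an L1 penalty.

    theta : candidate threshold coefficients
    X     : continuous variable, one value per observation
    Z     : covariate vectors, one list per observation
    Y     : binary responses coded as -1 or +1
    delta : smoothing bandwidth (smaller means closer to the 0-1 loss)
    lam   : L1 regularization strength
    """
    total = 0.0
    for x, z, y in zip(X, Z, Y):
        margin = y * (x - sum(t * zj for t, zj in zip(theta, z)))
        # Smooth stand-in for the indicator 1{margin < 0}.
        total += _sigmoid(-margin / delta)
    return total / len(X) + lam * sum(abs(t) for t in theta)


X = [2.0, 0.0]
Z = [[1.0], [1.0]]
agree = smoothed_risk([1.0], X, Z, [1, -1], delta=0.01, lam=0.0)
disagree = smoothed_risk([1.0], X, Z, [-1, 1], delta=0.01, lam=0.0)
```

As `delta` shrinks, the smoothed risk approaches the raw disagreement rate, while larger values give a smoother objective that is easier to optimize.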