ML LG Sep 3, 2024

Deep non-parametric logistic model with case-control data and external summary information

arXiv:2409.01829v1
h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses estimation from imbalanced binary data, offering an incremental advance by combining case-control sampling with external summary information in a deep learning framework.

The authors estimate a non-parametric logistic model from imbalanced case-control data, incorporating external summary information to make the model identifiable. The resulting estimator attains the optimal convergence rate for non-parametric regression estimation.

The case-control sampling design is a pivotal strategy for mitigating the imbalanced structure observed in binary data. We consider the estimation of a non-parametric logistic model from case-control data supplemented by external summary information. Incorporating the external summary information ensures the identifiability of the model. We propose a two-step estimation procedure. In the first step, the external information is used to estimate the marginal case proportion. In the second step, the estimated proportion is used to construct a weighted objective function for parameter training. A deep neural network architecture is employed for functional approximation. We further derive a non-asymptotic error bound for the proposed estimator. From this bound, the convergence rate is obtained and shown to attain the optimal rate for non-parametric regression estimation. Simulation studies are conducted to assess the theoretical findings for the proposed method. A real data example is analyzed for illustration.
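The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the form of the external summary information or of the weights. Here the external information is assumed to be the population mean of a single covariate, and the weights are assumed to be the standard inverse-sampling-probability weights for case-control data. The function names `estimate_case_proportion` and `weighted_logistic_loss` are hypothetical.

```python
import numpy as np


def estimate_case_proportion(x_cases, x_controls, external_mean):
    """Step 1 (assumed form): recover the marginal case proportion pi.

    Assumes the external summary is the population mean of one covariate,
    so that external_mean = pi * E[X | Y=1] + (1 - pi) * E[X | Y=0].
    """
    m1 = np.mean(x_cases)
    m0 = np.mean(x_controls)
    if np.isclose(m1, m0):
        raise ValueError("case and control means coincide; pi is not identified")
    pi_hat = (external_mean - m0) / (m1 - m0)
    # Keep the estimate strictly inside (0, 1) so the weights stay finite.
    return float(np.clip(pi_hat, 1e-6, 1 - 1e-6))


def weighted_logistic_loss(logits, y, pi_hat):
    """Step 2 (assumed form): weighted negative log-likelihood.

    Cases are weighted by pi_hat / (sample case fraction) and controls by
    (1 - pi_hat) / (sample control fraction), which re-balances the
    case-control sample toward the population.
    """
    logits = np.asarray(logits, dtype=float)
    y = np.asarray(y, dtype=float)
    case_frac = y.mean()
    weights = np.where(y == 1, pi_hat / case_frac, (1 - pi_hat) / (1 - case_frac))
    # Stable logistic loss: log(1 + exp(z)) - y * z.
    nll = np.logaddexp(0.0, logits) - y * logits
    return float(np.mean(weights * nll))


# Toy usage: covariate means of 2 (cases) and 0 (controls), population mean 0.2.
pi_hat = estimate_case_proportion([2.0, 2.0, 2.0, 2.0], [0.0, 0.0, 0.0, 0.0], 0.2)
loss = weighted_logistic_loss(np.zeros(4), [1, 1, 0, 0], pi_hat)
```

In a full pipeline the `logits` would come from a deep neural network evaluated on the covariates, and the weighted loss would be minimized over the network parameters with a stochastic gradient method.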

