Leveraging Historical Data for High-Dimensional Regression Adjustment, a Composite Covariate Approach
This work addresses a challenge researchers face in early-phase clinical trials, namely limited patient data, but the contribution is incremental because it builds on existing covariate adjustment methods.
The paper tackles the problem of determining the optimal number of covariates to include in the analysis of small clinical trials in order to improve statistical power. It proposes using historical data to define a composite covariate that limits both overfitting and the loss of degrees of freedom.
The amount of data collected from patients involved in clinical trials is continuously growing. All patient characteristics are potential covariates that could be used to improve clinical trial analysis and power. However, the restricted number of patients in phase I and II studies limits the number of covariates that can be included in the analyses. In this paper, we investigate the cost/benefit ratio of including covariates in the analysis of clinical trials. Within this context, we address the long-running question "What is the optimum number of covariates to include in a clinical trial?" To further improve the cost/benefit ratio of covariates, historical data can be leveraged to pre-specify the covariate weights, which can be viewed as the definition of a new composite covariate. We analyze the use of a composite covariate when estimating the treatment effect in small clinical trials. A composite covariate limits the loss of degrees of freedom and the risk of overfitting.
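
The composite covariate idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the simulated data, the sample sizes, the linear outcome model, and names such as `weights`, `composite`, and `ols` are all assumptions made for illustration. Covariate weights are estimated once on historical data by ordinary least squares, then frozen and applied to the trial covariates, so the trial model spends one degree of freedom on adjustment instead of one per covariate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
p = 10          # number of baseline covariates
n_hist = 500    # historical patients
n_trial = 30    # patients in the small trial

# Simulation-only ground truth (unknown in practice).
beta = rng.normal(size=p)   # covariate effects on the outcome
true_effect = 1.0           # treatment effect


def ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Step 1: estimate covariate weights on historical data only.
X_hist = rng.normal(size=(n_hist, p))
y_hist = X_hist @ beta + rng.normal(size=n_hist)
weights = ols(np.column_stack([np.ones(n_hist), X_hist]), y_hist)[1:]  # drop intercept

# Step 2: simulate a small randomized trial (15 patients per arm).
X_trial = rng.normal(size=(n_trial, p))
treat = rng.permutation(np.repeat([0, 1], n_trial // 2))
y_trial = true_effect * treat + X_trial @ beta + rng.normal(size=n_trial)

# Step 3: collapse the p covariates into one pre-specified composite covariate.
composite = X_trial @ weights

# Step 4: adjust for the composite (3 parameters) versus all covariates (2 + p parameters).
design_composite = np.column_stack([np.ones(n_trial), treat, composite])
design_full = np.column_stack([np.ones(n_trial), treat, X_trial])

effect_composite = ols(design_composite, y_trial)[1]
effect_full = ols(design_full, y_trial)[1]

# Residual degrees of freedom left for estimating the error variance.
df_composite = n_trial - design_composite.shape[1]
df_full = n_trial - design_full.shape[1]

print(f"composite adjustment: effect={effect_composite:.3f}, residual df={df_composite}")
print(f"full adjustment:      effect={effect_full:.3f}, residual df={df_full}")
```

With these assumed sizes, the composite model keeps 27 residual degrees of freedom while the fully adjusted model keeps 18. The sketch assumes the historical and trial populations share the same covariate-outcome relationship; if they differ, the pre-specified weights would be less useful.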