ML · LG · Feb 25, 2021

Time-Series Imputation with Wasserstein Interpolation for Optimal Look-Ahead-Bias and Variance Tradeoff

arXiv:2102.12736v2 · 2 citations
AI Analysis

This paper addresses a practical issue in finance and other domains where imputation affects downstream model performance, though the contribution appears incremental because it builds on existing imputation methods.

The paper tackles the imputation of missing time-series data, where imputing over the full panel can introduce look-ahead bias into downstream tasks such as portfolio optimization. It proposes a Bayesian posterior consensus distribution that optimally controls the trade-off between look-ahead bias and variance, and demonstrates the benefits on synthetic and real financial data.

Missing time-series data is a prevalent practical problem. Imputation methods in time-series data often are applied to the full panel data with the purpose of training a model for a downstream out-of-sample task. For example, in finance, imputation of missing returns may be applied prior to training a portfolio optimization model. Unfortunately, this practice may result in a look-ahead-bias in the future performance on the downstream task. There is an inherent trade-off between the look-ahead-bias of using the full data set for imputation and the larger variance in the imputation from using only the training data. By connecting layers of information revealed in time, we propose a Bayesian posterior consensus distribution which optimally controls the variance and look-ahead-bias trade-off in the imputation. We demonstrate the benefit of our methodology both in synthetic and real financial data.
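
As a rough illustration of the trade-off described in the abstract, the sketch below fits a Gaussian to one synthetic return series twice: once from the training window only (no look-ahead, fewer observations) and once from the full panel (more observations, but it sees the future). It then blends the two with the closed-form Wasserstein-2 interpolation between one-dimensional Gaussians, in which the mean and the standard deviation both interpolate linearly. This is not the paper's Bayesian posterior consensus distribution. The function names `gaussian_summary` and `w2_interpolate`, the weight `t`, the window boundary `train_end`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_summary(x):
    """Mean and standard error of the mean, using only observed (non-NaN) entries."""
    obs = x[~np.isnan(x)]
    return obs.mean(), obs.std(ddof=1) / np.sqrt(obs.size)

def w2_interpolate(m0, s0, m1, s1, t):
    """Wasserstein-2 interpolation between N(m0, s0^2) and N(m1, s1^2) in 1-D.

    For one-dimensional Gaussians the interpolant is again Gaussian, with mean
    and standard deviation both moving linearly in t.
    """
    return (1 - t) * m0 + t * m1, (1 - t) * s0 + t * s1

# Synthetic returns with 40 of 200 entries missing (illustrative data only).
rng = np.random.default_rng(0)
returns = rng.normal(0.01, 0.05, size=200)
returns[rng.choice(200, size=40, replace=False)] = np.nan
train_end = 100  # hypothetical boundary between training and test periods

# Training-only estimate: no look-ahead, but based on fewer observations.
m_train, s_train = gaussian_summary(returns[:train_end])
# Full-panel estimate: more observations, but it uses test-period data.
m_full, s_full = gaussian_summary(returns)

# t = 0 uses only training information, t = 1 uses the full panel.
t = 0.5  # hypothetical weight; the paper chooses this trade-off optimally
m_mix, s_mix = w2_interpolate(m_train, s_train, m_full, s_full, t)

# Fill the missing training-period entries with the blended mean.
imputed = returns.copy()
train = imputed[:train_end]          # a view, so edits write through to `imputed`
train[np.isnan(train)] = m_mix
```

Setting `t` closer to 0 removes more look-ahead at the cost of a noisier estimate, while `t` closer to 1 does the opposite. The fixed `t = 0.5` here is arbitrary and stands in for the data-driven choice the paper derives.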
