LG | ST | Oct 4, 2023

Estimation of Models with Limited Data by Leveraging Shared Structure

MIT
arXiv:2310.02864v1 | 1 citation | h-index: 49
Originality: Incremental advance
AI Analysis

This paper addresses a common problem in fields like healthcare and e-commerce, where the data available per source is limited. It offers a way to improve parameter estimation by leveraging shared structure, though the contribution appears incremental because it builds on existing subspace methods.

The paper tackles the problem of estimating high-dimensional model parameters for multiple systems when each system alone has insufficient data. It assumes the systems share a latent low-dimensional structure and proposes a three-step algorithm that recovers the parameters even with fewer observations than dimensions. The authors provide finite sample guarantees on the subspace estimation error and validate the method on simulated regression and time series data.

Modern data sets, such as those in healthcare and e-commerce, are often derived from many individuals or systems but have insufficient data from each source alone to separately estimate individual, often high-dimensional, model parameters. If there is shared structure among systems, however, it may be possible to leverage data from other systems to help estimate individual parameters, which could otherwise be non-identifiable. In this paper, we assume systems share a latent low-dimensional parameter space and propose a method for recovering $d$-dimensional parameters for $N$ different linear systems, even when there are only $T<d$ observations per system. To do so, we develop a three-step algorithm that estimates the low-dimensional subspace spanned by the systems' parameters and produces refined parameter estimates within the subspace. We provide finite sample subspace estimation error guarantees for our proposed method. Finally, we experimentally validate our method on simulations with i.i.d. regression data as well as correlated time series data.
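
The abstract does not spell out the three steps, so the following is only a minimal sketch of one plausible instantiation, not the authors' algorithm. It assumes i.i.d. isotropic Gaussian covariates, a known subspace dimension `r`, and a simple moment-based first step. All names (`simulate`, `three_step`, `U_hat`, and so on) are hypothetical and chosen for illustration.

```python
import numpy as np


def simulate(N, T, d, r, noise, rng):
    """Draw N linear systems whose d-dim parameters lie in a shared r-dim subspace."""
    U, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal basis, d x r
    B = rng.standard_normal((N, r))                   # low-dim coefficients
    Theta = B @ U.T                                   # true parameters, N x d
    X = rng.standard_normal((N, T, d))                # assumed isotropic covariates
    Y = np.einsum("ntd,nd->nt", X, Theta) + noise * rng.standard_normal((N, T))
    return X, Y, Theta, U


def three_step(X, Y, r):
    """Illustrative three-step estimator (an assumption, not the paper's method)."""
    N, T, d = X.shape
    # Step 1: crude per-system moment estimate (1/T) X_i^T y_i.
    # It is unbiased for theta_i only under the isotropic-covariate assumption.
    crude = np.einsum("ntd,nt->nd", X, Y) / T
    # Step 2: estimate the shared subspace from the top-r eigenvectors
    # of the averaged outer product of the crude estimates.
    M = crude.T @ crude / N
    _, eigvecs = np.linalg.eigh(M)                    # eigenvalues ascending
    U_hat = eigvecs[:, -r:]
    # Step 3: refit each system inside the subspace, where only r < T
    # coefficients are unknown, then map back to d dimensions.
    refined = np.empty((N, d))
    for i in range(N):
        Z = X[i] @ U_hat                              # T x r projected covariates
        beta, *_ = np.linalg.lstsq(Z, Y[i], rcond=None)
        refined[i] = U_hat @ beta
    return crude, U_hat, refined


rng = np.random.default_rng(0)
d, r = 20, 2
X, Y, Theta, U = simulate(N=2000, T=10, d=d, r=r, noise=0.1, rng=rng)
crude, U_hat, refined = three_step(X, Y, r)

# Sine of the largest principal angle between the true and estimated subspaces.
subspace_err = np.linalg.norm((np.eye(d) - U_hat @ U_hat.T) @ U, 2)
crude_err = np.mean(np.linalg.norm(crude - Theta, axis=1))
refined_err = np.mean(np.linalg.norm(refined - Theta, axis=1))
print(f"subspace error: {subspace_err:.3f}")
print(f"mean error, crude: {crude_err:.3f}, refined: {refined_err:.3f}")
```

With $T<d$, no single system can be fit by ordinary least squares in $d$ dimensions, but once a subspace estimate is available each system needs only `r` coefficients. In this sketch the pooling across systems happens entirely in Step 2.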

