ST AP CO ME ML
Dec 27, 2017

On the estimation of correlation in a binary sequence model

arXiv:1712.09694v2
AI Analysis

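This work addresses a fundamental estimation problem in binary data modeling. It reveals a phase transition in estimability that depends on how coarsely the latent variables are discretized: one bit per variable is not enough to recover their common correlation, while three levels are.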

The paper studies the estimation of a common correlation parameter from binary sequences generated by thresholding hidden continuous variables. It shows that maximizing the likelihood does not yield a consistent estimate and, through a non-vanishing minimax lower bound, that no estimator can. Trinary data, by contrast, allow consistent estimation at the parametric convergence rate.

We consider a binary sequence generated by thresholding a hidden continuous sequence. The hidden variables are assumed to have a compound symmetry covariance structure with a single parameter characterizing the common correlation. We study the parameter estimation problem under such one-parameter models. We demonstrate that maximizing the likelihood function does not yield consistent estimates for the correlation. We then formally prove the nonestimability of the parameter by deriving a non-vanishing minimax lower bound. This counter-intuitive phenomenon provides an interesting insight that one-bit information of each latent variable is not sufficient to consistently recover their common correlation. On the other hand, we further show that trinary data generated from the hidden variables can consistently estimate the correlation with parametric convergence rate. Thus we reveal a phase transition phenomenon regarding the discretization of latent continuous variables while preserving the estimability of the correlation. Numerical experiments are performed to validate the conclusions.
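The contrast between binary and trinary data can be illustrated with a small simulation. The sketch below is not the paper's estimator or experimental setup. It assumes standard Gaussian latent variables written as a shared factor plus independent noise (one way to obtain a compound symmetry covariance with nonnegative correlation), a binary threshold at 0, and trinary thresholds at ±c. The function names, the parameter `c`, and the plug-in estimator are all hypothetical choices made for illustration.

```python
import numpy as np
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for its cdf and inverse cdf


def simulate_latent(n, rho, rng, shared=None):
    """Draw n unit-variance Gaussians with common pairwise correlation rho (0 <= rho < 1).

    Assumed construction: X_i = sqrt(rho) * Z_0 + sqrt(1 - rho) * Z_i.
    """
    if shared is None:
        shared = rng.standard_normal()  # Z_0, one draw shared by the whole sequence
    noise = rng.standard_normal(n)      # Z_1..Z_n, independent parts
    return np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * noise


def binary_success_prob(rho, shared):
    """P(X_i > 0 | Z_0 = shared): all that a sequence thresholded at 0 reveals."""
    return _STD_NORMAL.cdf(np.sqrt(rho) * shared / np.sqrt(1.0 - rho))


def estimate_rho_trinary(x, c):
    """Plug-in estimate of rho from the coding {x < -c, -c <= x <= c, x > c}.

    Given Z_0, the two cell boundaries map to normal quantiles a and b with
    b - a = 2c / sqrt(1 - rho), so the unknown Z_0 cancels in the difference.
    Assumes both empirical proportions lie strictly between 0 and 1.
    """
    a = _STD_NORMAL.inv_cdf(float(np.mean(x < -c)))
    b = _STD_NORMAL.inv_cdf(float(np.mean(x <= c)))
    return 1.0 - (2.0 * c / (b - a)) ** 2


if __name__ == "__main__":
    # Binary case: two different correlations give the same Bernoulli probability
    # once the shared factor is adjusted, so one sequence cannot tell them apart.
    print(binary_success_prob(0.2, 1.0), binary_success_prob(0.8, 0.25))

    # Trinary case: the second threshold supplies the extra equation needed.
    rng = np.random.default_rng(0)
    x = simulate_latent(200_000, rho=0.5, rng=rng)
    print(estimate_rho_trinary(x, c=0.5))
```

Under these assumptions, a binary sequence is conditionally an i.i.d. Bernoulli sample whose success probability mixes the correlation with the unobserved shared factor, which gives an intuition for why no amount of binary data pins down the correlation. The trinary coding yields two proportions, and their normal quantiles differ by an amount that depends on the correlation alone.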

