EM · ME · ML · Jan 29, 2020

Blocked Clusterwise Regression

arXiv:2001.11130v1 · 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses model misspecification in econometrics and is aimed at researchers analyzing panel data with discrete cross-sectional structure. The advance is incremental: it generalizes previous approaches.

The paper tackles misspecification in clustered heterogeneity models for panel data by introducing a model that allows multiple, imperfectly-correlated latent variables per unit. Monte Carlo simulations confirm the theoretical results and illustrate the finite-sample performance of estimation and model selection. The paper also derives new convergence rates for clustering with an over-specified number of clusters.

A recent literature in econometrics models unobserved cross-sectional heterogeneity in panel data by assigning each cross-sectional unit a one-dimensional, discrete latent type. Such models have been shown to allow estimation and inference by regression clustering methods. This paper is motivated by the finding that the clustered heterogeneity models studied in this literature can be badly misspecified, even when the panel has significant discrete cross-sectional structure. To address this issue, we generalize previous approaches to discrete unobserved heterogeneity by allowing each unit to have multiple, imperfectly-correlated latent variables that describe its response-type to different covariates. We give inference results for a k-means style estimator of our model and develop information criteria to jointly select the number of clusters for each latent variable. Monte Carlo simulations confirm our theoretical results and give intuition about the finite-sample performance of estimation and model selection. We also contribute to the theory of clustering with an over-specified number of clusters and derive new convergence rates for this setting. Our results suggest that over-fitting can be severe in k-means style estimators when the number of clusters is over-specified.
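To make the idea of a "k-means style estimator" with several latent variables per unit concrete, here is a minimal sketch. It is not the authors' estimator: the abstract does not state the model equations, so the specification below (two scalar covariates `x1` and `x2`, each with its own latent group label per unit, and a Lloyd-style alternation between pooled OLS and reassignment) is an assumption chosen for illustration. The function name `blocked_clusterwise` and all parameter names are hypothetical.

```python
import numpy as np

def blocked_clusterwise(y, x1, x2, G1, G2, n_iter=50, seed=0):
    """Illustrative alternating least-squares fit (assumed model, not the paper's):
        y[i, t] = x1[i, t] * theta1[g1[i]] + x2[i, t] * theta2[g2[i]] + noise
    y, x1, x2 have shape (N, T). Returns (theta1, theta2, g1, g2, ssr)."""
    rng = np.random.default_rng(seed)
    N, T = y.shape
    # Random initial labels, one label per unit and per latent variable.
    g1 = rng.integers(G1, size=N)
    g2 = rng.integers(G2, size=N)
    for _ in range(n_iter):
        # Update step: pooled OLS on covariates interacted with group dummies.
        D1 = np.eye(G1)[g1]  # (N, G1) one-hot labels
        D2 = np.eye(G2)[g2]  # (N, G2)
        Z = np.concatenate(
            [(D1[:, None, :] * x1[:, :, None]).reshape(N * T, G1),
             (D2[:, None, :] * x2[:, :, None]).reshape(N * T, G2)],
            axis=1,
        )
        # lstsq returns a least-squares solution even if a group is empty.
        coef, *_ = np.linalg.lstsq(Z, y.reshape(-1), rcond=None)
        theta1, theta2 = coef[:G1], coef[G1:]
        # Assignment step: each unit takes the (g1, g2) pair with the
        # smallest sum of squared residuals over its T observations.
        resid = (y[:, :, None, None]
                 - x1[:, :, None, None] * theta1[None, None, :, None]
                 - x2[:, :, None, None] * theta2[None, None, None, :])
        ssr = (resid ** 2).sum(axis=1).reshape(N, G1 * G2)
        new_g1, new_g2 = np.unravel_index(ssr.argmin(axis=1), (G1, G2))
        converged = np.array_equal(new_g1, g1) and np.array_equal(new_g2, g2)
        g1, g2 = new_g1, new_g2
        if converged:
            break
    return theta1, theta2, g1, g2, ssr.min(axis=1).sum()

# Usage on simulated data (all values below are made up for illustration).
rng = np.random.default_rng(42)
N, T = 50, 20
true_g1, true_g2 = rng.integers(2, size=N), rng.integers(2, size=N)
x1, x2 = rng.normal(size=(N, T)), rng.normal(size=(N, T))
y = (x1 * np.array([-2.0, 2.0])[true_g1][:, None]
     + x2 * np.array([-1.0, 3.0])[true_g2][:, None]
     + 0.1 * rng.normal(size=(N, T)))
theta1, theta2, g1, g2, total_ssr = blocked_clusterwise(y, x1, x2, G1=2, G2=2)
print(theta1, theta2, total_ssr)
```

Like ordinary k-means, this alternation only finds a local minimum of the sum of squared residuals, so in practice one would rerun it from several seeds and keep the fit with the smallest value. The sketch takes `G1` and `G2` as given; the information criteria the paper develops for choosing them are not reproduced here.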

