ML · Apr 21, 2015

Nonparametric Testing for Heterogeneous Correlation

arXiv:1504.05392v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for statistical methods that identify subpopulations with stronger correlation. The contribution is incremental: it builds on existing nonparametric approaches.

The paper tackles the problem of detecting heterogeneous correlation in data with weak overall correlation by comparing two nonparametric testing procedures based on rankings, and demonstrates their application in identifying heterogeneity in wine quality data.

In the presence of weak overall correlation, it may be useful to investigate whether the correlation is significantly and substantially more pronounced over a subpopulation. Two different testing procedures are compared. Both are based on the rankings of the values of two variables from a data set with a large number n of observations. The first maintains its level against Gaussian copulas; the second adapts to general alternatives in the sense that the number of parameters used in the test grows with n. An analysis of wine quality illustrates how the methods detect heterogeneity of association between chemical properties of the wine, which are attributable to a mix of different cultivars.
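
To make the setting concrete, the following is a minimal sketch of heterogeneous correlation, not an implementation of either testing procedure from the paper. It simulates data in which a hypothetical 20% subpopulation has strongly correlated variables while the rest are independent, then compares rank correlations (Spearman's rho). The function names, mixture fraction, and correlation value are all illustrative assumptions, and subpopulation membership is known here only because the data are simulated.

```python
import numpy as np


def rank_correlation(x, y):
    """Spearman's rho as the Pearson correlation of ranks (assumes no ties)."""
    rx = np.argsort(np.argsort(x))  # ranks 0..n-1 of x
    ry = np.argsort(np.argsort(y))  # ranks 0..n-1 of y
    return float(np.corrcoef(rx, ry)[0, 1])


def simulate_mixture(n=2000, frac=0.2, rho=0.9, seed=0):
    """Hypothetical mixture: a fraction `frac` of points has correlation `rho`,
    the remaining points have independent coordinates."""
    rng = np.random.default_rng(seed)
    in_sub = rng.random(n) < frac          # subpopulation labels
    x = rng.standard_normal(n)
    noise = rng.standard_normal(n)
    # Inside the subpopulation y is correlated with x; outside it is pure noise.
    y = np.where(in_sub, rho * x + np.sqrt(1.0 - rho**2) * noise, noise)
    return x, y, in_sub


x, y, in_sub = simulate_mixture()
overall = rank_correlation(x, y)
within = rank_correlation(x[in_sub], y[in_sub])
outside = rank_correlation(x[~in_sub], y[~in_sub])

print(f"overall rank correlation:      {overall:.3f}")
print(f"within subpopulation:          {within:.3f}")
print(f"outside subpopulation:         {outside:.3f}")
```

With these assumed settings the overall rank correlation is weak while the correlation within the subpopulation is strong, which is the situation the abstract describes. In real data the subpopulation is not labeled, so a comparison like this one is only a motivating illustration.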

