Victor M. Panaretos

ML
h-index17
3papers
3citations
Novelty68%
AI Score36

3 Papers

MLAug 11, 2025
Likelihood Ratio Tests by Kernel Gaussian Embedding

Leonardo V. Santoro, Victor M. Panaretos

We propose a novel kernel-based nonparametric two-sample test, employing the combined use of kernel mean and kernel covariance embedding. Our test builds on recent results showing how such combined embeddings map distinct probability measures to mutually singular Gaussian measures on the kernel's RKHS. Leveraging this ``separation of measure phenomenon", we construct a test statistic based on the relative entropy between the Gaussian embeddings, in effect the likelihood ratio. The likelihood ratio is specifically tailored to detect equality versus singularity of two Gaussians, and satisfies a ``$0/\infty$" law, in that it vanishes under the null and diverges under the alternative. To implement the test in finite samples, we introduce a regularised version, calibrated by way of permutation. We prove consistency, establish uniform power guarantees under mild conditions, and discuss how our framework unifies and extends prior approaches based on spectrally regularized MMD. Empirical results on synthetic and real data demonstrate remarkable gains in power compared to state-of-the-art methods, particularly in high-dimensional and weak-signal regimes.

MLMay 7, 2025
Kernel Embeddings and the Separation of Measure Phenomenon

Leonardo V. Santoro, Kartik G. Waghmare, Victor M. Panaretos

We prove that kernel covariance embeddings lead to information-theoretically perfect separation of distinct probability distributions. In statistical terms, we establish that testing for the equality of two probability measures on a compact and separable metric space is equivalent to testing for the singularity between two centered Gaussian measures on a reproducing kernel Hilbert Space. The corresponding Gaussians are defined via the notion of kernel covariance embedding of a probability measure, and the Hilbert space is that generated by the embedding kernel. Distinguishing singular Gaussians is fundamentally simpler from an information-theoretic perspective than non-parametric two-sample testing, particularly in complex or high-dimensional domains. This is because singular Gaussians are supported on essentially separate and affine subspaces. Our proof leverages the classical Feldman-Hajek dichotomy, and shows that even a small perturbation of a distribution will be maximally magnified through its Gaussian embedding. This ``separation of measure phenomenon'' appears to be a blessing of infinite dimensionality, by means of embedding, with the potential to inform the design of efficient inference tools in considerable generality. The elicitation of this phenomenon also appears to crystallize, in a precise and simple mathematical statement, the outstanding empirical effectiveness of the so-called ``kernel trick".

MEApr 11, 2021
CovNet: Covariance Networks for Functional Data on Multidimensional Domains

Soham Sarkar, Victor M. Panaretos

Covariance estimation is ubiquitous in functional data analysis. Yet, the case of functional observations over multidimensional domains introduces computational and statistical challenges, rendering the standard methods effectively inapplicable. To address this problem, we introduce "Covariance Networks" (CovNet) as a modeling and estimation tool. The CovNet model is "universal" - it can be used to approximate any covariance up to desired precision. Moreover, the model can be fitted efficiently to the data and its neural network architecture allows us to employ modern computational tools in the implementation. The CovNet model also admits a closed-form eigendecomposition, which can be computed efficiently, without constructing the covariance itself. This facilitates easy storage and subsequent manipulation of a covariance in the context of the CovNet. We establish consistency of the proposed estimator and derive its rate of convergence. The usefulness of the proposed method is demonstrated by means of an extensive simulation study and an application to resting state fMRI data.