ML, LG · Dec 27, 2021

Computationally Efficient Approximations for Matrix-based Renyi's Entropy

Tieliang Gong, Yuxin Dong, Shujian Yu, Bo Dong

arXiv:2112.13720v46.37 citations

Originality: Incremental advance

AI Analysis

This work addresses a computational bottleneck for researchers and practitioners using matrix-based Renyi's entropy in statistical inference and learning tasks, offering incremental improvements to enable broader application.

The paper tackles the high computational complexity of matrix-based Renyi's entropy, which scales as O(n^3), by developing efficient approximations using randomized numerical linear algebra and matrix structure exploitation, reducing complexity to less than O(n^2) with negligible accuracy loss in experiments.

The recently developed matrix-based Renyi's entropy measures the information in data using only the eigenspectrum of symmetric positive semidefinite (PSD) matrices in a reproducing kernel Hilbert space, without estimating the underlying data distribution. This property has led to its wide adoption in statistical inference and learning tasks. However, computing this quantity involves the trace of a PSD matrix $G$ raised to the power $\alpha$ (i.e., $\mathrm{tr}(G^\alpha)$), which normally costs nearly $O(n^3)$ and severely hampers practical use when the number of samples $n$ is large. In this work, we present computationally efficient approximations to this entropy functional that can reduce its complexity to significantly less than $O(n^2)$. To this end, we leverage recent progress in Randomized Numerical Linear Algebra, developing Taylor, Chebyshev and Lanczos approximations to $\mathrm{tr}(G^\alpha)$ for arbitrary values of $\alpha$ by converting the computation into a matrix-vector multiplication problem. We also establish the connection between the matrix-based Renyi's entropy and PSD matrix approximation, which makes it possible to exploit both the clustering and the block low-rank structure of $G$ to further reduce the computational cost. We provide theoretical approximation accuracy guarantees and illustrate the properties of the different approximations. Large-scale experimental evaluations on both synthetic and real-world data corroborate our theoretical findings, showing promising speedups with negligible loss in accuracy.
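To make the idea of reducing $\mathrm{tr}(G^\alpha)$ to matrix-vector products concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard matrix-based definition $S_\alpha(G) = \frac{1}{1-\alpha}\log_2 \mathrm{tr}(G^\alpha)$ with $\mathrm{tr}(G) = 1$, which the abstract does not spell out, and it uses a generic Hutchinson estimator combined with Lanczos quadrature. All function names, the Gaussian kernel, and the parameter defaults are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gaussian-kernel Gram matrix, normalized so that tr(G) = 1 (assumed setup)."""
    sq = np.sum(X**2, axis=1)
    D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-D / (2.0 * sigma**2))
    return K / np.trace(K)

def trace_power_exact(G, alpha):
    """Baseline: tr(G^alpha) from a full eigendecomposition, O(n^3)."""
    lam = np.clip(np.linalg.eigvalsh(G), 0.0, None)
    return float(np.sum(lam**alpha))

def trace_power_slq(G, alpha, num_probes=30, num_steps=30, seed=0):
    """Estimate tr(G^alpha) using only products G @ q.

    Each Rademacher probe z gives z^T G^alpha z, whose expectation is the
    trace; a short Lanczos run approximates that quadratic form.
    """
    rng = np.random.default_rng(seed)
    n = G.shape[0]
    total = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        q = z / np.linalg.norm(z)
        basis, diag, off = [q], [], []
        q_prev, beta = np.zeros(n), 0.0
        for j in range(num_steps):
            w = G @ q - beta * q_prev
            a = q @ w
            w = w - a * q
            for v in basis:  # full reorthogonalization for numerical stability
                w = w - (v @ w) * v
            diag.append(a)
            beta = np.linalg.norm(w)
            if beta < 1e-12 or j == num_steps - 1:
                break
            off.append(beta)
            q_prev, q = q, w / beta
            basis.append(q)
        # Tridiagonal Lanczos matrix; its eigenpairs give quadrature nodes/weights.
        T = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
        theta, U = np.linalg.eigh(T)
        theta = np.clip(theta, 0.0, None)
        total += n * np.sum(U[0, :] ** 2 * theta**alpha)  # ||z||^2 = n
    return float(total / num_probes)

def renyi_entropy_from_trace(trace_value, alpha):
    """S_alpha = log2(tr(G^alpha)) / (1 - alpha), for alpha != 1."""
    return float(np.log2(trace_value) / (1.0 - alpha))
```

A usage sketch: build `G = gaussian_gram(X, sigma=0.5)` from a sample matrix `X`, then compare `renyi_entropy_from_trace(trace_power_exact(G, 1.5), 1.5)` with the same call on `trace_power_slq(G, 1.5)`. With a dense `G`, each product costs $O(n^2)$, so this sketch alone does not reach the sub-quadratic regime described in the abstract; that additionally relies on the structure of $G$.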
