Unsupervised Learning of Density Estimates with Topological Optimization
This work addresses the bandwidth tuning problem in density estimation for machine learning and related fields, though it appears incremental because it builds on existing topological data analysis methods.
The paper automates bandwidth selection for kernel density estimation with a topology-based loss function and benchmarks the approach against classical techniques, demonstrating its potential across different dimensions.
Kernel density estimation is a key component of a wide variety of algorithms in machine learning, Bayesian inference, stochastic dynamics and signal processing. However, this unsupervised density estimation technique requires tuning a crucial hyperparameter: the kernel bandwidth. The choice of bandwidth is critical because it controls the bias-variance trade-off: a bandwidth that is too large over-smooths the topological features of the estimate, and one that is too small under-smooths them. Topological data analysis provides methods to mathematically quantify topological characteristics, such as connected components, loops and voids, even in high dimensions where visualization of density estimates is impossible. In this paper, we propose an unsupervised learning approach that uses a topology-based loss function to select the optimal bandwidth automatically. We benchmark it against classical techniques and demonstrate its potential across different dimensions.
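To make the role of the bandwidth concrete, the following is a minimal sketch, not the method proposed in the paper. It evaluates a one-dimensional Gaussian kernel density estimate for three bandwidths and counts the local maxima of each estimate. In one dimension, every local maximum gives birth to a connected component of the superlevel sets, so the mode count serves here as a crude stand-in for a zero-dimensional topological summary. The toy data, the function names (`gaussian_kde`, `count_modes`, `normal_reference_bandwidth`), the grid and the chosen bandwidths are all assumptions made for illustration. The classical baseline shown is the normal reference rule of thumb, h = 1.06 σ n^(-1/5).

```python
import math
import random


def gaussian_kde(samples, bandwidth, grid):
    """Evaluate a 1D Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def count_modes(density):
    """Count strict local maxima among the interior grid values."""
    return sum(
        1
        for i in range(1, len(density) - 1)
        if density[i - 1] < density[i] > density[i + 1]
    )


def normal_reference_bandwidth(samples):
    """Classical rule of thumb: h = 1.06 * sigma * n^(-1/5)."""
    n = len(samples)
    mean = sum(samples) / n
    sigma = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * sigma * n ** (-0.2)


# Hypothetical toy data: two well-separated clusters, so the underlying
# density has two modes.
rng = random.Random(0)
samples = [rng.gauss(-2.0, 0.3) for _ in range(100)]
samples += [rng.gauss(2.0, 0.3) for _ in range(100)]

# Evaluation grid on [-5, 5] with spacing 0.05 (an arbitrary choice).
grid = [-5.0 + 0.05 * i for i in range(201)]

# Compare an under-smoothing, a rule-of-thumb and an over-smoothing bandwidth.
h_rule = normal_reference_bandwidth(samples)
mode_counts = {
    h: count_modes(gaussian_kde(samples, h, grid))
    for h in (0.02, h_rule, 8.0)
}

for h, modes in mode_counts.items():
    print(f"bandwidth={h:.3f}  modes={modes}")
```

The largest bandwidth merges both clusters into a single mode, while the smallest one typically produces many spurious modes driven by individual samples. A topology-based loss would need to penalize both failure cases; how the paper actually defines that loss is not stated in this abstract.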