Random projection tree similarity metric for SpectralNet
This is an incremental improvement for graph clustering methods, specifically enhancing SpectralNet's performance and efficiency.
The authors tackled the problem of improving SpectralNet's graph clustering by replacing the standard k-nn graph similarity metric with one based on random projection trees (rpTrees), resulting in better clustering accuracy and reduced sensitivity to rpTree parameters like leaf size and projection direction.
SpectralNet is a graph clustering method that uses a neural network to find an embedding that separates the data. So far it has only been used with $k$-nn graphs, which are usually constructed using a distance metric (e.g., Euclidean distance). $k$-nn graphs restrict each point to a fixed number of neighbors regardless of the local statistics around it. We propose a new SpectralNet similarity metric based on random projection trees (rpTrees). Our experiments show that SpectralNet achieves better clustering accuracy with the rpTree similarity metric than with a $k$-nn graph built on a distance metric. We also found that the rpTree parameters, namely the leaf size and the choice of projection direction, do not affect the clustering accuracy. It is therefore computationally efficient to keep the leaf size on the order of $\log(n)$ and to project the points onto a random direction instead of searching for the direction of maximum dispersion.
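To make the idea concrete, the following is a minimal sketch of one way an rpTree could induce a neighborhood structure. It is not the authors' implementation. It assumes that two points count as similar when they land in the same leaf, that each node splits at the median of the projections, and that the leaf size defaults to roughly $\log(n)$; the function names `rp_tree_leaves` and `rp_tree_neighbors` are hypothetical.

```python
import math
import random


def rp_tree_leaves(points, leaf_size, rng):
    """Split the index set recursively along random directions.

    Assumption: each node is cut at the median projection, so both
    children are non-empty and the recursion always terminates.
    """
    leaf_size = max(1, leaf_size)
    dim = len(points[0])
    leaves = []
    stack = [list(range(len(points)))]
    while stack:
        idx = stack.pop()
        if len(idx) <= leaf_size:
            leaves.append(idx)
            continue
        # Random projection direction (no search for maximum dispersion).
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        idx.sort(key=lambda i: sum(p * d for p, d in zip(points[i], direction)))
        half = len(idx) // 2
        stack.append(idx[:half])
        stack.append(idx[half:])
    return leaves


def rp_tree_neighbors(points, leaf_size=None, seed=0):
    """Return, for every point, the set of points sharing its leaf.

    Assumption: sharing a leaf is used as a binary similarity.
    """
    n = len(points)
    if leaf_size is None:
        # Leaf size on the order of log(n), with a floor of 2.
        leaf_size = max(2, math.ceil(math.log(n)))
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for leaf in rp_tree_leaves(points, leaf_size, rng):
        for i in leaf:
            neighbors[i].update(j for j in leaf if j != i)
    return neighbors
```

Unlike a $k$-nn graph, the number of neighbors per point is not fixed here: it depends on how many points end up in the same leaf. The resulting neighbor sets are symmetric by construction and could be converted into an affinity matrix for a spectral method.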