Active Seriation: Efficient Ordering Recovery with Statistical Guarantees
This work addresses the seriation problem, which arises in data analysis and ranking applications, and offers an incremental improvement backed by statistical guarantees.
The paper tackles the problem of recovering an unknown ordering of items from noisy pairwise similarity measurements. It proposes an active seriation algorithm that provably recovers the ordering with high probability and, under a uniform separation condition, establishes optimal guarantees on both the error probability and the number of observations.
Active seriation aims at recovering an unknown ordering of $n$ items by adaptively querying pairwise similarities. The observations are noisy measurements of entries of an underlying $n \times n$ permuted Robinson matrix, whose permutation encodes the latent ordering. The framework allows the algorithm to start with partial information on the latent ordering, including seriation from scratch as a special case. We propose an active seriation algorithm that provably recovers the latent ordering with high probability. Under a uniform separation condition on the similarity matrix, optimal performance guarantees are established, both in terms of the probability of error and the number of observations required for successful recovery.
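
To make the observation model concrete, the following is a minimal Python sketch of the setting described above. It is not the paper's algorithm. It only builds a toy Robinson matrix (symmetric, with entries that do not increase when moving away from the diagonal), hides it behind a permutation, and exposes noisy entrywise queries. The function names, the linear similarity profile, and the Gaussian noise are all illustrative assumptions, not taken from the paper.

```python
import random


def robinson_matrix(n):
    """Toy Robinson matrix (assumed profile): similarity decays linearly with |i - j|."""
    return [[float(n - abs(i - j)) for j in range(n)] for i in range(n)]


def permute(matrix, perm):
    """Return M with M[i][j] = matrix[perm[i]][perm[j]].

    Here perm[i] plays the role of the latent position of item i.
    """
    n = len(matrix)
    return [[matrix[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def is_robinson(matrix):
    """Check symmetry and that each row is non-increasing away from the diagonal."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            if matrix[i][j] != matrix[j][i]:
                return False
        for j in range(i, n - 1):      # moving right from the diagonal
            if matrix[i][j + 1] > matrix[i][j]:
                return False
        for j in range(1, i + 1):      # moving left from the diagonal
            if matrix[i][j - 1] > matrix[i][j]:
                return False
    return True


def noisy_query(matrix, i, j, sigma, rng):
    """One noisy observation of entry (i, j); Gaussian noise is an assumption."""
    return matrix[i][j] + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 6
    R = robinson_matrix(n)
    perm = list(range(n))
    rng.shuffle(perm)                  # hidden latent ordering
    M = permute(R, perm)               # what the learner can query, entry by entry

    # An active learner chooses which pairs (i, j) to query, possibly repeatedly.
    samples = [noisy_query(M, 0, 1, sigma=0.5, rng=rng) for _ in range(20)]
    print("mean of 20 noisy queries of M[0][1]:", sum(samples) / len(samples))
    print("true value of M[0][1]:", M[0][1])
```

In this sketch, recovering the latent ordering amounts to finding a permutation that rearranges `M` back into a Robinson matrix. Note that reversing an ordering preserves the Robinson property, so an ordering can only be identified up to reversal.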