Shuffled linear regression through graduated convex relaxation
This work addresses a problem in applications such as survey data analysis, where preserving anonymity while uncovering statistical connections is crucial, though it appears incremental as it builds on existing methods with specific enhancements.
The paper tackles the shuffled linear regression problem, which involves recovering linear relationships when input-output correspondences are unknown, by proposing a novel optimization algorithm based on a posterior-maximizing objective with Gaussian noise prior. The results show that the approach performs competitively with existing methods on synthetic and real data, achieving empirical running-time improvements and effectively utilizing side information like seeds.
The shuffled linear regression problem aims to recover linear relationships in datasets where the correspondence between input and output is unknown. This problem arises in a wide range of applications including survey data, in which one needs to decide whether the anonymity of the responses can be preserved while uncovering significant statistical connections. In this work, we propose a novel optimization algorithm for shuffled linear regression based on a posterior-maximizing objective function assuming Gaussian noise prior. We compare and contrast our approach with existing methods on synthetic and real data. We show that our approach performs competitively while achieving empirical running-time improvements. Furthermore, we demonstrate that our algorithm is able to utilize the side information in the form of seeds, which recently came to prominence in related problems.