One-Way Matching of Datasets with Low Rank Signals
This work addresses dataset matching in computational biology, motivated by single-cell data. It appears incremental because it builds on existing low-rank and linear assignment methods.
The paper studies one-way matching of two datasets with low-rank signals. Under a stylized model, it derives information-theoretic limits of matching under a mismatch proportion loss, then shows that linear assignment on projected data achieves fast rates of convergence and, in some cases, minimax rate optimality. Simulated examples corroborate the theoretical error bounds, and the procedure is illustrated on two single-cell data examples.
We study one-way matching of a pair of datasets with low rank signals. Under a stylized model, we first derive information-theoretic limits of matching under a mismatch proportion loss. We then show that linear assignment with projected data achieves fast rates of convergence and sometimes even minimax rate optimality for this task. The theoretical error bounds are corroborated by simulated examples. Furthermore, we illustrate practical use of the matching procedure on two single-cell data examples.
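To make the phrase "linear assignment with projected data" concrete, the following is a minimal sketch under stated assumptions, not the paper's exact procedure. It assumes that each row of `X` has a counterpart among the rows of `Y`, that `Y` has at least as many rows as `X`, that the projection subspace is estimated from the top right singular vectors of the stacked data, and that the assignment cost is squared Euclidean distance. The function names `match_projected` and `mismatch_proportion` are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_projected(X, Y, rank):
    """Match each row of X to a distinct row of Y after a low-rank projection.

    Assumptions (illustrative, not taken from the paper):
    - X is (n, p), Y is (m, p) with n <= m;
    - the signal subspace is estimated by an SVD of the stacked data;
    - the assignment cost is squared Euclidean distance.
    """
    # Estimate a rank-`rank` subspace of the feature space from both datasets.
    stacked = np.vstack([X, Y])
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    V = vt[:rank].T  # (p, rank) matrix of leading right singular vectors

    # Project both datasets onto the estimated subspace.
    Xp, Yp = X @ V, Y @ V

    # Pairwise squared distances between projected rows, shape (n, m).
    cost = ((Xp[:, None, :] - Yp[None, :, :]) ** 2).sum(axis=2)

    # Solve the (rectangular) linear assignment problem.
    rows, cols = linear_sum_assignment(cost)
    matching = np.empty(X.shape[0], dtype=int)
    matching[rows] = cols
    return matching


def mismatch_proportion(estimated, truth):
    """Fraction of rows of X assigned to the wrong row of Y."""
    return float(np.mean(np.asarray(estimated) != np.asarray(truth)))
```

Calling `match_projected(X, Y, rank=r)` returns, for each row of `X`, the index of its matched row in `Y`; when a ground-truth matching is available, as in a simulation, `mismatch_proportion` scores the result. Projecting before solving the assignment problem reduces the noise that enters the pairwise distances, which is the intuition behind using projected rather than raw data.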