CV BM Mar 24, 2025

Clustering data by reordering them

Axel Descamps, Sélène Forget, Aliénor Lahlou, Claire Lavergne, Camille Berthelot, Guillaume Stirnemann, Rodolphe Vuilleumier, Nicolas Chéron

arXiv:2503.19067v1 | 2 citations | h-index: 4

Originality: Synthesis-oriented

AI Analysis

This is an incremental clustering method applicable to various scientific domains, with examples that include biomolecules and images.

The paper clusters data by reordering elements according to their similarity and dissimilarity. The resulting algorithm performs the analysis automatically, uses easily understood parameters, and handles noise explicitly.

Grouping elements into families to analyse them separately is a standard analysis procedure in many areas of science. We propose herein a new algorithm based on the simple idea that members of a family resemble each other and do not resemble elements foreign to the family. After reordering the data according to the distance between elements, the analysis is automatically performed with easily understandable parameters. Noise is explicitly taken into account to deal with the variety of problems of a data-driven world. We applied the algorithm to sort biomolecule conformations, gene sequences, cells, images, and experimental conditions.
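To make the "reorder, then analyse" idea concrete, here is a minimal Python sketch. It is not the authors' algorithm, whose details are not given in this abstract. It assumes a greedy nearest-neighbour ordering under Euclidean distance, followed by cuts wherever consecutive elements are farther apart than a threshold. The function names `reorder_by_distance` and `split_at_jumps`, and the parameters `threshold` and `min_size`, are hypothetical and chosen only for illustration.

```python
import math


def reorder_by_distance(points):
    """Greedy nearest-neighbour ordering (an assumed stand-in, not the paper's method).

    Starts at index 0 and repeatedly appends the unvisited point closest to
    the last one added. Returns the ordering and, for each position, the
    distance to the previous point in the ordering (0.0 for the first).
    """
    if not points:
        return [], []
    order, gaps = [0], [0.0]
    remaining = list(range(1, len(points)))
    while remaining:
        last = points[order[-1]]
        # Ties are broken by the smaller index so the result is deterministic.
        nxt = min(remaining, key=lambda i: (math.dist(last, points[i]), i))
        gaps.append(math.dist(last, points[nxt]))
        order.append(nxt)
        remaining.remove(nxt)
    return order, gaps


def split_at_jumps(order, gaps, threshold, min_size=2):
    """Cut the ordering wherever the gap to the previous element exceeds
    `threshold`. Groups smaller than `min_size` are reported as noise."""
    if not order:
        return [], []
    groups, current = [], [order[0]]
    for idx, gap in zip(order[1:], gaps[1:]):
        if gap > threshold:
            groups.append(current)
            current = [idx]
        else:
            current.append(idx)
    groups.append(current)
    clusters = [g for g in groups if len(g) >= min_size]
    noise = [i for g in groups if len(g) < min_size for i in g]
    return clusters, noise


# Toy data: two tight families and one isolated point.
points = [(0, 0), (0.1, 0), (5, 5), (0, 0.2), (5.1, 5), (20, 20)]
order, gaps = reorder_by_distance(points)
clusters, noise = split_at_jumps(order, gaps, threshold=1.0)
print(order, clusters, noise)
```

In this sketch the two parameters have a direct reading: `threshold` is the largest distance still counted as "resembling", and `min_size` is the smallest group accepted as a family, with anything smaller treated as noise. A greedy ordering depends on the starting element, so a real implementation would likely need a more careful reordering step.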
