Stéphane Mussard

2papers

2 Papers

2.5LGMay 24
Optimizing Multidimensional Scaling in Gini Metric Spaces

Cassandra Mussard, Stéphane Mussard

The Gini Multidimensional Scaling (Gini MDS) framework extends the Euclidean multidimensional scaling. We introduce a Gini pseudo-distance based on values and their ranks that depends on a fine-tunable hyperparameter. This pseudo-distance allows flexible exploration of latent configurations, enabling embeddings that best match observed dissimilarities. The Gini MDS is shown to be robust to noise and outliers, making it well-suited for real-world applications. We provide experiments on 16 UCI datasets with outliers and on MNIST images with noise to show that the Gini MDS outperforms the Euclidean MDS on noisy data. Finally, a tensor-based implementation in \texttt{PyTorch} provides GPU acceleration and efficient computation compared to the standard MDS of the \texttt{sklearn} library.

LGJan 29, 2025
KNN and K-means in Gini Prametric Spaces

Cassandra Mussard, Arthur Charpentier, Stéphane Mussard

This paper introduces enhancements to the K-means and K-nearest neighbors (KNN) algorithms based on the concept of Gini prametric spaces, instead of traditional metric spaces. Unlike standard distance metrics, Gini prametrics incorporate both value-based and rank-based measures, offering robustness to noise and outliers. The main contributions include: (1) a Gini prametric that captures rank information alongside value distances; (2) a Gini K-means algorithm that is provably convergent and resilient to noisy data; and (3) a Gini KNN method that performs competitively with state-of-the-art approaches like Hassanat's distance in noisy environments. Experimental evaluations on 16 UCI datasets demonstrate the superior performance and efficiency of the Gini-based algorithms in clustering and classification tasks. This work opens new directions for rank-based prametrics in machine learning and statistical analysis.