LG ST ML
Dec 22, 2022

Model Based Co-clustering of Mixed Numerical and Binary Data

Aichetou Bouchareb, Marc Boullé, Fabrice Clérot, Fabrice Rossi

arXiv:2212.11725v1 | 1.8 | h-index: 26

Originality: Incremental advance

AI Analysis

This work addresses a gap in data mining for researchers and practitioners dealing with heterogeneous datasets, but it appears incremental as it builds on existing latent block models.

The paper tackles the problem of co-clustering mixed numerical and binary data, which has received little prior attention. It extends latent block models to handle both data types and evaluates the approach on simulated data.

Co-clustering is a data mining technique used to extract the underlying block structure between the rows and columns of a data matrix. Many approaches have been studied and have shown their capacity to extract such structures in continuous, binary, or contingency tables. However, very little work has been done on co-clustering mixed-type data. In this article, we extend co-clustering based on latent block models to the case of mixed data (continuous and binary variables). We then evaluate the effectiveness of the proposed approach on simulated data and discuss its advantages and potential limits.
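To make the block structure concrete, the following is a minimal sketch of how mixed data with a latent block structure could be simulated. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not give the block distributions, so the sketch assumes Gaussian blocks with unit variance for continuous columns and Bernoulli blocks for binary columns. The function name `simulate_mixed_lbm` and all of its parameters are hypothetical.

```python
import numpy as np


def simulate_mixed_lbm(n_rows=60, n_cont=6, n_bin=6,
                       k_rows=2, k_cont=2, k_bin=2, seed=0):
    """Draw a mixed data matrix with a latent block structure.

    Assumptions (not taken from the paper): continuous blocks are
    Gaussian with unit variance, binary blocks are Bernoulli, and
    cluster labels are drawn uniformly at random.
    """
    rng = np.random.default_rng(seed)

    # Latent partitions: one for the rows, one per column type.
    row_labels = rng.integers(k_rows, size=n_rows)
    cont_labels = rng.integers(k_cont, size=n_cont)
    bin_labels = rng.integers(k_bin, size=n_bin)

    # One parameter per (row cluster, column cluster) block.
    means = rng.normal(0.0, 3.0, size=(k_rows, k_cont))
    probs = rng.uniform(0.05, 0.95, size=(k_rows, k_bin))

    # Each cell is drawn from the distribution of the block it falls in.
    x_cont = rng.normal(means[row_labels][:, cont_labels], 1.0)
    x_bin = rng.binomial(1, probs[row_labels][:, bin_labels])

    return x_cont, x_bin, row_labels, cont_labels, bin_labels


if __name__ == "__main__":
    x_cont, x_bin, z, w_c, w_b = simulate_mixed_lbm()
    print(x_cont.shape, x_bin.shape)
```

In this sketch the rows share a single partition across both column types, which is what couples the continuous and binary parts of the matrix. A co-clustering method would then try to recover `row_labels`, `cont_labels`, and `bin_labels` from `x_cont` and `x_bin` alone.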

