Directionally Dependent Multi-View Clustering Using Copula Model
This work addresses a domain-specific problem for genomics researchers: more accurate integrative clustering of multi-source data with directional dependencies. The contribution is incremental, since it builds on existing copula methods.
The authors address multi-view clustering in genomics by proposing a copula-based model that accommodates directional dependencies, such as those among DNA expression, DNA methylation, and RNA expression. In simulations, ignoring directional dependence degrades clustering performance. The model is also applied to breast cancer tumor samples from TCGA.
Integratively clustering a set of objects using data from multiple sources is a fundamental problem in recent biomedical research. Such problems arise mostly in genomics, where data are collected from various sources that typically represent distinct yet complementary information. Integrating these data sources for multi-source clustering is challenging because of their complex dependence structure, including directional dependency. In genomics studies in particular, a directional dependence is known to exist between DNA expression, DNA methylation, and RNA expression, widely called the Central Dogma. Most existing multi-view clustering methods assume either an independent structure or pair-wise (non-directional) dependency, thereby ignoring the directional relationship. Motivated by this, we propose a copula-based multi-view clustering model in which a copula enables the model to accommodate the directional dependence existing in the datasets. We conduct a simulation experiment in which the simulated datasets exhibit inherent directional dependence; the results show that ignoring the directional dependence negatively affects clustering performance. As a real application, we apply our model to breast cancer tumor samples collected from The Cancer Genome Atlas (TCGA).
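To make the role of a copula concrete, the following minimal sketch shows how a copula separates the dependence between two views from their marginal distributions. It is an illustration under stated assumptions, not the model proposed here: it uses a bivariate Gaussian copula, which is symmetric and therefore does not by itself encode a direction, and the function names (`sample_gaussian_copula`, `to_margins`), the correlation `rho`, and the chosen marginals are all hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u, v) from a bivariate Gaussian copula with correlation rho.

    Hypothetical helper for illustration only. Each coordinate is uniform
    on [0, 1]; the dependence between u and v is controlled by rho.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second standard normal correlated with z1 at level rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((normal_cdf(z1), normal_cdf(z2)))
    return pairs


def to_margins(pairs, rate=1.0, scale=10.0):
    """Map copula samples to arbitrary marginals via inverse CDFs.

    Assumed marginals: view 1 is exponential(rate), view 2 is uniform on
    [0, scale]. The dependence structure is inherited from the copula.
    """
    eps = 1e-12  # guard against log(0) if u is numerically 1
    return [(-math.log(max(1.0 - u, eps)) / rate, scale * v) for u, v in pairs]


def pearson(xs, ys):
    """Plain Pearson correlation, used here only as a sanity check."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    uv = sample_gaussian_copula(2000, rho=0.8, seed=1)
    views = to_margins(uv)
    u, v = zip(*uv)
    print("correlation on the copula scale:", round(pearson(u, v), 3))
    print("first sample in data scale:", views[0])
```

The design point is that the marginals can be changed freely in `to_margins` without touching the dependence defined in `sample_gaussian_copula`. Capturing directional dependence would require an asymmetric copula in place of the Gaussian one used in this sketch.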