Clustering-Based Relational Unsupervised Representation Learning with an Explicit Distributed Representation
This work addresses the challenge of handling relational data in unsupervised learning. The contribution is incremental in that it extends existing methods to a new data type.
The paper tackles unsupervised representation learning for relational data by introducing a clustering-based method that views a dataset as a hypergraph and extracts features from clusters of vertices and hyperedges. The results show that models using these latent representations perform better, have lower complexity, and outperform existing approaches on classification tasks.
The goal of unsupervised representation learning is to extract a new representation of data, such that solving many different tasks becomes easier. Existing methods typically focus on vectorized data and offer little support for relational data, which additionally describe relationships among instances. In this work we introduce an approach for relational unsupervised representation learning. Viewing a relational dataset as a hypergraph, we obtain new features by clustering vertices and hyperedges. To find a representation suited for many relational learning tasks, we consider a wide range of similarities between relational objects, e.g., feature and structural similarities. We experimentally evaluate the proposed approach and show that models learned on such latent representations perform better, have lower complexity, and outperform existing approaches on classification tasks.
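To make the idea of clustering vertices in a hypergraph concrete, the following is a minimal sketch and not the paper's algorithm. The toy data, the Jaccard-based similarities, the mixing weight `w_attr`, the threshold, and the single-link clustering procedure are all assumptions chosen for illustration. It combines a feature similarity with a structural similarity, clusters the vertices, and uses cluster membership as the new representation.

```python
from itertools import combinations

# Toy relational dataset viewed as a hypergraph (all names are hypothetical).
# Vertices carry attribute sets; each hyperedge links two or more vertices.
attributes = {
    "ann": {"student"},
    "bob": {"student"},
    "cat": {"professor"},
    "dan": {"professor"},
}
hyperedges = [
    ("ann", "cat"),
    ("bob", "cat"),
    ("ann", "bob", "dan"),
]

def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def neighbours(v):
    """Vertices that share at least one hyperedge with v, excluding v."""
    return {u for e in hyperedges if v in e for u in e} - {v}

def similarity(u, v, w_attr=0.5):
    """Weighted mix of feature similarity and structural similarity (assumed form)."""
    attr_sim = jaccard(attributes[u], attributes[v])
    struct_sim = jaccard(neighbours(u), neighbours(v))
    return w_attr * attr_sim + (1 - w_attr) * struct_sim

def cluster(items, sim, threshold):
    """Single-link clustering: merge two clusters if any cross pair is similar enough."""
    clusters = [{x} for x in items]
    merged = True
    while merged:
        merged = False
        for a, b in combinations(clusters, 2):
            if any(sim(x, y) >= threshold for x in a for y in b):
                clusters.remove(a)
                clusters.remove(b)
                clusters.append(a | b)
                merged = True
                break  # restart the scan on the updated cluster list
    # Sort for a deterministic cluster order.
    return sorted(clusters, key=lambda c: sorted(c))

def membership_features(items, clusters):
    """New representation: one binary feature per cluster."""
    return {x: [int(x in c) for c in clusters] for x in items}

vertices = sorted(attributes)
vertex_clusters = cluster(vertices, similarity, threshold=0.6)
features = membership_features(vertices, vertex_clusters)

for v in vertices:
    print(v, features[v])
```

On this toy input the two students end up in one cluster and the two professors in another, so each vertex receives a two-dimensional binary representation. Hyperedges could be clustered with the same `cluster` function given a similarity defined over hyperedges. Changing `w_attr` shifts the emphasis between feature and structural similarity, which is one simple way to obtain several alternative clusterings.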