Clustering as an Evaluation Protocol for Knowledge Embedding Representation of Categorised Multi-relational Data in the Clinical Domain
The work addresses information loss in the evaluation of machine learning models for the clinical domain, though it appears incremental: it focuses on improving an existing evaluation protocol rather than introducing a new method.
The paper tackles the problem of evaluating knowledge embedding representations for categorized multi-relational clinical data. It proposes a clustering evaluation protocol as an alternative to traditional link prediction, and reports strong Pearson and Spearman correlations as evidence that the protocol could potentially replace link prediction.
Learning knowledge representation is an increasingly important technology applicable to many domain-specific machine learning problems. We discuss the effectiveness of the traditional Link Prediction, or Knowledge Graph Completion, evaluation protocol when embedding knowledge representation for categorised multi-relational data in the clinical domain. Link prediction splits the data into training and evaluation subsets, which leads to a loss of information during training and harms the accuracy of the knowledge representation model. We propose a Clustering Evaluation Protocol as a replacement for the traditionally used evaluation tasks. We used embedding models trained by a knowledge embedding approach that has been evaluated with clinical datasets. Experimental results with Pearson and Spearman correlations show strong evidence that the proposed evaluation protocol is potentially able to replace link prediction.
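The correlation analysis mentioned above can be illustrated with a minimal sketch. It assumes that each trained embedding model receives one link prediction score and one clustering score, and that the two lists of scores are then compared with Pearson and Spearman correlation. The abstract does not state which metrics were correlated, so the variable names and all numbers below are hypothetical and do not come from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical scores for five embedding models (illustrative only).
link_prediction_score = [0.21, 0.35, 0.42, 0.50, 0.58]
clustering_score = [0.30, 0.41, 0.39, 0.55, 0.63]

r = pearson(link_prediction_score, clustering_score)
rho = spearman(link_prediction_score, clustering_score)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Under this reading, high values of both coefficients would mean that the clustering score ranks the models in much the same order as link prediction does, which is the kind of evidence one would need before treating it as a replacement.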