LG AI CL IT ML | Sep 9, 2019

Nearly-Unsupervised Hashcode Representations for Relation Extraction

Sahil Garg, Aram Galstyan, Greg Ver Steeg, Guillermo Cecchi

arXiv:1909.03881v11.81 citations

Originality: Incremental advance

AI Analysis

This work addresses biomedical relation extraction, offering a more efficient method for building generalizable hashcode representations. It is an incremental advance because it builds on prior kernelized hashcode techniques.

The paper optimizes hashcode representations for biomedical relation extraction with a nearly unsupervised approach: the hash functions are learned from data points alone, without their class labels, and the resulting hashcodes are then passed to a supervised classifier. The authors report significant accuracy improvements over state-of-the-art supervised and semi-supervised methods.

Recently, kernelized locality sensitive hashcodes have been successfully employed as representations of natural language text, especially showing high relevance to biomedical relation extraction tasks. In this paper, we propose to optimize the hashcode representations in a nearly unsupervised manner, in which we only use data points, but not their class labels, for learning. The optimized hashcode representations are then fed to a supervised classifier following the prior work. This nearly unsupervised approach allows fine-grained optimization of each hash function, which is particularly suitable for building hashcode representations generalizing from a training set to a test set. We empirically evaluate the proposed approach for biomedical relation extraction tasks, obtaining significant accuracy improvements w.r.t. state-of-the-art supervised and semi-supervised approaches.
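The pipeline described in the abstract (hash functions built from unlabeled points, hashcodes then used as features for a supervised classifier) can be illustrated with a minimal sketch. This is not the paper's optimization procedure. It assumes a simple random-subset scheme in which each bit records which half of a small reference set contains the most kernel-similar point. The RBF kernel on synthetic vectors stands in for a kernel over text structures, and the names `rbf_kernel`, `build_hash_functions`, and `hashcodes` are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """RBF similarity between rows of a (n, d) and rows of b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

def build_hash_functions(unlabeled, n_bits=16, subset_size=6, seed=0):
    """Build one hash function per bit from unlabeled points only.

    Assumption for illustration: each function is a random reference
    subset, randomly split into two equal-sized sides. No class labels
    are used anywhere in this step.
    """
    rng = np.random.default_rng(seed)
    functions = []
    for _ in range(n_bits):
        idx = rng.choice(len(unlabeled), size=subset_size, replace=False)
        # Boolean mask marking which references fall on side 1.
        sides = rng.permutation(subset_size) < subset_size // 2
        functions.append((unlabeled[idx], sides))
    return functions

def hashcodes(x, functions):
    """Map each row of x to a binary code, one bit per hash function."""
    bits = []
    for refs, sides in functions:
        # Index of the most kernel-similar reference point for each row.
        nearest = rbf_kernel(x, refs).argmax(axis=1)
        bits.append(sides[nearest])
    return np.stack(bits, axis=1).astype(np.uint8)

# Synthetic stand-in data: 50 unlabeled points in 4 dimensions.
data_rng = np.random.default_rng(1)
unlabeled = data_rng.normal(size=(50, 4))

functions = build_hash_functions(unlabeled, n_bits=16, subset_size=6, seed=0)
codes = hashcodes(unlabeled[:5], functions)
print(codes.shape)
```

The rows of `codes` are the binary feature vectors that a supervised classifier would be trained on in the second stage. The fine-grained optimization of each hash function that the abstract describes would replace the purely random choices made in `build_hash_functions`.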
