DB · LG · ML · Jun 26, 2018

EmbNum: Semantic labeling for numerical values with deep metric learning

arXiv:1807.01367v2 · 2 citations
AI Analysis

This paper addresses the challenge of assigning semantic labels to numerical attributes in open data, where assuming a particular value distribution is inappropriate. The contribution appears incremental, since it builds on existing retrieval-based approaches.

The paper tackles the problem of semantic labeling for numerical attributes by proposing EmbNum, a neural numerical embedding model that learns representation vectors without prior distribution assumptions, and reports that it significantly outperforms state-of-the-art methods in effectiveness and efficiency on City Data and Open Data.

Semantic labeling for numerical values is the task of assigning semantic labels to unknown numerical attributes. The semantic labels could be numerical properties in ontologies, instances in knowledge bases, or labeled data that are manually annotated by domain experts. In this paper, we treat semantic labeling as a retrieval setting in which an unknown attribute is assigned the label of the most relevant attribute in the labeled data. One of the greatest challenges is that an unknown attribute rarely has the same set of values as its most similar attribute in the labeled data. To overcome this issue, a statistical interpretation of the value distribution is taken into account. However, existing studies assume a specific form of distribution. Such an assumption is particularly inappropriate for open data, where nothing is known about the data in advance. To address these problems, we propose a neural numerical embedding model (EmbNum) that learns useful representation vectors for numerical attributes without prior assumptions on the distribution of the data. The "semantic similarities" between attributes are then measured on these representation vectors using the Euclidean distance. Our empirical experiments on City Data and Open Data show that EmbNum significantly outperforms state-of-the-art methods for numerical attribute semantic labeling in terms of both effectiveness and efficiency.
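
The retrieval setting described in the abstract can be illustrated with a minimal Python sketch. This is not the EmbNum model: the `embed` function below is a hypothetical stand-in that summarizes an attribute by its decile cut points, whereas EmbNum learns its representation with a neural network. The attribute names and values are invented for illustration. Only the overall flow (embed each attribute, compare by Euclidean distance, take the label of the nearest labeled attribute) follows the abstract.

```python
import math
from statistics import quantiles


def embed(values, n_bins=10):
    """Stand-in embedding (an assumption, NOT EmbNum).

    Returns the n_bins - 1 quantile cut points of the values, giving a
    fixed-length vector regardless of how many values the attribute has.
    """
    return quantiles(values, n=n_bins, method="inclusive")


def label_attribute(unknown_values, labeled):
    """Return (label, distance) of the nearest labeled attribute.

    `labeled` maps a semantic label to the numerical values observed
    for that attribute. Distance is Euclidean, as in the abstract.
    """
    query = embed(unknown_values)
    best_label, best_dist = None, math.inf
    for label, values in labeled.items():
        dist = math.dist(query, embed(values))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical labeled attributes (illustrative values only).
labeled = {
    "population": [1200, 5400, 23000, 87000, 150000, 410000, 980000],
    "elevation_m": [3, 12, 45, 120, 310, 560, 1450],
    "latitude": [34.0, 35.7, 40.4, 43.1, 48.9, 51.5, 59.3],
}

# An unknown attribute that shares no exact values with any labeled one.
unknown = [8, 20, 60, 140, 280, 600, 1300]

predicted, distance = label_attribute(unknown, labeled)
print(predicted)
```

Because the comparison is made on fixed-length vectors rather than on the raw value sets, the unknown attribute does not need to share any values with its match. A hand-crafted summary like the one above still bakes in choices about what matters in a distribution, which is the kind of prior assumption a learned embedding is meant to avoid.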
