CL, Jan 11, 2014

The semantic similarity ensemble

arXiv:1401.2517v1, 15 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem in geographic information retrieval and integration by offering an incremental improvement through ensemble methods.

The paper tackles the challenge of selecting the most appropriate computational measure for geo-semantic similarity. It proposes a semantic similarity ensemble (SSE) that combines multiple measures. Results show that the SSE outperforms the average of its parts and offers a more cognitively plausible approach when the best measure is unknown.

Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic similarity have been devised. A judgment of similarity is not intrinsically right or wrong, but obtains a certain degree of cognitive plausibility, depending on how closely it mimics human behavior. Thus, selecting the most appropriate measure for a specific task is a significant challenge. To address this issue, we make an analogy between computational similarity measures and soliciting domain expert opinions, which incorporate a subjective set of beliefs, perceptions, hypotheses, and epistemic biases. Following this analogy, we define the semantic similarity ensemble (SSE) as a composition of different similarity measures, acting as a panel of experts having to reach a decision on the semantic similarity of a set of geographic terms. The approach is evaluated in comparison to human judgments, and results indicate that an SSE performs better than the average of its parts. Although the best member tends to outperform the ensemble, every ensemble outperforms the average performance of its members. Hence, in contexts where the best measure is unknown, the ensemble provides a more cognitively plausible approach.

