Towards Geocoding Spatial Expressions
This work addresses a domain-specific problem in natural language processing and geographic information systems, focusing on incremental improvements in the handling of ambiguous spatial references.
The paper tackles the challenge of geocoding imprecise spatial expressions in English text, such as 'north of Dayton, OH', by proposing a formal representation using background knowledge, semantic approximations, and fuzzy linguistic variables. It also discusses an evaluation technique based on human contextualized judgment.
Imprecise composite location references formed using ad hoc spatial expressions in English text make the geocoding task challenging for both inference and evaluation. Typically, such spatial expressions stand in for toponyms where no name has been established, giving finer spatial referents. For example, the spatial extent of the ad hoc spatial expression "north of" or "50 minutes away from" in relation to the toponym "Dayton, OH" refers to an ambiguous, imprecise area, and it must be translated from this qualitative representation to a quantitative one with precise semantics in a coordinate reference system such as WGS84. Here we highlight the challenges of geocoding such referents and propose a formal representation that employs background knowledge, semantic approximations and rules, and fuzzy linguistic variables. We also discuss an appropriate evaluation technique for the task that is based on contextualized, subjective human judgment.
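To make the idea of a fuzzy linguistic variable concrete, the following minimal sketch scores how well a candidate point fits "north of" an anchor toponym. It is an illustration only, not the representation proposed in the paper: the triangular membership shape, the 90-degree `half_width`, the function names, and the approximate coordinates used for Dayton, OH are all assumptions made here.

```python
import math


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing in degrees from point 1 to point 2, in [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def north_of_membership(anchor, candidate, half_width=90.0):
    """Hypothetical triangular membership function for "north of".

    Returns 1.0 when the candidate lies due north of the anchor and
    falls linearly to 0.0 once the bearing deviates from north by
    half_width degrees or more. Points are (latitude, longitude) pairs
    in WGS84 decimal degrees.
    """
    bearing = initial_bearing(anchor[0], anchor[1], candidate[0], candidate[1])
    deviation = min(bearing, 360.0 - bearing)  # angular distance from north
    return max(0.0, 1.0 - deviation / half_width)


# Approximate coordinates for Dayton, OH (an assumption for illustration).
DAYTON = (39.7589, -84.1916)

due_north = (DAYTON[0] + 1.0, DAYTON[1])
due_south = (DAYTON[0] - 1.0, DAYTON[1])
due_east = (DAYTON[0], DAYTON[1] + 1.0)

score_north = north_of_membership(DAYTON, due_north)
score_south = north_of_membership(DAYTON, due_south)
score_east = north_of_membership(DAYTON, due_east)
```

A graded score of this kind, rather than a hard boundary, is one way to reflect that readers disagree about where "north of Dayton" stops. A distance-based variable for expressions such as "50 minutes away from" would need additional background knowledge (for example, travel mode and road network), which this sketch does not attempt.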