Generating Landmark Navigation Instructions from Maps as a Graph-to-Text Problem
This work provides a method for generating more natural and human-centric navigation instructions, which could benefit users who prefer landmark-based guidance over traditional turn-by-turn directions.
This paper addresses the problem of generating human-like navigation instructions that use landmarks instead of street names and distances. It introduces a neural model that converts OpenStreetMap data, encoded as a location- and rotation-invariant graph, into natural language instructions. Human evaluators were able to navigate successfully in Street View by following the generated instructions.
Car-focused navigation services are based on turns and distances of named streets, whereas navigation instructions naturally used by humans center on physical objects called landmarks. We present a neural model that takes OpenStreetMap representations as input and learns, from human natural language instructions, to generate navigation instructions that contain visible and salient landmarks. Routes on the map are encoded in a location- and rotation-invariant graph representation that is decoded into natural language instructions. Our work is based on a novel dataset of 7,672 crowd-sourced instances that have been verified by human navigation in Street View. Our evaluation shows that the navigation instructions generated by our system have properties similar to those of human-generated instructions and lead to successful human navigation in Street View.
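
To make the idea of a location- and rotation-invariant route encoding concrete, here is a minimal sketch in Python. It is not the authors' graph encoding, which the abstract does not specify. It only illustrates one simple way to get the same invariance for route geometry, assuming planar coordinates: translate the route so it starts at the origin, then rotate it so the first segment points along the positive x-axis. The function names `normalize_route`, `transform`, and `close` are hypothetical helpers introduced for this illustration.

```python
import math


def normalize_route(points):
    """Return the route in a canonical frame (illustrative assumption,
    not the paper's encoding): the first point is moved to the origin
    and the first segment is rotated onto the positive x-axis."""
    if len(points) < 2:
        raise ValueError("a route needs at least two points")
    x0, y0 = points[0]
    x1, y1 = points[1]
    # Heading of the first segment in the original frame.
    heading = math.atan2(y1 - y0, x1 - x0)
    cos_h, sin_h = math.cos(-heading), math.sin(-heading)
    normalized = []
    for x, y in points:
        dx, dy = x - x0, y - y0  # remove absolute location
        # Rotate by -heading to remove absolute orientation.
        normalized.append((dx * cos_h - dy * sin_h,
                           dx * sin_h + dy * cos_h))
    return normalized


def transform(points, angle, tx, ty):
    """Rotate the route by `angle` radians, then shift it by (tx, ty)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s + tx, x * s + y * c + ty) for x, y in points]


def close(a, b, tol=1e-6):
    """Compare two point lists coordinate by coordinate."""
    return len(a) == len(b) and all(
        abs(ax - bx) < tol and abs(ay - by) < tol
        for (ax, ay), (bx, by) in zip(a, b)
    )


# A route that heads north for 10 units, then turns right (east) for 5.
route = [(0.0, 0.0), (0.0, 10.0), (5.0, 10.0)]
# The same route placed elsewhere on the map with a different orientation.
moved = transform(route, angle=1.2, tx=1000.0, ty=-250.0)

# Both copies collapse to the same canonical representation.
same = close(normalize_route(route), normalize_route(moved))
```

In the canonical frame, a right turn always shows up as a move toward negative y regardless of where the route lies on the map or which compass direction it starts in. A model that consumes such a representation cannot memorize absolute positions, which is the practical motivation for this kind of invariance.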