Multi-Point Proximity Encoding For Vector-Mode Geospatial Machine Learning
The paper offers researchers and practitioners in geospatial machine learning a more effective encoding method, though the contribution appears incremental because it builds on existing encoding approaches.
The paper tackles the problem of encoding vector-mode geospatial data (points, lines, polygons) for use in machine learning. It introduces MultiPoint Proximity (MPP) encoding, which represents a shape by its scaled distances to a set of reference points, and reports that MPP outperforms a rasterization-based alternative in capturing geometric features and pairwise spatial relationships.
Vector-mode geospatial data -- points, lines, and polygons -- must be encoded into an appropriate form in order to be used with traditional machine learning and artificial intelligence models. Encoding methods attempt to represent a given shape as a vector that captures its essential geometric properties. This paper presents an encoding method based on scaled distances from a shape to a set of reference points within a region of interest. The method, MultiPoint Proximity (MPP) encoding, can be applied to any type of shape, enabling the parameterization of machine learning models with encoded representations of vector-mode geospatial features. We show that MPP encoding possesses the desirable properties of shape-centricity and continuity, can be used to differentiate spatial objects based on their geometric features, and can capture pairwise spatial relationships with high precision. In all cases, MPP encoding is shown to perform better than an alternative method based on rasterization.
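As a rough illustration of the idea described above, the following Python sketch encodes a point or polyline as a vector of scaled distances to a set of reference points. It is not the paper's implementation. The exponential scaling exp(-d / scale), the regular grid of reference points, and all function names (`point_segment_distance`, `distance_to_shape`, `reference_grid`, `mpp_encode`) are assumptions made for illustration only. Polygons, where a reference point inside the shape would have distance zero, are left out for brevity.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_shape(p, vertices):
    """Distance from p to a point (one vertex) or a polyline (several vertices)."""
    if len(vertices) == 1:
        return math.hypot(p[0] - vertices[0][0], p[1] - vertices[0][1])
    return min(
        point_segment_distance(p, a, b) for a, b in zip(vertices, vertices[1:])
    )


def reference_grid(x_min, y_min, x_max, y_max, n):
    """n x n grid of reference points covering the region of interest (n >= 2)."""
    xs = [x_min + (x_max - x_min) * i / (n - 1) for i in range(n)]
    ys = [y_min + (y_max - y_min) * j / (n - 1) for j in range(n)]
    return [(x, y) for y in ys for x in xs]


def mpp_encode(vertices, reference_points, scale):
    """One value per reference point: exp(-d / scale), an assumed scaling."""
    return [
        math.exp(-distance_to_shape(r, vertices) / scale)
        for r in reference_points
    ]


# Usage: encode a horizontal line along the bottom edge of a 10 x 10 region.
refs = reference_grid(0.0, 0.0, 10.0, 10.0, 3)
line = [(0.0, 0.0), (10.0, 0.0)]
code = mpp_encode(line, refs, scale=5.0)
```

Under this assumed scaling, each element lies in (0, 1] and equals 1 exactly when the reference point lies on the shape. The encoding length depends only on the number of reference points, not on the number of vertices in the shape, so points and polylines map to vectors of the same size.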