CYLG
Sep 20, 2024

PyGRF: An improved Python Geographical Random Forest model and case studies in public health and natural disasters

arXiv:2409.13947v1
21 citations
h-index: 7
Originality: Synthesis-oriented
AI Analysis

This work provides a Python-based tool for spatial machine learning, lowering the adoption barrier for practitioners who prefer Python. The contribution is incremental, since it builds on the existing GRF model.

The authors address limitations of the existing Geographical Random Forest (GRF) model, such as hyperparameter determination and the lack of a Python version, by introducing theory-informed improvements and developing PyGRF, a Python package demonstrated in public health and natural disaster applications.

Geographical random forest (GRF) is a recently developed and spatially explicit machine learning model. With the ability to provide more accurate predictions and local interpretations, GRF has already been used in many studies. The current GRF model, however, has limitations in its determination of the local model weight and bandwidth hyperparameters, potentially insufficient numbers of local training samples, and sometimes high local prediction errors. Also, because it is implemented as an R package, GRF currently does not have a Python version, which limits its adoption among machine learning practitioners who prefer Python. This work addresses these limitations by introducing theory-informed hyperparameter determination, local training sample expansion, and spatially weighted local prediction. We also develop a Python-based GRF model and package, PyGRF, to facilitate the use of the model. We evaluate the performance of PyGRF on an example dataset and further demonstrate its use in two case studies in public health and natural disasters.
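To make the two hyperparameters named in the abstract concrete, here is a minimal sketch of how a bandwidth and a local model weight can interact in a GRF-style prediction. It is not the PyGRF API and not the authors' implementation. It assumes the bandwidth is the number of nearest training points used to fit a local model, and that the final prediction is a convex blend of the local and global predictions. All function names are hypothetical, and a simple mean predictor stands in for a random forest so that the example runs with the standard library alone.

```python
import math
from statistics import mean


def fit_mean_model(rows, targets):
    """Stand-in learner (hypothetical): ignores features, predicts the training mean.

    Any callable that fits on rows/targets and returns a predict function
    could replace it, for example a wrapper around a random forest.
    """
    center = mean(targets)
    return lambda row: center


def nearest_indices(coords, point, bandwidth):
    """Return indices of the `bandwidth` training points closest to `point`."""
    order = sorted(range(len(coords)), key=lambda i: math.dist(coords[i], point))
    return order[:bandwidth]


def blended_predict(coords, rows, targets, point, row,
                    bandwidth, local_weight, fit=fit_mean_model):
    """Blend a local and a global prediction for one query location.

    Assumption: prediction = w * local + (1 - w) * global, with w in [0, 1].
    """
    if not 0.0 <= local_weight <= 1.0:
        raise ValueError("local_weight must be in [0, 1]")
    # Global model: fitted on every training sample.
    global_model = fit(rows, targets)
    # Local model: fitted only on the nearest `bandwidth` samples.
    idx = nearest_indices(coords, point, bandwidth)
    local_model = fit([rows[i] for i in idx], [targets[i] for i in idx])
    return (local_weight * local_model(row)
            + (1.0 - local_weight) * global_model(row))


# Toy data: two spatial clusters with different target levels.
coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
rows = [[1.0]] * 6
targets = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
query_point, query_row = (0.2, 0.2), [1.0]

local_only = blended_predict(coords, rows, targets, query_point, query_row, 3, 1.0)
global_only = blended_predict(coords, rows, targets, query_point, query_row, 3, 0.0)
half = blended_predict(coords, rows, targets, query_point, query_row, 3, 0.5)
print(local_only, global_only, half)
```

With a local weight of 1 the prediction follows the nearby cluster only, with 0 it ignores location entirely, and intermediate values trade local specificity against the stability of the global model. A small bandwidth leaves the local model with few training samples, which is the situation the abstract's local training sample expansion targets.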

Code Implementations: 1 repo
