CL | LG | Dec 7, 2021

Multinational Address Parsing: A Zero-Shot Evaluation

arXiv:2112.04008v1 | 3 citations | Has Code
Originality: Incremental advance
AI Analysis

This work tackles the parsing of addresses from diverse countries for applications like record linkage. It is an incremental advance, since it builds on existing neural network methods.

The paper tackles the problem of address parsing across multiple countries without additional training, achieving state-of-the-art performance for most tested countries in a zero-shot transfer learning setting.

Address parsing consists of identifying the segments that make up an address, such as a street name or a postal code. Because of its importance for tasks like record linkage, address parsing has been approached with many techniques, the latest relying on neural networks. While these models yield notable results, previous work on neural networks has only focused on parsing addresses from a single source country. This paper explores the possibility of transferring the address parsing knowledge acquired by training deep learning models on some countries' addresses to others with no further training, in a zero-shot transfer learning setting. We also experiment with an attention mechanism and a domain adversarial training algorithm in the same zero-shot transfer setting to improve performance. Both methods yield state-of-the-art performance for most of the tested countries while giving good results for the remaining countries. We also explore the effect of incomplete addresses on our best model, and we evaluate the impact of using incomplete addresses during training. In addition, we propose an open-source Python implementation of some of our trained models.
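To make the task definition concrete, the sketch below shows one common way to represent a parsed address: one tag per token, with consecutive tokens grouped into named segments. It is only an illustration of the input and output shape, not the paper's neural model or its released implementation. The tag names, the sample address, and the helper `group_segments` are hypothetical, and the tags are written by hand rather than predicted.

```python
from itertools import groupby
from typing import Dict, List


def group_segments(tokens: List[str], tags: List[str]) -> Dict[str, str]:
    """Merge runs of identically tagged tokens into address segments."""
    if len(tokens) != len(tags):
        raise ValueError("each token needs exactly one tag")
    segments: Dict[str, str] = {}
    # groupby yields one run per stretch of consecutive tokens sharing a tag.
    for tag, run in groupby(zip(tokens, tags), key=lambda pair: pair[1]):
        text = " ".join(token for token, _ in run)
        # If a tag reappears later in the address, append to its segment.
        segments[tag] = f"{segments[tag]} {text}" if tag in segments else text
    return segments


# Hand-written tags (hypothetical label set) standing in for model predictions.
tokens = "350 rue des Lilas Ouest Quebec G1L 1B6".split()
tags = [
    "StreetNumber",
    "StreetName", "StreetName", "StreetName",
    "Orientation",
    "Municipality",
    "PostalCode", "PostalCode",
]
parsed = group_segments(tokens, tags)
```

Here `parsed` maps each tag to its text span, for example `parsed["StreetName"]` is `"rue des Lilas"`. In a zero-shot setting, the tags would come from a model trained on other countries' addresses rather than from a hand-written list.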

Code Implementations: 1 repo
