CL | Oct 18, 2023

knn-seq: Efficient, Extensible kNN-MT Framework

arXiv:2310.12352v1 | 2 citations | h-index: 16 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses efficiency bottlenecks for researchers and developers using kNN-MT, offering an incremental improvement with practical open-source tools.

The paper tackles the computational cost of constructing and retrieving from large datastores in k-nearest-neighbor machine translation (kNN-MT). It presents knn-seq, an efficient and extensible framework that achieves translation gains comparable to the original kNN-MT and constructs a billion-scale datastore in 2.21 hours for the WMT'19 German-to-English task.

k-nearest-neighbor machine translation (kNN-MT) boosts the translation quality of a pre-trained neural machine translation (NMT) model by utilizing translation examples during decoding. Translation examples are stored in a vector database, called a datastore, which contains one entry for each target token of the parallel data it is built from. Because of its size, the datastore is computationally expensive both to construct and to retrieve examples from. In this paper, we present knn-seq, an efficient and extensible kNN-MT framework for researchers and developers that is carefully designed to run efficiently even with a billion-scale datastore. knn-seq is developed as a plug-in for fairseq and makes it easy to switch models and kNN indexes. Experimental results show that our kNN-MT implementation achieves a gain comparable to the original kNN-MT, and that constructing the billion-scale datastore took 2.21 hours in the WMT'19 German-to-English translation task. We publish knn-seq as an MIT-licensed open-source project; the code is available at https://github.com/naist-nlp/knn-seq . The demo video is available at https://youtu.be/zTDzEOq80m0 .
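
To make the datastore idea concrete, here is a minimal sketch of kNN-MT retrieval for a single decoding step. It is not the knn-seq API: the function names, the brute-force NumPy search, the toy vectors, and the hyperparameters (`k`, `temperature`, `lam`) are all assumptions chosen for illustration. It follows the commonly described kNN-MT formulation, in which the keys are decoder hidden states, the values are target tokens, and the retrieved distribution is linearly interpolated with the NMT model's distribution. A real billion-scale datastore would use an approximate kNN index rather than exhaustive search.

```python
import numpy as np


def build_datastore(hidden_states, target_tokens):
    """Store one (key, value) entry per target token.

    Keys are decoder hidden states; values are the target token ids
    that followed them in the parallel data (toy stand-ins here).
    """
    keys = np.asarray(hidden_states, dtype=np.float32)
    values = np.asarray(target_tokens, dtype=np.int64)
    assert keys.shape[0] == values.shape[0], "one value per key"
    return keys, values


def knn_probs(query, keys, values, vocab_size, k=2, temperature=10.0):
    """Turn the k nearest datastore entries into a token distribution."""
    # Exhaustive squared-L2 search; an assumption made for readability.
    dists = np.sum((keys - np.asarray(query, dtype=np.float32)) ** 2, axis=1)
    nearest = np.argsort(dists)[: min(k, len(dists))]
    # Softmax over negative distances, scaled by a temperature.
    logits = -dists[nearest] / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Neighbours that share a token id pool their weight.
    probs = np.zeros(vocab_size, dtype=np.float64)
    np.add.at(probs, values[nearest], weights)
    return probs


def interpolate(p_nmt, p_knn, lam=0.5):
    """Mix the NMT and kNN distributions with weight lam on kNN."""
    return lam * np.asarray(p_knn) + (1.0 - lam) * np.asarray(p_nmt)


# Toy example: four datastore entries over a vocabulary of four tokens.
keys, values = build_datastore(
    [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]],
    [1, 1, 2, 2],
)
p_knn = knn_probs([0.0, 0.2], keys, values, vocab_size=4, k=2)
p_nmt = np.array([0.1, 0.2, 0.6, 0.1])  # hypothetical NMT output
p_final = interpolate(p_nmt, p_knn, lam=0.5)
print(int(p_final.argmax()))
```

In this toy case the NMT model alone prefers token 2, but both retrieved neighbours carry token 1, so the interpolated distribution shifts toward token 1. The cost the abstract refers to comes from the fact that `keys` grows by one row per target token in the parallel data.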

Code Implementations: 1 repo