CL, Jul 8, 2024

MST5 -- Multilingual Question Answering over Knowledge Graphs

arXiv:2407.06041v1, 7 citations, h-index: 50
Originality: Incremental advance
AI Analysis

This work addresses the disadvantage faced by non-English speakers in accessing knowledge graphs, though it appears incremental as it builds on existing multilingual transformer models.

The paper tackles multilingual knowledge graph question answering (KGQA), where non-English systems underperform English ones. It proposes a simplified approach that feeds linguistic context and entity information directly into a single multilingual transformer model. The method shows promising results on the QALD-9-Plus and QALD-10 datasets and is also evaluated on Chinese and Japanese, expanding the language diversity of those datasets.

Knowledge Graph Question Answering (KGQA) simplifies querying vast amounts of knowledge stored in a graph-based model using natural language. However, the research has largely concentrated on English, putting non-English speakers at a disadvantage. Meanwhile, existing multilingual KGQA systems face challenges in achieving performance comparable to English systems, highlighting the difficulty of generating SPARQL queries from diverse languages. In this research, we propose a simplified approach to enhance multilingual KGQA systems by incorporating linguistic context and entity information directly into the processing pipeline of a language model. Unlike existing methods that rely on separate encoders for integrating auxiliary information, our strategy leverages a single, pretrained multilingual transformer-based language model to manage both the primary input and the auxiliary data. Our methodology significantly improves the language model's ability to accurately convert a natural language query into a relevant SPARQL query. It demonstrates promising results on the most recent QALD datasets, namely QALD-9-Plus and QALD-10. Furthermore, we introduce and evaluate our approach on Chinese and Japanese, thereby expanding the language diversity of the existing datasets.
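The abstract describes passing the question, its linguistic context, and entity information to one model rather than to separate encoders. A minimal sketch of that idea follows, assuming the auxiliary data is flattened into the same input string as the question. The `<sep>` marker, the `LinkedEntity` type, the `build_model_input` helper, and the choice of part-of-speech tags as the linguistic context are all hypothetical illustrations, not the paper's actual input format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinkedEntity:
    surface: str  # mention text as it appears in the question
    kg_id: str    # knowledge-graph identifier, e.g. a Wikidata QID


# Hypothetical separator token; the paper's actual markers may differ.
SEP = "<sep>"


def build_model_input(question: str,
                      pos_tags: List[str],
                      entities: List[LinkedEntity]) -> str:
    """Flatten question, linguistic context, and entity links into one string.

    A single sequence-to-sequence model could then consume this string and
    be trained to emit the corresponding SPARQL query.
    """
    pos_part = " ".join(pos_tags)
    ent_part = " ".join(f"{e.surface}={e.kg_id}" for e in entities)
    return f"{question} {SEP} {pos_part} {SEP} {ent_part}"


# Illustrative German question with hand-written tags and one entity link.
example = build_model_input(
    "Wer ist der Bürgermeister von Berlin?",
    ["PRON", "AUX", "DET", "NOUN", "ADP", "PROPN", "PUNCT"],
    [LinkedEntity("Berlin", "wd:Q64")],
)
print(example)
```

Keeping everything in one string means the same tokenizer and encoder handle both the question and the auxiliary data, which is the simplification the abstract contrasts with separate-encoder designs.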

Code Implementations: 1 repo
