DB · CL · Apr 26, 2023

Towards Multi-Modal DBMSs for Seamless Querying of Texts and Tables

arXiv:2304.13559v2 · 8 citations · h-index: 7
Originality: Highly original
AI Analysis

The paper addresses a challenge faced by users of database systems: integrating textual data into database queries. It presents a novel method rather than an incremental improvement.

The paper tackles the problem of querying text and tables together by proposing Multi-Modal Databases (MMDBs) that use multi-modal operators based on large language models like GPT-3, resulting in a prototype that outperforms state-of-the-art approaches in accuracy and performance while requiring less training data.

In this paper, we propose Multi-Modal Databases (MMDBs), a new class of database systems that can seamlessly query text and tables using SQL. To enable seamless querying of textual data using SQL in an MMDB, we propose to extend relational databases with so-called multi-modal operators (MMOps), which build on recent advances in large language models such as GPT-3. The main idea of MMOps is that they allow text collections to be treated as tables without the need to manually transform the data. As we show in our evaluation, our MMDB prototype not only outperforms state-of-the-art approaches such as text-to-table in terms of accuracy and performance, but also requires significantly less training data to fine-tune the model for an unseen text collection.
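
To make the idea of treating a text collection as a table more concrete, the following is a minimal sketch and not the paper's implementation. The names `extract_attribute`, `multimodal_scan`, `patients`, and `report_view` are hypothetical, the sample data is invented for illustration, and a regular expression stands in for the language-model call so that the example stays self-contained.

```python
import re
import sqlite3
from typing import Iterable, Iterator, Optional, Sequence

# Hypothetical stand-in for a language-model call. An operator of the kind the
# abstract describes would query a model such as GPT-3; a regex is used here
# only so the sketch runs without external services.
PATTERNS = {
    "patient": r"Patient (\w+)",
    "diagnosis": r"diagnosed with ([\w ]+?)\.",
}


def extract_attribute(text: str, attribute: str) -> Optional[str]:
    """Return the value of one attribute found in a document, or None."""
    match = re.search(PATTERNS[attribute], text)
    return match.group(1) if match else None


def multimodal_scan(texts: Iterable[str], attributes: Sequence[str]) -> Iterator[tuple]:
    """Yield one row per document, with one column per requested attribute."""
    for text in texts:
        yield tuple(extract_attribute(text, a) for a in attributes)


# Invented sample documents (assumption, not data from the paper).
reports = [
    "Patient Alice was diagnosed with influenza.",
    "Patient Bob was diagnosed with a fracture.",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO patients VALUES (?, ?)", [("Alice", 34), ("Bob", 58)])

# Materialize the operator output so that SQL can treat the texts as a table.
conn.execute("CREATE TABLE report_view (patient TEXT, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO report_view VALUES (?, ?)",
    multimodal_scan(reports, ["patient", "diagnosis"]),
)

# Join the ordinary table with the table derived from text.
rows = conn.execute(
    "SELECT p.name, p.age, r.diagnosis "
    "FROM patients p JOIN report_view r ON p.name = r.patient "
    "ORDER BY p.name"
).fetchall()
print(rows)
```

The sketch materializes the extracted rows before querying them. Whether the prototype materializes results or evaluates operators inside the query plan is not stated in the abstract, so that choice is an assumption made here for simplicity.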

