AI · Feb 23, 2023

A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries

Prabhakar Kudva, Rajesh Bordawekar, Apoorva Nitsure

arXiv:2302.12178v3 · 2.1 · h-index: 28

Originality: Incremental advance

AI Analysis

This work addresses the need for transparent insights into semantic query results for database users, though it is incremental as it builds on existing embedding-based query systems.

The paper tackles the problem of providing interpretability for semantic SQL queries in AI-powered databases by introducing a scalable, space-efficient in-database framework that uses a probabilistic sketch to store co-occurrence counts, achieving up to 8x space savings while maintaining interpretability quality.

AI-Powered database (AI-DB) is a novel relational database system that uses a self-supervised neural network, database embedding, to enable semantic SQL queries on relational tables. In this paper, we describe an architecture and implementation of in-database interpretability infrastructure designed to provide simple, transparent, and relatable insights into ranked results of semantic SQL queries supported by AI-DB. We introduce a new co-occurrence based interpretability approach to capture relationships between relational entities and describe a space-efficient probabilistic Sketch implementation to store and process co-occurrence counts. Our approach provides both query-agnostic (global) and query-specific (local) interpretabilities. Experimental evaluation demonstrates that our in-database probabilistic approach provides the same interpretability quality as the precise space-inefficient approach, while providing scalable and space-efficient runtime behavior (up to 8X space savings), without any user intervention.
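
To make the idea of storing co-occurrence counts in a probabilistic sketch concrete, here is a minimal illustration in Python. It assumes a Count-Min sketch, which is one common choice; the abstract does not say which sketch the authors use. The class `CountMinSketch`, the helpers `pair_key` and `build_cooccurrence`, the `column:value` token format, and the toy rows are all hypothetical and are not taken from the paper or from AI-DB.

```python
import hashlib
from itertools import combinations


class CountMinSketch:
    """Fixed-size approximate counter (assumed sketch type, not the paper's)."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        # Memory is depth * width counters, regardless of how many pairs are seen.
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One independent hash per row, derived by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            key.encode("utf-8"), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key):
        # Hash collisions can only inflate a counter, so the minimum across
        # rows is the tightest estimate and never undercounts.
        return min(
            self.rows[row][self._index(key, row)] for row in range(self.depth)
        )


def pair_key(a, b):
    """Order-independent key for an unordered pair of entities."""
    return "\x1f".join(sorted((a, b)))


def build_cooccurrence(table_rows, sketch):
    """Count every pair of distinct entities that appear in the same row."""
    for row in table_rows:
        for a, b in combinations(sorted(set(row)), 2):
            sketch.add(pair_key(a, b))


# Hypothetical toy table: each row is a list of column-qualified values.
rows = [
    ["customer:alice", "item:bananas", "store:north"],
    ["customer:alice", "item:bananas", "store:south"],
    ["customer:bob", "item:apples", "store:north"],
]

sketch = CountMinSketch()
build_cooccurrence(rows, sketch)
print(sketch.estimate(pair_key("customer:alice", "item:bananas")))
```

The trade-off in this sketch is that an exact pair-count table grows with the number of distinct entity pairs, while the sketch keeps a fixed number of counters and accepts occasional overestimates caused by hash collisions. Whether that matches how the paper reaches its reported space savings is not stated in the abstract.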
