CL CR DB LG SE

Nov 28, 2022

On the Security Vulnerabilities of Text-to-SQL Models

Xutan Peng, Yipeng Zhang, Jingfeng Yang, Mark Stevenson

Amazon

arXiv:2211.15363v41.411 citations

h-index: 47

Has Code

Originality: Incremental advance

AI Analysis

The paper highlights potential software security threats, such as data breaches and Denial of Service attacks, for users of NLP-based database interfaces. Its contribution is incremental in that it demonstrates known vulnerabilities in a new context.

The paper examines the security vulnerabilities of Text-to-SQL models. It shows that the Text-to-SQL modules within six commercial applications can be manipulated to produce malicious code, and that straightforward backdoor attacks on four open-source language models achieve a 100% success rate without affecting model performance.

Although it has been demonstrated that Natural Language Processing (NLP) algorithms are vulnerable to deliberate attacks, the question of whether such weaknesses can lead to software security threats is under-explored. To bridge this gap, we conducted vulnerability tests on Text-to-SQL systems that are commonly used to create natural language interfaces to databases. We showed that the Text-to-SQL modules within six commercial applications can be manipulated to produce malicious code, potentially leading to data breaches and Denial of Service attacks. This is the first demonstration that NLP models can be exploited as attack vectors in the wild. In addition, experiments using four open-source language models verified that straightforward backdoor attacks on Text-to-SQL systems achieve a 100% success rate without affecting their performance. The aim of this work is to draw the community's attention to potential software security issues associated with NLP algorithms and encourage exploration of methods to mitigate against them.
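To make the threat concrete, the following is a minimal sketch of one possible defensive measure: executing model-generated SQL on a connection that rejects anything other than reads. It is not taken from the paper, which does not prescribe this approach. SQLite stands in for the target database, and the names `read_only_authorizer`, `run_generated_sql`, and `make_demo_connection` are hypothetical.

```python
import sqlite3

# Assumption: only plain reads are allowed for model-generated SQL.
ALLOWED_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Authorizer callback: permit SELECT/column reads, deny everything else."""
    if action in ALLOWED_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def run_generated_sql(conn, generated_sql):
    """Run SQL produced by a (hypothetical) Text-to-SQL model.

    Returns the fetched rows, or None if the statement was rejected.
    """
    try:
        return conn.execute(generated_sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        # Raised for denied actions and for stacked (multi-statement) input.
        return None


def make_demo_connection():
    """Build an in-memory database, then lock it down to read-only access."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    conn.commit()
    # Install the authorizer only after the schema and data are in place.
    conn.set_authorizer(read_only_authorizer)
    return conn


if __name__ == "__main__":
    conn = make_demo_connection()
    print(run_generated_sql(conn, "SELECT name FROM users"))
    print(run_generated_sql(conn, "DROP TABLE users"))
```

A guard of this kind only blocks destructive statements. It does not stop a manipulated model from reading tables the user should not see, nor from issuing expensive read queries, so it addresses only part of the data-breach and Denial of Service risks described above.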
