CL AI

May 2, 2023

Sentiment Perception Adversarial Attacks on Neural Machine Translation Systems

arXiv:2305.01437v2

0.91 citations

Originality: Incremental advance

AI Analysis

This work addresses a security vulnerability in NMT systems that is particularly relevant to applications such as translated reviews, but the advance is incremental because it builds on prior work on adversarial attacks in NMT.

The paper studies adversarial attacks on Neural Machine Translation (NMT) systems that change the sentiment perception of the output sequence without altering the perception of the input sequence, and it demonstrates that small, imperceptible changes to the input can significantly alter the sentiment of the output.

With the advent of deep learning methods, Neural Machine Translation (NMT) systems have become increasingly powerful. However, deep-learning-based systems are susceptible to adversarial attacks, where imperceptible changes to the input can cause undesirable changes at the output of the system. To date, there has been little work investigating adversarial attacks on sequence-to-sequence systems, such as NMT models. Previous work in NMT has examined attacks with the aim of introducing target phrases in the output sequence. In this work, adversarial attacks for NMT systems are explored from an output perception perspective. Thus, the aim of an attack is to change the perception of the output sequence, without altering the perception of the input sequence. For example, an adversary may distort the sentiment of translated reviews to have an exaggerated positive sentiment. In practice, it is challenging to run extensive human perception experiments, so a proxy deep-learning classifier applied to the NMT output is used to measure perception changes. Experiments demonstrate that the sentiment perception of NMT systems' output sequences can be changed significantly with small, imperceptible changes to input sequences.
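The attack setting in the abstract can be illustrated with a minimal sketch. The abstract does not say how the input perturbations are found, so the greedy word-substitution search below is an assumption chosen for illustration, not the paper's method. `toy_translate`, `toy_sentiment`, both lexicons, and the candidate substitution table are hypothetical stand-ins for a real NMT model, a real proxy sentiment classifier, and a real set of perception-preserving substitutions.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical word-for-word "NMT system" (German -> English), for illustration only.
TOY_LEXICON: Dict[str, str] = {
    "der": "the", "film": "movie", "war": "was",
    "okay": "okay", "gut": "good", "toll": "great",
}

# Hypothetical per-word positivity scores used by the proxy classifier.
TOY_POSITIVITY: Dict[str, float] = {"okay": 0.1, "good": 0.6, "great": 0.9}


def toy_translate(tokens: List[str]) -> List[str]:
    """Stand-in for an NMT model: translate token by token via a lookup table."""
    return [TOY_LEXICON.get(tok, tok) for tok in tokens]


def toy_sentiment(tokens: List[str]) -> float:
    """Stand-in for the proxy classifier: mean positivity of the output tokens."""
    if not tokens:
        return 0.0
    return sum(TOY_POSITIVITY.get(tok, 0.0) for tok in tokens) / len(tokens)


def greedy_substitution_attack(
    source: List[str],
    candidates: Dict[str, List[str]],
    translate: Callable[[List[str]], List[str]],
    score: Callable[[List[str]], float],
    max_edits: int = 1,
) -> List[str]:
    """Greedily substitute up to `max_edits` source words so that the proxy
    sentiment score of the translated output increases the most.

    `candidates` maps a source word to substitutions that are assumed to
    leave the perception of the input unchanged (for example, near-synonyms).
    """
    current = list(source)  # work on a copy; the caller's list is untouched
    edited_positions = set()
    for _ in range(max_edits):
        best_score = score(translate(current))
        best_edit: Optional[Tuple[int, str]] = None
        for i, tok in enumerate(current):
            if i in edited_positions:
                continue  # change each position at most once
            for sub in candidates.get(tok, []):
                trial = current[:i] + [sub] + current[i + 1:]
                trial_score = score(translate(trial))
                if trial_score > best_score:
                    best_score, best_edit = trial_score, (i, sub)
        if best_edit is None:
            break  # no substitution improves the proxy score
        position, substitute = best_edit
        current[position] = substitute
        edited_positions.add(position)
    return current


source = ["der", "film", "war", "okay"]
# Assumption: these substitutions count as "imperceptible" on the input side.
candidates = {"okay": ["gut", "toll"]}
adversarial = greedy_substitution_attack(
    source, candidates, toy_translate, toy_sentiment, max_edits=1
)
print(adversarial, toy_translate(adversarial))
```

In a realistic setup, `translate` would wrap an actual NMT model and `score` an actual sentiment classifier applied to its output, matching the proxy-classifier measurement the abstract describes. The edit budget `max_edits` plays the role of the imperceptibility constraint on the input.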
