Translate your gibberish: black-box adversarial attack on machine translation systems
The work exposes a security flaw in widely used translation tools that could degrade user experience and comprehension, though the contribution is incremental because it builds on existing adversarial attack research.
The authors study the vulnerability of state-of-the-art machine translation systems to adversarial attacks, showing that their black-box, gradient-free optimizer can cause tools such as Google, DeepL, and Yandex to produce wrong or offensive translations, or to refuse to translate benign inputs.
Neural networks are widely deployed in natural language processing tasks at industrial scale, perhaps most often as components of automatic machine translation systems. In this work, we present a simple approach to fool state-of-the-art machine translation tools in the task of translation from Russian to English and vice versa. Using a novel black-box gradient-free tensor-based optimizer, we show that many online translation tools, such as Google, DeepL, and Yandex, may both produce wrong or offensive translations for nonsensical adversarial input queries and refuse to translate seemingly benign input phrases. This vulnerability may interfere with understanding a new language and simply worsen the user's experience of machine translation systems; hence, additional improvements to these tools are required to provide better translations.
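
To illustrate the black-box, gradient-free setting in general terms, here is a minimal sketch of a query-only search over token sequences. It is not the authors' tensor-based optimizer, whose details are not given here. Instead it uses a simple cross-entropy-style search with one independent distribution per token position. The vocabulary, the `toy_translate` stand-in, the `TARGET` string, and the `score` function are all hypothetical. A real attack would query an online translation service and would need its own scoring of the returned text.

```python
import random

# Hypothetical toy vocabulary and translation table (assumptions for this
# sketch); they stand in for a real translation service.
VOCAB = ["ab", "ra", "ka", "da", "zu", "mi", "lo", "te"]
TABLE = dict(zip(VOCAB, "heloxyzw"))
TARGET = "hello"


def toy_translate(tokens):
    """Stand-in for a black-box translator: only inputs and outputs are visible."""
    return "".join(TABLE[t] for t in tokens)


def score(output, target=TARGET):
    """Count positions where the output matches the target string."""
    return sum(a == b for a, b in zip(output, target))


def black_box_search(length=5, batch=50, elites=5, iters=40, lr=0.5, seed=0):
    """Cross-entropy-style search: sample queries, keep the best, re-weight."""
    rng = random.Random(seed)
    n = len(VOCAB)
    # One independent categorical distribution per token position.
    probs = [[1.0 / n] * n for _ in range(length)]
    best, best_score = None, -1
    for _ in range(iters):
        # Sample a batch of candidate inputs as lists of vocabulary indices.
        samples = [
            [rng.choices(range(n), weights=probs[p])[0] for p in range(length)]
            for _ in range(batch)
        ]
        # Query the black box and rank candidates by score (no gradients used).
        ranked = sorted(
            samples,
            key=lambda s: score(toy_translate([VOCAB[i] for i in s])),
            reverse=True,
        )
        top = ranked[:elites]
        top_tokens = [VOCAB[i] for i in top[0]]
        top_score = score(toy_translate(top_tokens))
        if top_score > best_score:
            best, best_score = top_tokens, top_score
        # Move each position's distribution toward the elite token frequencies.
        for p in range(length):
            for i in range(n):
                freq = sum(s[p] == i for s in top) / elites
                probs[p][i] = (1 - lr) * probs[p][i] + lr * freq
    return best, best_score
```

Calling `black_box_search()` returns the best token sequence found and its score. The search only ever calls `toy_translate` and reads its output, which is the defining property of a black-box attack: the optimizer needs no access to model weights or gradients, only a budget of queries.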