Peipei Jiang

CROct 19, 2021

Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information

Baolin Zheng, Peipei Jiang, Qian Wang et al.

Adversarial attacks against commercial black-box speech platforms, including cloud speech APIs and voice control devices, have received little attention until recent years. The current "black-box" attacks all heavily rely on the knowledge of prediction/confidence scores to craft effective adversarial examples, which can be intuitively defended by service providers without returning these messages. In this paper, we propose two novel adversarial attacks in more practical and rigorous scenarios. For commercial cloud speech APIs, we propose Occam, a decision-only black-box adversarial attack, where only final decisions are available to the adversary. In Occam, we formulate the decision-only AE generation as a discontinuous large-scale global optimization problem, and solve it by adaptively decomposing this complicated problem into a set of sub-problems and cooperatively optimizing each one. Our Occam is a one-size-fits-all approach, which achieves 100% success rates of attacks with an average SNR of 14.23dB, on a wide range of popular speech and speaker recognition APIs, including Google, Alibaba, Microsoft, Tencent, iFlytek, and Jingdong, outperforming the state-of-the-art black-box attacks. For commercial voice control devices, we propose NI-Occam, the first non-interactive physical adversarial attack, where the adversary does not need to query the oracle and has no access to its internal information and training data. We combine adversarial attacks with model inversion attacks, and thus generate the physically-effective audio AEs with high transferability without any interaction with target devices. Our experimental results show that NI-Occam can successfully fool Apple Siri, Microsoft Cortana, Google Assistant, iFlytek and Amazon Echo with an average SRoA of 52% and SNR of 9.65dB, shedding light on non-interactive physical attacks against voice control devices.

CRJun 15, 2021

Securing Face Liveness Detection Using Unforgeable Lip Motion Patterns

Man Zhou, Qian Wang, Qi Li et al.

Face authentication usually utilizes deep learning models to verify users with high recognition accuracy. However, face authentication systems are vulnerable to various attacks that cheat the models by manipulating the digital counterparts of human faces. So far, lots of liveness detection schemes have been developed to prevent such attacks. Unfortunately, the attacker can still bypass these schemes by constructing wide-ranging sophisticated attacks. We study the security of existing face authentication services (e.g., Microsoft, Amazon, and Face++) and typical liveness detection approaches. Particularly, we develop a new type of attack, i.e., the low-cost 3D projection attack that projects manipulated face videos on a 3D face model, which can easily evade these face authentication services and liveness detection approaches. To this end, we propose FaceLip, a novel liveness detection scheme for face authentication, which utilizes unforgeable lip motion patterns built upon well-designed acoustic signals to enable a strong security guarantee. The unique lip motion patterns for each user are unforgeable because FaceLip verifies the patterns by capturing and analyzing the acoustic signals that are dynamically generated according to random challenges, which ensures that our signals for liveness detection cannot be manipulated. Specially, we develop robust algorithms for FaceLip to eliminate the impact of noisy signals in the environment and thus can accurately infer the lip motions at larger distances. We prototype FaceLip on off-the-shelf smartphones and conduct extensive experiments under different settings. Our evaluation with 44 participants validates the effectiveness and robustness of FaceLip.

Peipei Jiang

2 Papers