LG | SE | SY | Jun 2, 2023

Exploring Robustness of Image Recognition Models on Hardware Accelerators

arXiv:2306.01697v6 | 2 citations | h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses the reliability of AI in safety-critical applications by testing compilers and hardware accelerators, though it is incremental as it builds on existing testing methods.

The authors address the problem of verifying the robustness of image recognition models on hardware accelerators by developing MutateNN, a tool that combines differential and mutation testing. They found discrepancies of up to 90.3% in model outputs across devices for mutants related to conditional operators. Other mutations degraded model performance by up to 99.8% or caused crashes, consistently across devices.

As the use of Artificial Intelligence (AI) in resource-intensive and safety-critical tasks increases, a variety of Machine Learning (ML) compilers have been developed, enabling Deep Neural Networks (DNNs) to run on a wide range of hardware acceleration devices. However, given that DNNs are widely used for challenging and demanding tasks, the behavior of these compilers must be verified. To this end, we propose MutateNN, a tool that combines elements of differential and mutation testing to examine the robustness of image recognition models deployed on hardware accelerators with different capabilities, in the presence of faults in their target device code, introduced either by developers or by problems in the compilation process. We focus on the image recognition domain, applying mutation testing to 7 well-established DNN models and introducing 21 mutations from 6 different categories. We deployed our mutants on 4 hardware acceleration devices of varying capabilities and observed that DNN models presented discrepancies of up to 90.3% across devices in mutants related to conditional operators. We also observed that mutations related to layer modification, arithmetic types, and input severely affected overall model performance (up to 99.8%) or led to model crashes, consistently across devices.
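
To make the idea of a conditional-operator mutation and a discrepancy percentage concrete, the following is a minimal sketch in plain Python. It is not MutateNN's implementation: the function names (`relu`, `relu_mutant`, `predict`, `discrepancy_rate`) and the toy logits are hypothetical, chosen only to illustrate how the top-1 predictions of an original and a mutated model can be compared over the same inputs.

```python
from typing import Callable, List, Sequence


def relu(x: float) -> float:
    """Reference activation: keep positive values, zero out the rest."""
    return x if x > 0 else 0.0


def relu_mutant(x: float) -> float:
    """Hypothetical conditional-operator mutant: '>' flipped to '<'."""
    return x if x < 0 else 0.0


def top1(scores: Sequence[float]) -> int:
    """Index of the highest score; the first index wins ties."""
    return max(range(len(scores)), key=lambda i: scores[i])


def predict(batch: Sequence[Sequence[float]],
            activation: Callable[[float], float]) -> List[int]:
    """Apply the activation to each logit, then take the top-1 class."""
    return [top1([activation(v) for v in logits]) for logits in batch]


def discrepancy_rate(preds_a: Sequence[int], preds_b: Sequence[int]) -> float:
    """Percentage of inputs on which two prediction lists disagree."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        return 0.0
    differing = sum(a != b for a, b in zip(preds_a, preds_b))
    return 100.0 * differing / len(preds_a)


# Toy pre-activation logits for four inputs (illustrative values only).
batch = [[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [0.5, 3.0]]

original_preds = predict(batch, relu)        # [0, 1, 0, 1]
mutant_preds = predict(batch, relu_mutant)   # [0, 0, 0, 0]
print(discrepancy_rate(original_preds, mutant_preds))  # 50.0
```

In a differential-testing setting, the same comparison would be made between the outputs of one model executed on two different devices rather than between an original and a mutant on one machine; the discrepancy computation itself is unchanged.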

Code Implementations: 1 repo
