LG AI SP | Nov 19, 2024

Finding One's Bearings in the Hyperparameter Landscape of a Wide-Kernel Convolutional Fault Detector

Dan Hudson, Jurgen van den Hoogen, Martin Atzmueller

arXiv:2411.15191v22.6 | h-index: 5 | IEEE Access

Originality: Incremental advance

AI Analysis

This work addresses hyperparameter tuning for domain-specific fault detection. It is an incremental advance: guidance already exists for generic hyperparameters, and the paper extends it to architecture-specific ones.

The paper tackles hyperparameter sensitivity in wide-kernel convolutional neural networks for bearing fault detection. Using seven benchmark datasets and manipulated copies of one of them, it shows that the kernel size in the first layer is particularly sensitive to changes in the data.

State-of-the-art algorithms are reported to be almost perfect at distinguishing the vibrations arising from healthy and damaged machine bearings, according to benchmark datasets at least. However, what about their application to new data? In this paper, we confirm that neural networks for bearing fault detection can be crippled by incorrect hyperparameterisation, and also that the correct hyperparameter settings can change when transitioning to new data. The paper combines multiple methods to explain the behaviour of the hyperparameters of a wide-kernel convolutional neural network and how to set them. Since guidance already exists for generic hyperparameters like minibatch size, we focus on how to set architecture-specific hyperparameters such as the width of the convolutional kernels, a topic which might otherwise be obscure. We reflect different data properties by fusing information from seven different benchmark datasets, and our results show that the kernel size in the first layer in particular is sensitive to changes in the data. Looking deeper, we use manipulated copies of one dataset in an attempt to spot why the kernel size sometimes needs to change. The relevance of sampling rate is studied by using different levels of resampling, and spectral content is studied by increasingly filtering out high frequencies. We find that, contrary to speculation in earlier work, high-frequency noise is not the main reason why a wide kernel is preferable to a narrow kernel. Finally, we conclude by stating clear guidance on how to set the hyperparameters of our neural network architecture to work effectively on new data.
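The two data manipulations named in the abstract, resampling and filtering out high frequencies, can be illustrated with a minimal sketch. It is not the authors' pipeline: the moving-average filter is a crude stand-in for whatever filter the paper uses, and the 12 kHz sampling rate, the 50 Hz and 3 kHz test tones, the kernel sizes, and all function names are assumptions chosen for illustration.

```python
import math


def kernel_span_ms(kernel_size, sampling_rate_hz):
    """Duration in milliseconds covered by one kernel of `kernel_size` samples."""
    return 1000.0 * kernel_size / sampling_rate_hz


def moving_average_lowpass(signal, window):
    """Crude low-pass filter: trailing moving average with same-length output.

    The first `window - 1` outputs average over fewer samples.
    """
    out = []
    running = 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= window:
            running -= signal[i - window]  # drop the sample that left the window
        out.append(running / min(i + 1, window))
    return out


def downsample(signal, factor):
    """Smooth with a moving average of width `factor`, then keep every `factor`-th sample."""
    return moving_average_lowpass(signal, factor)[::factor]


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


# Hypothetical vibration-like signal: a 50 Hz component plus a 3 kHz component.
FS = 12_000  # assumed sampling rate in Hz
N = 1_200    # 0.1 s of signal
low = [math.sin(2 * math.pi * 50 * n / FS) for n in range(N)]
raw = [l + 0.5 * math.sin(2 * math.pi * 3_000 * n / FS) for n, l in enumerate(low)]

filtered = moving_average_lowpass(raw, 4)  # attenuates the 3 kHz component
reduced = downsample(raw, 4)               # same signal at an effective 3 kHz rate
```

In this sketch, a 64-sample kernel at 12 kHz and a 16-sample kernel at 3 kHz cover the same stretch of time according to `kernel_span_ms`. That is plain arithmetic about sampling rates, not a result reported in the paper.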
