LG · ML · Jul 11, 2023

Random-Set Neural Networks (RS-NN)

Oxford
arXiv:2307.05772v5 · 17 citations · h-index: 29
Originality: Highly original
AI Analysis

This paper addresses the need for reliable uncertainty awareness in safety-critical domains. It offers a novel method that improves on existing techniques, though it builds on established concepts such as belief functions and random sets.

The paper tackles the problem of uncertainty estimation in safety-critical machine learning by proposing Random-Set Neural Networks (RS-NN), which predict belief functions instead of probability vectors to encode epistemic uncertainty from limited training data. The approach outperforms state-of-the-art Bayesian and ensemble methods in accuracy, uncertainty estimation, and out-of-distribution detection across multiple benchmarks, such as CIFAR-10 vs SVHN and ImageNet vs ImageNet-O, and scales effectively to large architectures like ViT-Base-16.

Machine learning is increasingly deployed in safety-critical domains where erroneous predictions may lead to potentially catastrophic consequences, highlighting the need for learning systems to be aware of how confident they are in their own predictions: in other words, 'to know when they do not know'. In this paper, we propose a novel Random-Set Neural Network (RS-NN) approach to classification which predicts belief functions (rather than classical probability vectors) over the class list using the mathematics of random sets, i.e., distributions over the collection of sets of classes. RS-NN encodes the 'epistemic' uncertainty induced by training sets that are insufficiently representative or limited in size via the size of the convex set of probability vectors associated with a predicted belief function. Our approach outperforms state-of-the-art Bayesian and Ensemble methods in terms of accuracy, uncertainty estimation and out-of-distribution (OoD) detection on multiple benchmarks (CIFAR-10 vs SVHN/Intel-Image, MNIST vs FMNIST/KMNIST, ImageNet vs ImageNet-O). RS-NN also scales up effectively to large-scale architectures (e.g. WideResNet-28-10, VGG16, Inception V3, EfficientNetB2 and ViT-Base-16), exhibits remarkable robustness to adversarial attacks and can provide statistical guarantees in a conformal learning setting.
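To make the abstract's vocabulary concrete, here is a minimal sketch of a belief function over a small class list. It is not the paper's implementation: the class names, the mass values, and the helper names (`belief`, `plausibility`, `interval_width`, `pignistic`) are all assumptions chosen for illustration. A random set assigns mass to sets of classes; belief and plausibility then bound the probability of each class, and the gap between them is used here as a simple stand-in for the size of the associated convex set of probability vectors.

```python
# Illustrative only: a hand-written mass assignment, not a network output.
classes = ["cat", "dog", "truck"]

# Mass over sets of classes (focal sets); the values sum to 1.
mass = {
    frozenset({"cat"}): 0.5,                  # evidence for "cat" alone
    frozenset({"cat", "dog"}): 0.3,           # cannot tell cat from dog
    frozenset({"cat", "dog", "truck"}): 0.2,  # total ignorance
}


def belief(mass, subset):
    """Bel(A): total mass of focal sets fully contained in A."""
    return sum(m for focal, m in mass.items() if focal <= subset)


def plausibility(mass, subset):
    """Pl(A): total mass of focal sets that intersect A."""
    return sum(m for focal, m in mass.items() if focal & subset)


def interval_width(mass, classes):
    """Per-class gap Pl({c}) - Bel({c}); a simple proxy for credal set size."""
    return {
        c: plausibility(mass, frozenset({c})) - belief(mass, frozenset({c}))
        for c in classes
    }


def pignistic(mass, classes):
    """Point estimate that splits each focal set's mass evenly over its members.

    This is a standard transform from belief-function theory, used here
    as an assumption for turning a belief function into a probability vector.
    """
    return {
        c: sum(m / len(focal) for focal, m in mass.items() if c in focal)
        for c in classes
    }


if __name__ == "__main__":
    for c in classes:
        single = frozenset({c})
        print(c, belief(mass, single), plausibility(mass, single))
    print(interval_width(mass, classes))
    print(pignistic(mass, classes))
```

In this toy example all intervals collapse to single points only if every focal set is a singleton, which is the classical probability-vector case; mass placed on larger sets widens the intervals and so signals more epistemic uncertainty.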

