ML LG · Sep 8, 2023

Probabilistic Safety Regions Via Finite Families of Scalable Classifiers

Alberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene, Maurizio Mongelli

arXiv:2309.04627v15.92 citations · h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses the need for theoretical foundations that provide probabilistic certification of classifiers, an incremental improvement in error control for data analysts and machine learning practitioners.

The paper tackles misclassification errors in supervised classification by introducing probabilistic safety regions, subsets of the input space in which the number of misclassified instances is probabilistically controlled. It demonstrates the approach on synthetic data and on a smart mobility application.

Supervised classification recognizes patterns in the data to separate classes of behaviours. Canonical solutions contain misclassification errors that are intrinsic to the numerical approximating nature of machine learning. The data analyst may minimize the classification error on a class at the expense of increasing the error of the other classes. The error control of such a design phase is often done in a heuristic manner. In this context, it is key to develop theoretical foundations capable of providing probabilistic certifications to the obtained classifiers. In this perspective, we introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled. The notion of scalable classifiers is then exploited to link the tuning of machine learning with error control. Several tests corroborate the approach. They are provided through synthetic data in order to highlight all the steps involved, as well as through a smart mobility application.
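The link between tuning a classifier and controlling its error can be illustrated with a small sketch. It is not the paper's algorithm: the scoring function `score`, the helper `tune_threshold`, the Gaussian calibration data, and the target level `epsilon` are all assumptions introduced here for illustration. It only shows how shifting a single scalar threshold shrinks a "safe" region until at most a fraction `epsilon` of unsafe calibration points fall inside it. The bound is empirical, on the calibration sample only, and carries no probabilistic guarantee.

```python
import random

def score(x):
    # Hypothetical scoring function (assumption): larger means "more likely unsafe".
    return x[0] + x[1]

def tune_threshold(unsafe_scores, epsilon):
    # Return a threshold tau such that at most floor(epsilon * n) of the
    # unsafe calibration scores lie strictly below tau. Requires epsilon < 1.
    ordered = sorted(unsafe_scores)
    k = int(epsilon * len(ordered))
    return ordered[k]

def in_safety_region(x, tau):
    # A point is declared safe when its score is strictly below the threshold.
    return score(x) < tau

random.seed(0)
# Synthetic calibration sample of unsafe points (assumption, illustration only).
unsafe_cal = [(random.gauss(1.0, 1.0), random.gauss(1.0, 1.0)) for _ in range(1000)]

epsilon = 0.05
tau = tune_threshold([score(x) for x in unsafe_cal], epsilon)

# Fraction of unsafe calibration points that end up inside the safe region.
leak = sum(in_safety_region(x, tau) for x in unsafe_cal) / len(unsafe_cal)
print(f"tau = {tau:.3f}, empirical leak = {leak:.3f}")
```

Lowering `epsilon` moves `tau` down and shrinks the region, trading coverage of safe points for fewer unsafe points admitted.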
