LG CV GN

Jul 9, 2021

A Topological-Framework to Improve Analysis of Machine Learning Model Performance

Henry Kvinge, Colby Wight, Sarah Akers, Scott Howland, Woongjo Choi, Xiaolong Ma, Luke Gosink, Elizabeth Jurrus, Keerti Kappagantula, Tegan H. Emerson

arXiv:2107.04714v1

1.6

Originality: Incremental advance

AI Analysis

This work addresses the need for better model evaluation in real-world scenarios where failure on specific subpopulations is critical. Its contribution appears incremental, since it builds on existing topological concepts.

The paper tackles the problem of understanding machine learning model performance beyond summary statistics. It proposes a topological framework that treats a dataset as a space, enabling analysis at both the global level (the entire test set) and the local level (specific subpopulations).

As both machine learning models and the datasets on which they are evaluated have grown in size and complexity, the practice of using a few summary statistics to understand model performance has become increasingly problematic. This is particularly true in real-world scenarios where understanding model failure on certain subpopulations of the data is of critical importance. In this paper we propose a topological framework for evaluating machine learning models in which a dataset is treated as a "space" on which a model operates. This provides us with a principled way to organize information about model performance at both the global level (over the entire test set) and also the local level (on specific subpopulations). Finally, we describe a topological data structure, presheaves, which offer a convenient way to store and analyze model performance between different subpopulations.
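To make the presheaf idea concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a toy setting in which a subpopulation is a set of test-example indices, the data attached to a subpopulation is the per-example correctness of the model on it, and the restriction map to a smaller subpopulation simply drops the examples outside it. The names `section`, `restrict`, `accuracy`, and the example data are all hypothetical.

```python
from typing import Dict, FrozenSet

# Hypothetical representation: example index -> was the prediction correct?
Section = Dict[int, bool]


def section(correct: Section, subpop: FrozenSet[int]) -> Section:
    """Data the toy presheaf assigns to a subpopulation."""
    return {i: correct[i] for i in subpop}


def restrict(sec: Section, smaller: FrozenSet[int]) -> Section:
    """Restriction map: pass from a subpopulation to one contained in it."""
    if not smaller.issubset(sec.keys()):
        raise ValueError("target must be contained in the source subpopulation")
    return {i: sec[i] for i in smaller}


def accuracy(sec: Section) -> float:
    """Local summary statistic computed from a section."""
    if not sec:
        raise ValueError("accuracy is undefined on an empty subpopulation")
    return sum(sec.values()) / len(sec)


# Made-up correctness results for six test examples (illustration only).
correct = {0: True, 1: True, 2: False, 3: True, 4: False, 5: False}

whole = frozenset(range(6))          # global level: the entire test set
group_a = frozenset({0, 1, 2, 3})    # a subpopulation
group_a_sub = frozenset({2, 3})      # a smaller subpopulation inside group_a

global_sec = section(correct, whole)
a_sec = restrict(global_sec, group_a)

# Presheaf condition: restricting in two steps agrees with restricting directly.
two_step = restrict(a_sec, group_a_sub)
direct = restrict(global_sec, group_a_sub)
consistent = two_step == direct

global_acc = accuracy(global_sec)
a_acc = accuracy(a_sec)
sub_acc = accuracy(direct)
```

In this sketch the global accuracy and the accuracy on each subpopulation are computed from the same stored data, and the consistency check is what lets performance on nested subpopulations be compared without recomputing predictions.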
