LG | AI | HC | Dec 8, 2023

Conformal Prediction in Multi-User Settings: An Evaluation

arXiv:2312.05195v1 | 1 citation | h-index: 18 | User modeling and user-adapted interaction
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for reliable uncertainty quantification in multi-user systems, such as user-computer interaction and medical applications, but it is incremental as it applies an existing method to new scenarios.

The paper tackled the problem of inaccurate performance metrics in multi-user machine learning settings by evaluating the conformal prediction framework, which provides confidence guarantees on predictions, and found significant differences in conformal performance measures across various evaluation strategies.

Typically, machine learning models are trained and evaluated without making any distinction between users (e.g., using traditional hold-out and cross-validation). However, this produces inaccurate performance estimates in multi-user settings, that is, situations where the data were collected from multiple users with different characteristics (e.g., age, gender, height), which is very common in user-computer interaction and medical applications. For these scenarios, model evaluation strategies that provide better performance estimates have been proposed, such as mixed, user-independent, user-dependent, and user-adaptive models. Although those strategies are better suited for multi-user systems, they are typically assessed with performance metrics that capture the overall behavior of the models; they neither provide performance guarantees for individual predictions nor give feedback about the predictions' uncertainty. To overcome those limitations, in this work we evaluated the conformal prediction framework in several multi-user settings. Conformal prediction is a model-agnostic method that provides confidence guarantees on the predictions, thus increasing the trustworthiness and robustness of the models. We conducted extensive experiments using different evaluation strategies and found significant differences in terms of conformal performance measures. We also proposed several visualizations based on matrices, graphs, and charts that capture different aspects of the resulting prediction sets.
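To make the idea of prediction sets concrete, the following is a minimal sketch of split conformal classification evaluated in a user-independent fashion. It is not the authors' implementation: the nonconformity score (one minus the predicted probability of the true class), the function names, and the `user_independent_masks` helper are all assumptions chosen for illustration, and any classifier that outputs class probabilities could supply the inputs.

```python
import math
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Threshold on the score 1 - p(true class), computed on a calibration set.

    Assumed score function; other nonconformity scores are possible.
    """
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for this alpha: keep every label.
        return float("inf")
    return float(np.sort(scores)[k - 1])

def prediction_sets(test_probs, threshold):
    """Boolean matrix; entry (i, c) is True when class c is in the set of sample i."""
    return (1.0 - test_probs) <= threshold

def coverage_and_size(sets, labels):
    """Empirical coverage and mean prediction-set size."""
    covered = sets[np.arange(len(labels)), labels]
    return float(covered.mean()), float(sets.sum(axis=1).mean())

def user_independent_masks(user_ids, target_user):
    """Hypothetical helper: calibrate on all other users, test on the target user."""
    user_ids = np.asarray(user_ids)
    test_mask = user_ids == target_user
    return ~test_mask, test_mask
```

With a target miscoverage rate `alpha`, the sets contain the true label with probability at least `1 - alpha` on average, provided calibration and test data are exchangeable. That assumption can fail when the test user differs from the calibration users, which is one reason coverage and set size may vary across evaluation strategies.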

Code Implementations: 1 repo
