Robustness Evaluation of Regression Tasks with Skewed Domain Preferences
This work addresses evaluation challenges in scientific domains such as weather and health, where predicting rare, extreme events is critical. Its contribution is incremental, refining existing robustness assessment techniques.
The paper tackles the evaluation of regression models in domains where prediction accuracy matters more for some target values than for others, with a particular focus on extreme values, and assesses model robustness under distributional uncertainty. It shows how the level of relevance assigned to target values affects experimental conclusions and demonstrates the practical utility of the proposed methods.
In natural phenomena, data distributions often deviate from normality. Cataclysms are a self-explanatory example: events that almost never occur and that, at the same time, lie many standard deviations away from the common outcome. In many scientific contexts it is exactly these tail events that researchers are most interested in anticipating, so that adequate measures can be taken to prevent or attenuate a major impact on society. Despite such efforts, we have yet to provide definitive answers to crucial issues in evaluating predictive solutions in domains such as weather, pollution, and health. In this paper, we deal with two encapsulated problems simultaneously. The first is assessing the performance of regression models when non-uniform preferences apply: not all values are equally relevant with respect to the accuracy of their prediction, and there is a particular interest in the most extreme values. The second is assessing the robustness of models under uncertainty about the actual underlying distribution of the values relevant to such problems. We show how different levels of relevance associated with target values may impact experimental conclusions, and demonstrate the practical utility of the proposed methods.
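To make the idea of non-uniform preferences concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes a hypothetical relevance function that grows with a target value's distance from the median, uses it to weight squared errors, and repeats the evaluation over several hypothetical `steepness` settings to see whether the ranking of two toy models is stable. The function names, the exponential form of the relevance score, and the data are all illustrative assumptions.

```python
import math
import statistics


def relevance(y, center, scale, steepness=1.0):
    """Hypothetical relevance score in [0, 1): larger for values far from the center."""
    z = abs(y - center) / scale
    return 1.0 - math.exp(-steepness * z)


def weighted_mse(y_true, y_pred, weights):
    """Mean squared error where each case contributes in proportion to its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred)) / total


def evaluate(y_true, y_pred, steepness_grid=(0.5, 1.0, 2.0)):
    """Return the uniform MSE plus one relevance-weighted MSE per steepness setting."""
    center = statistics.median(y_true)
    scale = statistics.pstdev(y_true) or 1.0  # guard against constant targets
    results = {"uniform": weighted_mse(y_true, y_pred, [1.0] * len(y_true))}
    for s in steepness_grid:
        weights = [relevance(y, center, scale, s) for y in y_true]
        results[f"steepness={s}"] = weighted_mse(y_true, y_pred, weights)
    return results


# Toy data (illustrative only): five common values and one extreme value.
y_true = [1.0, 1.2, 0.9, 1.1, 1.0, 9.0]
model_a = [1.0, 1.2, 0.9, 1.1, 1.0, 8.0]  # exact on common values, misses the extreme
model_b = [1.5, 1.7, 0.4, 1.6, 0.5, 9.0]  # noisier on common values, exact on the extreme

res_a = evaluate(y_true, model_a)
res_b = evaluate(y_true, model_b)
for key in res_a:
    print(f"{key:>14}: A={res_a[key]:.3f}  B={res_b[key]:.3f}")
```

On this toy data, model A has the lower error under uniform weighting, while model B has the lower error under every relevance-weighted setting, which illustrates how the choice of relevance can change an experimental conclusion. Sweeping the steepness parameter is only a crude stand-in for a robustness analysis; it shows the shape of the question, not a recommended procedure.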