Assessing AI Utility: The Random Guesser Test for Sequential Decision-Making Systems
The random guesser test offers a tool for evaluating the utility of AI actions in sequential decision-making and may help diagnose problems in systems such as modern recommenders. The contribution is incremental, since it builds on existing evaluation principles.
The authors address the problem of assessing the risk and vulnerability of AI systems to biased decisions. They propose a 'random guesser test', which requires an AI algorithm to outperform random guessing, and find that in a roulette game scenario sophisticated AI-based approaches often underperform the random guesser by a significant margin.
We propose a general approach to quantitatively assessing the risk and vulnerability of artificial intelligence (AI) systems to biased decisions. The guiding principle of the proposed approach is that any AI algorithm must outperform a random guesser. This may appear trivial, but empirical results from a simplistic sequential decision-making scenario involving roulette games show that sophisticated AI-based approaches often underperform the random guesser by a significant margin. We highlight that modern recommender systems may exhibit a similar tendency to favor overly low-risk options. We argue that this "random guesser test" can serve as a useful tool for evaluating the utility of AI actions, and also points towards increasing exploration as a potential improvement to such systems.
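The guiding principle can be illustrated with a minimal Python sketch. It is not the authors' experimental setup. The bet menu, the starting bankroll of 10 units, the horizon of 10 spins, the target of 20 units, and all function names are assumptions chosen for illustration. Utility is assumed to be the fraction of episodes in which the bankroll reaches the target, and the "low-risk" policy stands in for a system that always favors the safest option.

```python
import random

# Hypothetical bet menu for a European wheel (37 pockets):
# name -> (win probability, net payout per unit staked)
BETS = {
    "even_money": (18 / 37, 1),
    "dozen": (12 / 37, 2),
    "straight_up": (1 / 37, 35),
}


def low_risk_policy(rng):
    """Stand-in for a risk-averse system: always picks the safest bet."""
    return "even_money"


def random_guesser(rng):
    """Baseline: picks a bet uniformly at random from the menu."""
    return rng.choice(list(BETS))


def play_episode(policy, rng, bankroll=10, spins=10, target=20):
    """Stake one unit per spin; succeed if the final bankroll reaches the target."""
    for _ in range(spins):
        win_prob, payout = BETS[policy(rng)]
        bankroll += payout if rng.random() < win_prob else -1
    return bankroll >= target


def success_rate(policy, episodes=20000, seed=0):
    """Monte Carlo estimate of how often the policy reaches the target."""
    rng = random.Random(seed)
    wins = sum(play_episode(policy, rng) for _ in range(episodes))
    return wins / episodes


def passes_random_guesser_test(policy, **kwargs):
    """A policy passes only if it strictly beats the random guesser."""
    return success_rate(policy, **kwargs) > success_rate(random_guesser, **kwargs)


if __name__ == "__main__":
    print("low-risk policy:", success_rate(low_risk_policy))
    print("random guesser: ", success_rate(random_guesser))
    print("low-risk passes:", passes_random_guesser_test(low_risk_policy))
```

Under these assumptions every bet has the same expected loss per unit staked, so the two policies differ only in variance. The low-risk policy must win all ten even-money bets to double its bankroll, while the random guesser occasionally places a high-payout bet that reaches the target in one spin. This is one way an overly cautious policy can fail the test, and it is consistent with the abstract's suggestion that more exploration may help.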