CY AI LG · Mar 10, 2021

Designing Disaggregated Evaluations of AI Systems: Choices, Considerations, and Tradeoffs

Solon Barocas, Anhong Guo, Ece Kamar, Jacquelyn Krones, Meredith Ringel Morris, Jennifer Wortman Vaughan, Duncan Wadsworth, Hanna Wallach

arXiv:2103.06076v2 · 20.093 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of making AI evaluations fair and interpretable for researchers and practitioners. Its contribution is incremental: it builds on existing concepts rather than introducing new methods.

The paper tackles the design challenges of disaggregated evaluations for AI systems, where performance is reported separately for different groups, by analyzing how various design choices affect results and impacts, and it argues for better documentation to aid interpretation.

Disaggregated evaluations of AI systems, in which system performance is assessed and reported separately for different groups of people, are conceptually simple. However, their design involves a variety of choices. Some of these choices influence the results that will be obtained, and thus the conclusions that can be drawn; others influence the impacts -- both beneficial and harmful -- that a disaggregated evaluation will have on people, including the people whose data is used to conduct the evaluation. We argue that a deeper understanding of these choices will enable researchers and practitioners to design careful and conclusive disaggregated evaluations. We also argue that better documentation of these choices, along with the underlying considerations and tradeoffs that have been made, will help others when interpreting an evaluation's results and conclusions.
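
To make the basic idea concrete, here is a minimal sketch of a disaggregated evaluation in Python. It is not taken from the paper. It assumes the simplest possible setup: accuracy as the metric and a single group label per example. The function name `disaggregated_accuracy` and the toy data are hypothetical. The choices the abstract refers to (which metric, which groups, how much data per group) are exactly the ones this sketch hard-codes.

```python
from collections import defaultdict


def disaggregated_accuracy(y_true, y_pred, groups):
    """Return (overall accuracy, accuracy per group, example count per group).

    Illustrative only: accuracy and one group label per example are
    assumptions of this sketch, not choices prescribed by the paper.
    """
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("y_true, y_pred, and groups must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    correct = defaultdict(int)  # correct predictions per group
    total = defaultdict(int)    # examples per group

    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)

    per_group = {g: correct[g] / total[g] for g in total}
    overall = sum(correct.values()) / len(y_true)
    return overall, per_group, dict(total)


# Toy data (hypothetical): labels, predictions, and group membership.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall, per_group, counts = disaggregated_accuracy(y_true, y_pred, groups)
print(f"overall accuracy: {overall:.3f}")
for g in sorted(per_group):
    print(f"group {g}: accuracy {per_group[g]:.3f} on {counts[g]} examples")
```

Returning the per-group counts alongside the scores is a deliberate choice in this sketch: a gap between groups measured on very few examples may not support a firm conclusion, which is one of the considerations the abstract says should be documented.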
