LG HC · Sep 8, 2021

Learn2Agree: Fitting with Multiple Annotators without Objective Ground Truth

Chongyang Wang, Yuan Gao, Chenyou Fan, Junjie Hu, Tin Lun Lam, Nicholas D. Lane, Nadia Bianchi-Berthouze

arXiv:2109.03596v43.11 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of developing reliable models in domains such as medical diagnosis, where expert annotations vary. The advance is incremental because the method builds on existing backbones.

The paper tackles the problem of learning from multiple annotators when objective ground truth is ambiguous, as in medical applications such as chronic disease rehabilitation. It proposes the Learn2Agree framework, which integrates agreement learning to regularize classifier decisions, and reports improved agreement levels with annotators on two medical datasets.

The annotation of domain experts is important for some medical applications where the objective ground truth is difficult to define, e.g., the rehabilitation for some chronic diseases and the prescreening of some musculoskeletal abnormalities without further medical examinations. However, improper use of the annotations may hinder the development of reliable models. On one hand, forcing the use of a single ground truth generated from multiple annotations is less informative for modeling. On the other hand, feeding the model all the annotations without proper regularization introduces noise, given the existing disagreements. To address these issues, we propose a novel Learning to Agreement (Learn2Agree) framework to tackle the challenge of learning from multiple annotators without objective ground truth. The framework has two streams: one fits the multiple annotators, and the other learns agreement information between annotators. In particular, the agreement learning stream provides regularization information to the classifier stream, tuning its decisions to be better in line with the agreement between annotators. The proposed method can be easily added to existing backbones, and experiments on two medical datasets show better agreement levels with annotators.
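The two-stream idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the loss or the architecture, so every name here (`pairwise_agreement`, `combined_loss`, `reg_weight`), the use of the positive-vote fraction as the agreement target, and the squared-error regularizer are assumptions made purely for illustration, for a single sample with binary labels.

```python
import math


def pairwise_agreement(labels):
    """Fraction of annotator pairs that give the same label to one sample."""
    n = len(labels)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 1.0  # a single annotator trivially agrees with itself
    return sum(labels[i] == labels[j] for i, j in pairs) / len(pairs)


def binary_cross_entropy(prob, label, eps=1e-7):
    """Cross-entropy between a predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def combined_loss(classifier_prob, agreement_pred, labels, reg_weight=0.5):
    """Hypothetical per-sample loss combining the two streams.

    classifier_prob: classifier-stream output, P(label = 1).
    agreement_pred:  agreement-stream output, an estimate of the
                     fraction of annotators voting 1 (assumed target).
    labels:          the 0/1 labels from each annotator.
    """
    # Stream 1: fit every annotator's label rather than one merged label.
    fit = sum(binary_cross_entropy(classifier_prob, y) for y in labels) / len(labels)
    # Stream 2: train the agreement estimate toward the observed vote fraction.
    vote_fraction = sum(labels) / len(labels)
    agree = (agreement_pred - vote_fraction) ** 2
    # Regularization: pull the classifier toward the agreement estimate.
    reg = (classifier_prob - agreement_pred) ** 2
    return fit + agree + reg_weight * reg


labels = [1, 1, 0]  # three annotators, one dissenting
aligned = combined_loss(2 / 3, 2 / 3, labels)
misaligned = combined_loss(0.1, 2 / 3, labels)
```

With these made-up inputs, `aligned` comes out smaller than `misaligned`, because a classifier output near the annotators' vote fraction incurs both a lower fitting term and no regularization penalty. In a real model both streams would be networks trained by gradient descent over a dataset; the sketch only shows how the terms could be combined.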
