Label Propagation for Learning with Label Proportions
This work addresses the cost of labeling individual instances in domains such as healthcare, where patients can often provide only high-level summaries. The contribution appears incremental, however, because it builds on existing LLP methods.
The paper tackles Learning with Label Proportions (LLP), in which instance-level labels must be recovered from aggregated bag-level data. It presents a graph-based algorithm that encourages local smoothness and exploits the global structure of the data while preserving the mass of each bag.
Learning with Label Proportions (LLP) is the problem of recovering the underlying true labels of a dataset when the data are presented in the form of bags, each annotated only with its label proportions. This paradigm is particularly suitable in contexts where providing individual labels is expensive and label aggregates are more easily obtained. In the healthcare domain, for example, it is a burden for patients to keep a detailed diary of their daily routines, but they are often amenable to providing higher-level summaries of daily behavior. We present a novel and efficient graph-based algorithm that encourages local smoothness and exploits the global structure of the data, while preserving the 'mass' of each bag.
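To make the idea of smoothing over a graph while preserving bag mass concrete, the following is a minimal illustrative sketch and not the algorithm of the paper, whose details are not given here. It assumes an RBF affinity graph, a standard propagation update, and a simple per-bag rescaling step. The function name `propagate_with_bag_mass` and the parameters `sigma`, `alpha`, and `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np

def propagate_with_bag_mass(X, bags, proportions, sigma=1.0, alpha=0.9, n_iter=50):
    """Illustrative sketch only (assumed scheme, not the paper's method).

    X:           (n, d) feature matrix
    bags:        (n,) integer bag id of each instance, in 0..n_bags-1
    proportions: (n_bags, n_classes) known label proportions per bag
    Returns an (n, n_classes) array of non-negative soft label scores.
    """
    X = np.asarray(X, dtype=float)
    bags = np.asarray(bags)
    proportions = np.asarray(proportions, dtype=float)

    # Assumed graph: dense RBF affinities with no self-loops.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Row-normalise so one step averages the neighbours' soft labels.
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)

    # Start every instance at its bag's proportions.
    F0 = proportions[bags]
    F = F0.copy()
    for _ in range(n_iter):
        # Local smoothness: pull each instance toward its neighbours.
        F = alpha * (P @ F) + (1.0 - alpha) * F0
        # Bag mass: rescale each class column inside each bag so that it
        # sums to (bag size) * (bag proportion for that class).
        for b in range(proportions.shape[0]):
            idx = bags == b
            target = idx.sum() * proportions[b]
            F[idx] *= target / np.maximum(F[idx].sum(axis=0), 1e-12)
    return F

# Toy usage: two clusters, two bags that mix them in different proportions.
X_demo = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0],
                   [0.1, 0.1], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]])
bags_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1])
props_demo = np.array([[0.75, 0.25], [0.25, 0.75]])
F_demo = propagate_with_bag_mass(X_demo, bags_demo, props_demo)
hard_labels = F_demo.argmax(axis=1)  # one possible way to read off labels
```

In this sketch the rescaling step guarantees that the class totals within each bag match the given proportions after every iteration, but individual rows are not renormalised to sum to one, so the scores are best read through their `argmax` rather than as calibrated probabilities.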