LG · Sep 20, 2021

Assisted Learning for Organizations with Limited Imbalanced Data

Cheng Chen, Jiaying Zhou, Jie Ding, Yi Zhou

arXiv:2109.09307v4 · 3.11 citations

Originality: Incremental advance

AI Analysis

This work addresses the data-sharing constraints faced by organizations that have computational resources but limited data; the contribution appears incremental because it builds on existing distributed learning concepts.

The paper tackles the problem of organizations with limited and imbalanced data by developing an assisted learning framework that allows them to purchase assistance from an external provider, achieving near-oracle performance with only occasional information sharing.

In the era of big data, many big organizations are integrating machine learning into their work pipelines to facilitate data analysis. However, the performance of their trained models is often restricted by limited and imbalanced data available to them. In this work, we develop an assisted learning framework for assisting organizations to improve their learning performance. The organizations have sufficient computation resources but are subject to stringent data-sharing and collaboration policies. Their limited imbalanced data often cause biased inference and sub-optimal decision-making. In assisted learning, an organizational learner purchases assistance service from an external service provider and aims to enhance its model performance within only a few assistance rounds. We develop effective stochastic training algorithms for both assisted deep learning and assisted reinforcement learning. Different from existing distributed algorithms that need to frequently transmit gradients or models, our framework allows the learner to only occasionally share information with the service provider, but still obtain a model that achieves near-oracle performance as if all the data were centralized.
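The occasional-exchange idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a linear regression model, plain mini-batch SGD, and a simple hand-off in which the learner and the provider take turns training the same parameters on their own private data, exchanging the model only once per direction per assistance round. Class imbalance is not modeled, and all names (`local_sgd`, `assisted_training`, `make_data`) and hyperparameters are hypothetical.

```python
import random


def predict(w, x):
    """Linear model: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def local_sgd(w, data, steps, lr, batch, rng):
    """Run mini-batch SGD on squared loss using only the caller's local data."""
    w = list(w)
    for _ in range(steps):
        sample = rng.sample(data, min(batch, len(data)))
        grad = [0.0] * len(w)
        for x, y in sample:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(sample)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def assisted_training(learner_data, provider_data, rounds=3, steps=20,
                      lr=0.05, batch=8, seed=0):
    """Alternate local training; raw data never leaves either party.

    Assumption: only the model parameters are exchanged, twice per round.
    """
    rng = random.Random(seed)
    w = [0.0] * len(learner_data[0][0])
    exchanges = 0
    for _ in range(rounds):
        w = local_sgd(w, learner_data, steps, lr, batch, rng)   # learner side
        exchanges += 1                                          # learner -> provider
        w = local_sgd(w, provider_data, steps, lr, batch, rng)  # provider side
        exchanges += 1                                          # provider -> learner
    return w, exchanges


def make_data(n, w_true, rng):
    """Synthetic regression data with small Gaussian label noise."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in w_true]
        data.append((x, predict(w_true, x) + rng.gauss(0, 0.01)))
    return data


data_rng = random.Random(1)
W_TRUE = [1.0, -2.0, 0.5]
learner_data = make_data(20, W_TRUE, data_rng)     # small local dataset
provider_data = make_data(200, W_TRUE, data_rng)   # larger provider dataset
w, exchanges = assisted_training(learner_data, provider_data)
print(exchanges, [round(v, 2) for v in w])
```

With three rounds, the model crosses the organizational boundary six times in total, whereas a scheme that transmits gradients at every step would communicate on each of the 120 SGD steps.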
