LG ML · Oct 20, 2019

Learning from both experts and data

Rémi Besson, Erwan Le Pennec, Stéphanie Allassonnière

arXiv:1910.09043v2 · 1 citation

Originality: Incremental advance

AI Analysis

This addresses data scarcity in applications that require probability estimation, offering an incremental improvement over existing hybrid approaches.

The paper addresses the problem of inferring a discrete probability distribution when data is scarce by combining expert knowledge with empirical data. It proposes a method that gives an objective way to weight expert input against data, and shows, both empirically and theoretically, that the resulting estimator is always more efficient than the better of the two models (expert alone or data alone), up to a constant.

In this work we study the problem of inferring a discrete probability distribution using both expert knowledge and empirical data. This is an important issue for many applications where the scarcity of data prevents a purely empirical approach. In this context, it is common to rely first on a priori domain knowledge before proceeding to online data acquisition. We are particularly interested in the intermediate regime, where we do not have enough data to do without the experts' initial a priori, but enough to correct it if necessary. We present here a novel way to tackle this issue, with a method providing an objective way to choose the weight to be given to experts compared to data. We show, both empirically and theoretically, that our proposed estimator is always more efficient than the best of the two models (expert or data) within a constant.
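To make the idea of "weighting experts against data" concrete, here is a minimal sketch of one standard way to blend an expert distribution with observed counts: the posterior mean under a Dirichlet prior centered on the expert's guess. This is not the estimator proposed in the paper. The function name `blend_expert_and_data` and the parameter `prior_strength` are hypothetical, and `prior_strength` is set by hand here, whereas the paper's contribution is an objective way of choosing that weight.

```python
def blend_expert_and_data(expert_probs, counts, prior_strength):
    """Blend an expert distribution with observed counts.

    Illustrative only (an assumption, not the paper's method): this is the
    posterior mean of a Dirichlet prior with parameters
    prior_strength * expert_probs after observing `counts`.
    The expert receives weight prior_strength / (n + prior_strength),
    where n is the total number of observations.
    """
    if len(expert_probs) != len(counts):
        raise ValueError("expert_probs and counts must have the same length")
    if prior_strength < 0:
        raise ValueError("prior_strength must be non-negative")
    n = sum(counts)
    total = n + prior_strength
    if total == 0:
        raise ValueError("need at least one observation or a positive prior_strength")
    return [(c + prior_strength * p) / total for p, c in zip(expert_probs, counts)]


# Hypothetical example: three outcomes, ten observations.
expert = [0.5, 0.3, 0.2]   # expert's a priori distribution
counts = [1, 6, 3]         # observed counts per outcome
estimate = blend_expert_and_data(expert, counts, prior_strength=10.0)
expert_weight = 10.0 / (sum(counts) + 10.0)
print(estimate, expert_weight)
```

With few observations the estimate stays close to the expert's distribution, and as the counts grow it moves toward the empirical frequencies, which matches the intermediate regime the abstract describes.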

