ML IR LG

Feb 19, 2014

Retrieval of Experiments by Efficient Estimation of Marginal Likelihood

Sohan Seth, John Shawe-Taylor, Samuel Kaski

arXiv:1402.4653v1

Originality: Incremental advance

AI Analysis

This work addresses the challenge of efficient experiment retrieval for researchers, but it is incremental as it builds on existing probabilistic modeling approaches.

The paper tackles the problem of retrieving relevant experiments by using probabilistic models to measure similarity based on measurements, rather than just annotations, and proposes strategies to select informative posterior samples to reduce computational load while maintaining performance.

We study the task of retrieving relevant experiments given a query experiment. By experiment, we mean a collection of measurements from a set of 'covariates' and the associated 'outcomes'. While similar experiments can be retrieved by comparing available 'annotations', this approach ignores the valuable information available in the measurements themselves. To incorporate this information in the retrieval task, we suggest employing a retrieval metric that utilizes probabilistic models learned from the measurements. We argue that such a metric is a sensible measure of similarity between two experiments since it permits inclusion of experiment-specific prior knowledge. However, accurate models are often not analytical, and one must resort to storing posterior samples, which demands considerable resources. Therefore, we study strategies to select informative posterior samples to reduce the computational load while maintaining the retrieval performance. We demonstrate the efficacy of our approach on simulated data, with simple linear regression as the models, and on real-world datasets.
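As a rough illustration of the idea in the abstract, the sketch below scores each stored experiment by a Monte Carlo estimate of the likelihood of the query measurements under posterior samples of that experiment's model, then ranks the experiments by that score. It is a minimal sketch under stated assumptions, not the paper's method: the conjugate Gaussian linear model, the fixed `noise_var` and `prior_var`, the function names, and the synthetic data are all hypothetical, and the samples are drawn at random rather than chosen by any of the selection strategies the paper studies.

```python
import numpy as np


def posterior_samples(X, y, n_samples, rng, noise_var=1.0, prior_var=10.0):
    """Draw weight samples from the Gaussian posterior of a linear model.

    Assumes y = X @ w + noise with known noise variance and a zero-mean
    isotropic Gaussian prior on w (illustrative choices, not from the paper).
    """
    d = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return rng.multivariate_normal(mean, cov, size=n_samples)


def log_marginal_estimate(Xq, yq, samples, noise_var=1.0):
    """Estimate log p(query outcomes | stored experiment) from posterior samples."""
    resid = yq[None, :] - samples @ Xq.T  # shape: (n_samples, n_query_points)
    log_lik = (-0.5 * np.sum(resid ** 2, axis=1) / noise_var
               - 0.5 * yq.size * np.log(2.0 * np.pi * noise_var))
    # Log-mean-exp over samples, shifted by the maximum for numerical stability.
    m = log_lik.max()
    return m + np.log(np.mean(np.exp(log_lik - m)))


def rank_experiments(Xq, yq, stored_samples):
    """Return experiment indices sorted from most to least relevant, plus scores."""
    scores = np.array([log_marginal_estimate(Xq, yq, s) for s in stored_samples])
    return np.argsort(-scores), scores


def demo(seed=0):
    """Two synthetic stored experiments; the query shares weights with the first."""
    rng = np.random.default_rng(seed)
    w_a, w_b = np.array([2.0, -1.0]), np.array([-2.0, 1.0])
    stored = []
    for w in (w_a, w_b):
        X = rng.normal(size=(50, 2))
        y = X @ w + 0.1 * rng.normal(size=50)
        stored.append(posterior_samples(X, y, n_samples=20, rng=rng))
    Xq = rng.normal(size=(40, 2))
    yq = Xq @ w_a + 0.1 * rng.normal(size=40)
    return rank_experiments(Xq, yq, stored)
```

Calling `demo()` returns the ranking and the per-experiment scores. The storage cost the abstract refers to shows up here as `n_samples`: each stored experiment keeps that many weight vectors, so reducing the number of retained samples directly reduces both memory and the cost of scoring a query.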
