ML · LG · May 7, 2020

Relevance Vector Machine with Weakly Informative Hyperprior and Extended Predictive Information Criterion

arXiv:2005.03419v1
AI Analysis

This work addresses model selection and overfitting issues in kernel-based regression for non-homogeneous data, representing an incremental improvement in Bayesian machine learning methods.

The authors tackled the problem of overfitting in multiple kernel relevance vector regression by proposing a weakly informative inverse gamma hyperprior and an extended predictive information criterion, achieving improved predictive accuracy on non-homogeneous data.

In the variational relevance vector machine, the gamma distribution is the typical hyperprior over the noise precision of the automatic relevance determination prior. Instead of the gamma hyperprior, we propose an inverse gamma hyperprior with a shape parameter close to zero and a scale parameter that is not necessarily close to zero. This hyperprior is associated with the concept of a weakly informative prior. We investigate its effect through regression on non-homogeneous data. Because it is difficult to capture the structure of such data with a single kernel function, we apply the multiple kernel method, in which multiple kernel functions with different widths are arranged over the input data. We confirm that the degrees of freedom of the model are controlled by adjusting the scale parameter while keeping the shape parameter close to zero. A candidate for selecting the scale parameter is the predictive information criterion. However, the model estimated with this criterion appears to overfit, because the multiple kernel method puts the model in a situation where its dimension is larger than the data size. To select an appropriate scale parameter even in such a situation, we also propose an extended predictive information criterion. We confirm that a multiple kernel relevance vector regression model with good predictive accuracy can be obtained by selecting the scale parameter that minimizes the extended predictive information criterion.
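To illustrate why the multiple kernel method pushes the model dimension above the data size, the following minimal sketch builds a design matrix by placing kernels of several widths at every input point. The Gaussian kernel form, the bias column, the specific widths, and the helper name `multi_kernel_design` are assumptions made for illustration; the abstract states only that kernel functions with different widths are arranged over the input data.

```python
import numpy as np

def multi_kernel_design(x, widths):
    """Stack one kernel matrix per width side by side (hypothetical helper).

    Assumes 1-D inputs and Gaussian kernels centred at every input point.
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    sq_dist = (x - x.T) ** 2                      # pairwise squared distances, shape (n, n)
    blocks = [np.exp(-sq_dist / (2.0 * h ** 2))   # one (n, n) Gaussian kernel block per width
              for h in widths]
    bias = np.ones((x.shape[0], 1))               # assumed intercept column
    return np.hstack([bias] + blocks)

x = np.linspace(0.0, 1.0, 50)    # 50 input points
widths = [0.01, 0.05, 0.25]      # illustrative kernel widths, not taken from the paper
Phi = multi_kernel_design(x, widths)
n, m = Phi.shape
print(n, m)  # 50 151
```

With n data points and K widths, this construction yields 1 + K·n basis functions, so the number of weights exceeds the number of observations as soon as K ≥ 1. That is the regime in which, according to the abstract, the predictive information criterion tends to select an overfitted model.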
