ML LG Jan 23, 2019

Hamiltonian Monte-Carlo for Orthogonal Matrices

arXiv:1901.08045v11.2

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of efficient Bayesian inference for models with orthogonal matrix parameters, such as neural networks and low-rank factorizations, offering an incremental improvement over existing sampling techniques.

The paper tackles the problem of sampling from posterior distributions in Bayesian models with orthogonal matrix parameters, proposing a new Hamiltonian Monte-Carlo scheme that avoids exact geodesic computations. The method is shown to be comparable or faster in iteration time and more sample-efficient than conventional HMC and Geodesic Monte-Carlo in experiments.

We consider the problem of sampling from posterior distributions for Bayesian models in which some parameters are restricted to be orthogonal matrices. Such matrices are sometimes used in neural network models to regularize and stabilize training, and they can also parameterize matrices of bounded rank, positive-definite matrices, and others. \citet{byrne2013geodesic} have already considered sampling from distributions over manifolds using exact geodesic flows in a scheme similar to Hamiltonian Monte Carlo (HMC). We propose a new sampling scheme for the set of orthogonal matrices that is based on the same approach, uses ideas from Riemannian optimization, and does not require exact computation of geodesic flows. The method is theoretically justified by a proof of symplecticity for the proposed iteration. In experiments we show that the new scheme is comparable or faster in time per iteration and more sample-efficient compared to conventional HMC with explicit orthogonal parameterization and Geodesic Monte-Carlo. We also provide promising results of Bayesian ensembling for orthogonal neural networks and low-rank matrix factorization.
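To make the phrase "ideas from Riemannian optimization" more concrete, the following is a minimal NumPy sketch of two standard building blocks on the orthogonal group: projecting an ambient matrix onto the tangent space at Q, and mapping a tangent step back onto the manifold with a QR-based retraction instead of an exact geodesic. It is not the paper's sampling scheme, and whether the paper uses this particular retraction is an assumption; the function names `skew`, `project_to_tangent`, and `qr_retraction` are hypothetical and chosen only for illustration.

```python
import numpy as np


def skew(m):
    """Return the skew-symmetric part of a square matrix."""
    return 0.5 * (m - m.T)


def project_to_tangent(q, g):
    """Project an ambient matrix g onto the tangent space of O(n) at q.

    Tangent vectors at q have the form q @ A with A skew-symmetric.
    """
    return q @ skew(q.T @ g)


def qr_retraction(q, v):
    """Map a tangent step v at q back to an orthogonal matrix via QR.

    This is a common cheap substitute for following the exact geodesic
    (assumed here for illustration; not taken from the paper).
    """
    y, r = np.linalg.qr(q + v)
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    # Flip column signs so the implied R factor has a positive diagonal,
    # which makes the retraction well defined and continuous.
    return y * signs


rng = np.random.default_rng(0)
n = 4
q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal start
g = rng.standard_normal((n, n))                   # stand-in for a gradient

v = project_to_tangent(q, g)
q_new = qr_retraction(q, 0.1 * v)

# Diagnostics: q_new should be orthogonal, and v should be tangent at q.
orth_error = np.linalg.norm(q_new.T @ q_new - np.eye(n))
tangent_error = np.linalg.norm(q.T @ v + v.T @ q)
```

Both diagnostics should be at the level of floating-point round-off. A sampler built on such pieces would still need a momentum update and an accept/reject step, which this sketch deliberately omits.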
