MLApr 8, 2022
Free Energy Evaluation Using Marginalized Annealed Importance SamplingMuneki Yasuda, Chako Takahashi
The evaluation of the free energy of a stochastic model is considered a significant issue in various fields of physics and machine learning. However, the exact free energy evaluation is computationally infeasible because the free energy expression includes an intractable partition function. Annealed importance sampling (AIS) is a type of importance sampling based on the Markov chain Monte Carlo method that is similar to simulated annealing and can effectively approximate the free energy. This study proposes an AIS-based approach, which is referred to as marginalized AIS (mAIS). The statistical efficiency of mAIS is investigated in detail based on theoretical and numerical perspectives. Based on the investigation, it is proved that mAIS is more effective than AIS under a certain condition.
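As a rough illustration of the standard AIS estimator mentioned in this abstract (not the marginalized variant, whose details are not given here), the following sketch estimates $\ln Z$ of a small Ising-type model and compares it with brute-force enumeration. The energy function, the linear inverse-temperature schedule, the Gibbs transition, and all names (`energy`, `gibbs_sweep`, `ais_log_z`, `exact_log_z`) are assumptions made for this example; with unit temperature the free energy is $-\ln Z$.

```python
import itertools
import math
import random


def energy(x, J, h):
    """Ising-type energy E(x) = -sum_i h_i x_i - sum_{i<j} J_ij x_i x_j (assumed form)."""
    n = len(x)
    e = -sum(h[i] * x[i] for i in range(n))
    e -= sum(J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    return e


def gibbs_sweep(x, J, h, beta, rng):
    """One Gibbs sweep that leaves p_beta(x) proportional to exp(-beta * E(x)) invariant."""
    n = len(x)
    for i in range(n):
        field = h[i] + sum(J[i][j] * x[j] for j in range(n) if j != i)
        p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
        x[i] = 1 if rng.random() < p_up else -1


def ais_log_z(J, h, n_steps=50, n_chains=300, seed=0):
    """Estimate ln Z by annealing from the uniform distribution (beta=0) to the target (beta=1)."""
    rng = random.Random(seed)
    n = len(h)
    betas = [k / n_steps for k in range(n_steps + 1)]
    log_ws = []
    for _ in range(n_chains):
        x = [rng.choice((-1, 1)) for _ in range(n)]  # exact sample at beta = 0
        log_w = 0.0
        for k in range(1, n_steps + 1):
            # Importance-weight increment: ln f_k(x) - ln f_{k-1}(x)
            log_w += -(betas[k] - betas[k - 1]) * energy(x, J, h)
            gibbs_sweep(x, J, h, betas[k], rng)
        log_ws.append(log_w)
    # Log-mean-exp of the weights, computed stably
    m = max(log_ws)
    log_mean_w = m + math.log(sum(math.exp(lw - m) for lw in log_ws) / n_chains)
    return n * math.log(2.0) + log_mean_w  # ln Z_0 = n ln 2 for the uniform start


def exact_log_z(J, h):
    """Brute-force ln Z; feasible only for very small n."""
    n = len(h)
    return math.log(sum(math.exp(-energy(x, J, h))
                        for x in itertools.product((-1, 1), repeat=n)))
```

For a handful of spins the two functions can be compared directly; for larger systems only the AIS estimate remains tractable, which is the setting the abstract is concerned with.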
LGOct 27, 2022
Multi-layered Discriminative Restricted Boltzmann Machine with Untrained Probabilistic LayerYuri Kanno, Muneki Yasuda
An extreme learning machine (ELM) is a three-layered feed-forward neural network having untrained parameters, which are randomly determined before training. Inspired by the idea of ELM, a probabilistic untrained layer called a probabilistic-ELM (PELM) layer is proposed, and it is combined with a discriminative restricted Boltzmann machine (DRBM), which is a probabilistic three-layered neural network for solving classification problems. The proposed model is obtained by stacking DRBM on the PELM layer. The resultant model (i.e., multi-layered DRBM (MDRBM)) forms a probabilistic four-layered neural network. In MDRBM, the parameters in the PELM layer can be determined using a Gaussian-Bernoulli restricted Boltzmann machine. Owing to the PELM layer, MDRBM obtains a strong immunity against noise in inputs, which is one of the most important advantages of MDRBM. Numerical experiments using several benchmark datasets (MNIST, Fashion-MNIST, Urban Land Cover, and CIFAR-10) demonstrate that MDRBM is superior to other existing models, particularly in terms of the noise-robustness property (or, in other words, the generalization property).
COApr 7, 2022
Composite Spatial Monte Carlo Integration Based on Generalized Least SquaresKaiji Sekimoto, Muneki Yasuda
Although evaluation of the expectations on the Ising model is essential in various applications, it is mostly infeasible because of intractable multiple summations. Spatial Monte Carlo integration (SMCI) is a sampling-based approximation. It can provide high-accuracy estimations for such intractable expectations. To evaluate the expectation of a function of variables in a specific region (called the target region), SMCI considers a larger region containing the target region (called the sum region). In SMCI, the multiple summation for the variables in the sum region is precisely executed, and that in the outer region is evaluated by a sampling approximation such as the standard Monte Carlo integration. It is guaranteed that the accuracy of the SMCI estimator improves monotonically as the size of the sum region increases. However, a haphazard expansion of the sum region could cause a combinatorial explosion. Therefore, we hope to improve the accuracy without such an expansion. In this paper, based on the theory of generalized least squares (GLS), a new effective method is proposed by combining multiple SMCI estimators. The validity of the proposed method is demonstrated theoretically and numerically. The results indicate that the proposed method can be effective in the inverse Ising problem (or Boltzmann machine learning).
MLSep 12, 2024
Dataset-Free Weight-Initialization on Restricted Boltzmann MachineMuneki Yasuda, Ryosuke Maeno, Chako Takahashi
In feed-forward neural networks, dataset-free weight-initialization methods such as LeCun, Xavier (or Glorot), and He initializations have been developed. These methods randomly determine the initial values of weight parameters based on specific distributions (e.g., Gaussian or uniform distributions) without using training datasets. To the best of the authors' knowledge, such a dataset-free weight-initialization method is yet to be developed for restricted Boltzmann machines (RBMs), which are probabilistic neural networks consisting of two layers. In this study, we derive a dataset-free weight-initialization method for Bernoulli--Bernoulli RBMs based on statistical mechanical analysis. In the proposed weight-initialization method, the weight parameters are drawn from a Gaussian distribution with zero mean. The standard deviation of the Gaussian distribution is optimized based on our hypothesis that a standard deviation providing a larger layer correlation (LC) between the two layers improves the learning efficiency. The expression of the LC is derived based on a statistical mechanical analysis. The optimal value of the standard deviation corresponds to the maximum point of the LC. The proposed weight-initialization method is identical to Xavier initialization in a specific case (i.e., when the sizes of the two layers are the same, the random variables of the layers are $\{-1,1\}$-binary, and all bias parameters are zero). The validity of the proposed weight-initialization method is demonstrated in numerical experiments using toy and real-world datasets.
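For orientation, the Xavier (Glorot) Gaussian baseline that this abstract refers to can be sketched as follows. The abstract does not state the LC-optimal standard deviation itself, so the snippet shows only the well-known Xavier rule, $\sigma = \sqrt{2/(n_{\mathrm{in}} + n_{\mathrm{out}})}$, applied to an RBM weight matrix; the function names and the list-of-lists layout are assumptions for illustration.

```python
import math
import random


def xavier_std(n_visible, n_hidden):
    """Standard deviation of Xavier (Glorot) normal initialization."""
    return math.sqrt(2.0 / (n_visible + n_hidden))


def init_rbm_weights(n_visible, n_hidden, seed=0):
    """Draw an n_visible x n_hidden weight matrix from a zero-mean Gaussian.

    Only the Xavier baseline is shown; the LC-optimal standard deviation of
    the paper is not reproduced here because the abstract does not give it.
    """
    rng = random.Random(seed)
    sigma = xavier_std(n_visible, n_hidden)
    return [[rng.gauss(0.0, sigma) for _ in range(n_hidden)]
            for _ in range(n_visible)]
```

When both layers have the same size $n$, the rule reduces to $\sigma = 1/\sqrt{n}$; for example, two layers of 100 units give $\sigma = 0.1$.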
MLMar 12
EB-RANSAC: Random Sample Consensus based on Energy-Based ModelMuneki Yasuda, Nao Watanabe, Kaiji Sekimoto
Random sample consensus (RANSAC), which is based on a repetitive sampling from a given dataset, is one of the most popular robust estimation methods. In this study, an energy-based model (EBM) for robust estimation that has a similar scheme to RANSAC, energy-based RANSAC (EB-RANSAC), is proposed. EB-RANSAC is applicable to a wide range of estimation problems similar to RANSAC. However, unlike RANSAC, EB-RANSAC does not require a troublesome sampling procedure and has only one hyperparameter. The effectiveness of EB-RANSAC is numerically demonstrated in two applications: a linear regression and maximum likelihood estimation.
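To make the contrast in this abstract concrete, here is a minimal sketch of classical RANSAC for fitting a line $y = ax + b$ (not EB-RANSAC, whose energy function is not given here). It shows the repeated minimal-subset sampling and the two hyperparameters, iteration count and inlier threshold, that the classical scheme needs; the function name and default values are assumptions for illustration.

```python
import random


def ransac_line(points, n_iters=200, threshold=0.5, seed=0):
    """Classical RANSAC for a 2D line: returns ((a, b), inlier_count)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(n_iters):
        # Sample a minimal subset: two points determine a line
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line, not representable as y = a x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Consensus set: points whose vertical residual is below the threshold
        inliers = sum(1 for x, y in points if abs(y - (a * x + b)) < threshold)
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers
```

In practice a final least-squares refit on the consensus set usually follows; it is omitted here to keep the sketch short.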
MLApr 8, 2025
Effective Method for Inverse Ising Problem under Missing Observations in Restricted Boltzmann MachinesKaiji Sekimoto, Muneki Yasuda
Restricted Boltzmann machines (RBMs) are energy-based models analogous to the Ising model and are widely applied in statistical machine learning. The standard inverse Ising problem with a complete dataset requires computing both data and model expectations and is computationally challenging because model expectations have a combinatorial explosion. Furthermore, in many applications, the available datasets are partially incomplete, making it difficult to compute even data expectations. In this study, we propose an approximation framework for these expectations in practical inverse Ising problems that integrates mean-field approximation or persistent contrastive divergence to generate refined initial points and spatial Monte Carlo integration to enhance estimator accuracy. We demonstrate that the proposed method effectively and accurately tunes the model parameters in comparison to the conventional method.
LGMar 19, 2024
Improving Interpretability of Scores in Anomaly Detection Based on Gaussian-Bernoulli Restricted Boltzmann MachineKaiji Sekimoto, Muneki Yasuda
Gaussian-Bernoulli restricted Boltzmann machines (GBRBMs) are often used for semi-supervised anomaly detection, where they are trained using only normal data points. In GBRBM-based anomaly detection, normal and anomalous data are classified based on a score that is identical to an energy function of the marginal GBRBM. However, the classification threshold is difficult to set to an appropriate value, as this score cannot be interpreted. In this study, we propose a measure that improves the score's interpretability based on its cumulative distribution, and establish a guideline for setting the threshold using the interpretable measure. The results of numerical experiments show that the guideline is reasonable when setting the threshold solely using normal data points. Moreover, because identifying the measure involves computationally infeasible evaluation of the minimum score value, we also propose an evaluation method for the minimum score based on simulated annealing, which is widely used for optimization problems. The proposed evaluation method was also validated using numerical experiments.
MLDec 21, 2020
Spatial Monte Carlo Integration with Annealed Importance SamplingMuneki Yasuda, Kaiji Sekimoto
Evaluating expectations on an Ising model (or Boltzmann machine) is essential for various applications, including statistical machine learning. However, in general, the evaluation is computationally difficult because it involves intractable multiple summations or integrations; therefore, it requires approximation. Monte Carlo integration (MCI) is a well-known approximation method; a more effective MCI-like approximation method was proposed recently, called spatial Monte Carlo integration (SMCI). However, the estimations obtained using SMCI (and MCI) exhibit a low accuracy in Ising models under a low temperature owing to degradation of the sampling quality. Annealed importance sampling (AIS) is a type of importance sampling based on Markov chain Monte Carlo methods that can suppress performance degradation in low-temperature regions with the force of importance weights. In this study, a new method is proposed to evaluate the expectations on Ising models combining AIS and SMCI. The proposed method performs efficiently in both high- and low-temperature regions, which is demonstrated theoretically and numerically.
MLSep 4, 2020
A Generalization of Spatial Monte Carlo IntegrationMuneki Yasuda, Kei Uchizawa
Spatial Monte Carlo integration (SMCI) is an extension of standard Monte Carlo integration and can approximate expectations on Markov random fields with high accuracy. SMCI was applied to pairwise Boltzmann machine (PBM) learning, with superior results to those from some existing methods. The approximation level of SMCI can be changed, and it was proved that a higher-order approximation of SMCI is statistically more accurate than a lower-order approximation. However, SMCI as proposed in the previous studies suffers from a limitation that prevents the application of a higher-order method to dense systems. This study makes two different contributions as follows. A generalization of SMCI (called generalized SMCI (GSMCI)) is proposed, which allows relaxation of the above-mentioned limitation; moreover, a statistical accuracy bound of GSMCI is proved. This is the first contribution of this study. A new PBM learning method based on SMCI is proposed, which is obtained by combining SMCI and the persistent contrastive divergence. The proposed learning method greatly improves the accuracy of learning. This is the second contribution of this study.
MLJan 6, 2020
Consistent Batch Normalization for Weighted Loss in Imbalanced-Data EnvironmentMuneki Yasuda, Yeo Xian En, Seishirou Ueno
In this study, classification problems based on feedforward neural networks in a data-imbalanced environment are considered. Learning from an imbalanced dataset is one of the most important practical problems in the field of machine learning. A weighted loss function (WLF) based on a cost-sensitive approach is a well-known and effective method for imbalanced datasets. A combination of WLF and batch normalization (BN) is considered in this study. BN is considered a powerful standard technique in the recent developments in deep learning. A simple combination of both methods leads to a size-inconsistency problem due to a mismatch between the interpretations of the effective size of the dataset in both methods. A simple modification to BN, called weighted BN (WBN), is proposed to correct the size mismatch. The idea of WBN is simple and natural. The proposed method in a data-imbalanced environment is validated using numerical experiments.
MLNov 25, 2019
Improvement of Batch Normalization in Imbalanced DataMuneki Yasuda, Seishirou Ueno
In this study, we consider classification problems based on neural networks in a data-imbalanced environment. Learning from an imbalanced data set is one of the most important and practical problems in the field of machine learning. A weighted loss function based on a cost-sensitive approach is a well-known effective method for imbalanced data sets. We consider a combination of a weighted loss function and batch normalization (BN) in this study. BN is a powerful standard technique in the recent developments in deep learning. A simple combination of both methods leads to a size-mismatch problem due to a mismatch between the interpretations of the effective size of the data set in both methods. We propose a simple modification to BN to correct the size mismatch and demonstrate that this modified BN is effective in a data-imbalanced environment.
MLJun 14, 2019
Empirical Bayes Method for Boltzmann MachinesMuneki Yasuda, Tomoyuki Obuchi
In this study, we consider an empirical Bayes method for Boltzmann machines and propose an algorithm for it. The empirical Bayes method allows estimation of the values of the hyperparameters of the Boltzmann machine by maximizing a specific likelihood function referred to as the empirical Bayes likelihood function in this study. However, the maximization is computationally hard because the empirical Bayes likelihood function involves intractable integrations of the partition function. The proposed algorithm avoids this computational problem by using the replica method and the Plefka expansion. Our method does not require any iterative procedures and is quite simple and fast, though it introduces a bias to the estimate, which exhibits an unnatural behavior with respect to the size of the dataset. This peculiar behavior is supposed to be due to the approximate treatment by the Plefka expansion. A possible extension to overcome this behavior is also discussed.
MLNov 30, 2018
Restricted Boltzmann Machine with Multivalued Hidden Variables: a model suppressing over-fittingYuuki Yokoyama, Tomu Katsumata, Muneki Yasuda
Generalization is one of the most important issues in machine learning problems. In this study, we consider generalization in restricted Boltzmann machines (RBMs). We propose an RBM with multivalued hidden variables, which is a simple extension of conventional RBMs. We demonstrate that the proposed model is better than the conventional model via numerical experiments for contrastive divergence learning with artificial data and a classification problem with MNIST.
MLMar 20, 2018
Momentum-Space Renormalization Group Transformation in Bayesian Image Modeling by Gaussian Graphical ModelKazuyuki Tanaka, Masamichi Nakamura, Shun Kataoka et al.
A new Bayesian modeling method is proposed by combining the maximization of the marginal likelihood with a momentum-space renormalization group transformation for Gaussian graphical models. Moreover, we present a scheme for computing the statistical averages of hyperparameters and mean square errors in our proposed method based on a momentum-space renormalization transformation.
STAT-MECHDec 1, 2017
Susceptibility Propagation by Using Diagonal ConsistencyMuneki Yasuda, Kazuyuki Tanaka
A susceptibility propagation that is constructed by combining a belief propagation and a linear response method is used for approximate computation for Markov random fields. Herein, we formulate a new, improved susceptibility propagation by using the concept of a diagonal matching method that is based on mean-field approaches to inverse Ising problems. The proposed susceptibility propagation is robust for various network structures, and it is reduced to the ordinary susceptibility propagation and to the adaptive Thouless-Anderson-Palmer equation in special cases.
MLOct 20, 2017
Linear-Time Algorithm in Bayesian Image Denoising based on Gaussian Markov Random FieldMuneki Yasuda, Junpei Watanabe, Shun Kataoka et al.
In this paper, we consider Bayesian image denoising based on a Gaussian Markov random field (GMRF) model, for which we propose a new algorithm. Our method can solve Bayesian image denoising problems, including hyperparameter estimation, in $O(n)$-time, where $n$ is the number of pixels in a given image. From the perspective of the order of the computational time, this is a state-of-the-art algorithm for the present problem setting. Moreover, the results of our numerical experiments show that our method is in fact effective in practice.
MLMar 28, 2017
Solving Non-parametric Inverse Problem in Continuous Markov Random Field using Loopy Belief PropagationMuneki Yasuda, Shun Kataoka
In this paper, we address the inverse problem, or the statistical machine learning problem, in Markov random fields with a non-parametric pair-wise energy function with continuous variables. The inverse problem is formulated by maximum likelihood estimation. The exact treatment of maximum likelihood estimation is intractable because of two problems: (1) it includes the evaluation of the partition function and (2) it is formulated in the form of functional optimization. We avoid Problem (1) by using Bethe approximation. Bethe approximation is an approximation technique equivalent to the loopy belief propagation. Problem (2) can be solved by using orthonormal function expansion. Orthonormal function expansion can reduce a functional optimization problem to a function optimization problem. Our method can provide an analytic form of the solution of the inverse problem within the framework of Bethe approximation.
SIJul 21, 2016
Community Detection Algorithm Combining Stochastic Block Model and Attribute Data ClusteringShun Kataoka, Takuto Kobayashi, Muneki Yasuda et al.
We propose a new algorithm to detect the community structure in a network that utilizes both the network structure and vertex attribute data. Suppose we have the network structure together with the vertex attribute data, that is, the information assigned to each vertex associated with the community to which it belongs. The problem addressed in this paper is the detection of the community structure from the information of both the network structure and the vertex attribute data. Our approach is based on the Bayesian approach that models the posterior probability distribution of the community labels. The detection of the community structure in our method is achieved by using belief propagation and an EM algorithm. We numerically verified the performance of our method using computer-generated networks and real-world networks.
MLMar 8, 2016
Effective Mean-Field Inference Method for Nonnegative Boltzmann MachinesMuneki Yasuda
Nonnegative Boltzmann machines (NNBMs) are recurrent probabilistic neural network models that can describe multi-modal nonnegative data. NNBMs form rectified Gaussian distributions that appear in biological neural network models, positive matrix factorization, nonnegative matrix factorization, and so on. In this paper, an effective inference method for NNBMs is proposed that uses the mean-field method, referred to as the Thouless--Anderson--Palmer equation, and the diagonal consistency method, which was recently proposed.
MLDec 3, 2015
Mean-Field Inference in Gaussian Restricted Boltzmann MachineChako Takahashi, Muneki Yasuda
A Gaussian restricted Boltzmann machine (GRBM) is a Boltzmann machine defined on a bipartite graph and is an extension of usual restricted Boltzmann machines. A GRBM consists of two different layers: a visible layer composed of continuous visible variables and a hidden layer composed of discrete hidden variables. In this paper, we derive two different inference algorithms for GRBMs based on the naive mean-field approximation (NMFA). One is an inference algorithm for whole variables in a GRBM, and the other is an inference algorithm for partial variables in a GRBM. We compare the two methods analytically and numerically and show that the latter method is better.
MLMar 16, 2015
Statistical Analysis of Loopy Belief Propagation in Random FieldsMuneki Yasuda, Shun Kataoka, Kazuyuki Tanaka
Loopy belief propagation (LBP), which is equivalent to the Bethe approximation in statistical mechanics, is a message-passing-type inference method that is widely used to analyze systems based on Markov random fields (MRFs). In this paper, we propose a message-passing-type method to analytically evaluate the quenched average of LBP in random fields by using the replica cluster variation method. The proposed analytical method is applicable to general pair-wise MRFs with random fields whose distributions differ from each other and can give the quenched averages of the Bethe free energies over random fields, which are consistent with numerical results. The order of its computational cost is equivalent to that of standard LBP. In the latter part of this paper, we describe the application of the proposed method to Bayesian image restoration, in which we observed that our theoretical results are in good agreement with the numerical results for natural images.
CVJan 5, 2015
Inverse Renormalization Group Transformation in Bayesian Image SegmentationsKazuyuki Tanaka, Shun Kataoka, Muneki Yasuda et al.
A new Bayesian image segmentation algorithm is proposed by combining loopy belief propagation with an inverse real-space renormalization group transformation to reduce the computational time. In the results of our experiment, we observe that the proposed method can reduce the computational time to less than one-tenth of that taken by conventional Bayesian approaches.
MLDec 16, 2014
Boltzmann-Machine Learning of Prior Distributions of Binarized Natural ImagesTomoyuki Obuchi, Hirokazu Koma, Muneki Yasuda
Prior distributions of binarized natural images are learned by using a Boltzmann machine. According to the results of this study, there emerges a structure with two sublattices in the interactions, and the nearest-neighbor and next-nearest-neighbor interactions correspondingly take two discriminative values, which reflects the individual characteristics of the three sets of pictures that we process. Meanwhile, in a longer spatial scale, a longer-range, although still rapidly decaying, ferromagnetic interaction commonly appears in all cases. The characteristic length scale of the interactions is universally up to approximately four lattice spacings, $\xi \approx 4$. These results are derived by using the mean-field method, which effectively reduces the computational time required in a Boltzmann machine. An improved mean-field method called the Bethe approximation also gives the same results, as does the Monte Carlo method for small-size images. These reinforce the validity of our analysis and findings. Relations to criticality, frustration, and simple-cell receptive fields are also discussed.
LGJun 24, 2014
Composite Likelihood Estimation for Restricted Boltzmann machinesMuneki Yasuda, Shun Kataoka, Yuji Waizumi et al.
Learning the parameters of graphical models using maximum likelihood estimation is generally hard and requires an approximation. Maximum composite likelihood estimations are statistical approximations of the maximum likelihood estimation and are higher-order generalizations of the maximum pseudo-likelihood estimation. In this paper, we propose a composite likelihood method and investigate its properties. Furthermore, we apply our composite likelihood method to restricted Boltzmann machines.
MLApr 23, 2014
Bayesian Reconstruction of Missing ObservationsShun Kataoka, Muneki Yasuda, Kazuyuki Tanaka
We focus on an interpolation method referred to as Bayesian reconstruction in this paper. Whereas in standard interpolation methods missing data are interpolated deterministically, in Bayesian reconstruction, missing data are interpolated probabilistically using a Bayesian treatment. In this paper, we address the framework of Bayesian reconstruction and its application to the traffic data reconstruction problem in the field of traffic engineering. In the latter part of this paper, we describe the evaluation of the statistical performance of our Bayesian traffic reconstruction model using a statistical mechanical approach and clarify its statistical behavior.
CVApr 11, 2014
Bayesian image segmentations by Potts prior and loopy belief propagationKazuyuki Tanaka, Shun Kataoka, Muneki Yasuda et al.
This paper presents a Bayesian image segmentation model based on a Potts prior and loopy belief propagation (LBP). The proposed Bayesian model involves several terms, including the pairwise interactions of Potts models, and the average vectors and covariance matrices of Gaussian distributions in color image modeling. These terms are often referred to as hyperparameters in statistical machine learning theory. In order to determine these hyperparameters, we propose a new scheme for hyperparameter estimation based on conditional maximization of entropy in the Potts prior. The algorithm is given based on loopy belief propagation. In addition, we compare our conditional maximum entropy framework with the conventional maximum likelihood framework, and also clarify how the first-order phase transitions in LBPs for Potts models influence our hyperparameter estimation procedures.
MLJun 27, 2013
Traffic data reconstruction based on Markov random field modelingShun Kataoka, Muneki Yasuda, Cyril Furtlehner et al.
We consider the traffic data reconstruction problem. Suppose we have the traffic data of an entire city that are incomplete because some road data are unobserved. The problem is to reconstruct the unobserved parts of the data. In this paper, we propose a new method to reconstruct incomplete traffic data collected from various traffic sensors. Our approach is based on Markov random field modeling of road traffic. The reconstruction is achieved by using a mean-field method and a machine learning method. We numerically verify the performance of our method using realistic simulated traffic data for the real road network of Sendai, Japan.