Masayuki Ohzeki

QUANT-PH
h-index 33
28 papers
474 citations
Novelty 42%
AI Score 52

28 Papers

LG Sep 6, 2023
Random Postprocessing for Combinatorial Bayesian Optimization

Keisuke Morita, Yoshihiko Nishikawa, Masayuki Ohzeki

Model-based sequential approaches to discrete "black-box" optimization, including Bayesian optimization techniques, often access the same points multiple times for a given objective function of interest, resulting in many steps to find the global optimum. Here, we numerically study the effect of a postprocessing method on Bayesian optimization that strictly prohibits duplicated samples in the dataset. We find the postprocessing method significantly reduces the number of sequential steps to find the global optimum, especially when the acquisition function is of maximum a posteriori estimation. Our results provide a simple but general strategy to solve the slow convergence of Bayesian optimization for high-dimensional problems.
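The duplicate-prohibiting postprocessing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `postprocess_candidate`, the binary search space, and the choice to redraw a uniformly random bit string whenever the proposal was already evaluated are all assumptions made for illustration.

```python
import random


def postprocess_candidate(candidate, visited, n_bits, rng=random):
    """Return `candidate` if it is new; otherwise redraw random bit strings
    until one is found that is not yet in `visited` (assumed fallback rule)."""
    candidate = tuple(candidate)
    if len(visited) >= 2 ** n_bits:
        raise ValueError("every point of the search space was already visited")
    while candidate in visited:
        candidate = tuple(rng.randint(0, 1) for _ in range(n_bits))
    return candidate


# Usage: the optimizer proposes a point that is already in the dataset.
visited = {(0, 0, 1), (1, 1, 1)}
proposal = postprocess_candidate((0, 0, 1), visited, n_bits=3)
visited.add(proposal)  # the dataset never contains duplicates
```

Keeping the visited points in a set makes the duplicate check constant-time on average, so the postprocessing adds little cost to each sequential step.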

IT Apr 15
Phase transition in compressed sensing using log-sum penalty and adaptive smoothing

Keisuke Morita, Federico Ricci-Tersenghi, Masayuki Ohzeki

In many real-world problems, recovering sparse signals from underdetermined linear systems remains a fundamental challenge. Although $\ell_1$ norm minimization is widely used, it suffers from estimation bias that prevents it from reaching the Bayes-optimal reconstruction limit. Nonconvex alternatives, such as the log-sum penalty, have been proposed to promote stronger sparsity. However, maintaining their algorithmic stability is challenging. To address this challenge, we introduce an adaptive smoothing strategy within an approximate message passing framework to mitigate algorithmic instability. Furthermore, we evaluate the typical exact-recovery threshold for Gaussian measurement matrices using the replica method and state evolution. The results indicate that the adaptive method achieves exact recovery over a broader region than $\ell_1$ norm minimization, although metastable states hinder reaching the information-theoretic limit.

IT May 11
Sparse Signal Recovery using Log-Sum Regularization and Adaptive Smoothing

Keisuke Morita, Masayuki Ohzeki

We study sparse signal recovery from noisy linear observations using nonconvex log-sum regularization. The log-sum penalty reduces the shrinkage bias of $\ell_1$ regularization and more closely approximates the $\ell_0$ regularization, but its nonconvexity can make reconstruction algorithms unstable. To mitigate this instability, we use an adaptive smoothing strategy that determines the smoothing parameter so that the scalar proximal operator remains continuous. Using this proximal operator, we formulate the approximate message passing (AMP) algorithm and derive the corresponding state evolution (SE) recursion. The fixed point of the SE recursion predicts the final mean squared error (MSE) and, in the noiseless limit, the exact-recovery phase transition. To further investigate finite-dimensional reconstruction behavior, we implement an alternating direction method of multipliers (ADMM) algorithm. In the noiseless setting, we find that the empirical success boundary of ADMM closely agrees with the SE-predicted phase transition. In the noisy setting, we observe that AMP closely follows the SE prediction, whereas ADMM qualitatively reproduces the SE-predicted dependence of the final MSE on the regularization parameter. A comparison with $\ell_1$ regularization shows that log-sum regularization is beneficial in low-density or high-measurement-rate regimes, whereas $\ell_1$ regularization remains preferable at higher densities and lower measurement rates.
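As a rough illustration of the scalar proximal operator mentioned in this abstract, the sketch below evaluates the proximal map of a log-sum penalty in closed form. The parameterization `lam * log(1 + |x| / eps)` and the fixed smoothing parameter `eps` are assumptions; the adaptive rule that selects the smoothing parameter to keep the operator continuous is not reproduced here.

```python
import math


def logsum_objective(x, y, lam, eps):
    """0.5 * (x - y)^2 + lam * log(1 + |x| / eps)."""
    return 0.5 * (x - y) ** 2 + lam * math.log(1.0 + abs(x) / eps)


def prox_logsum(y, lam, eps):
    """Global minimizer over x of logsum_objective(x, y, lam, eps).

    For x > 0 the stationarity condition x - |y| + lam / (eps + x) = 0 is a
    quadratic in x; its larger root is the only candidate local minimum,
    which is then compared against x = 0.
    """
    a = abs(y)
    disc = (a + eps) ** 2 - 4.0 * lam
    best = 0.0
    if disc >= 0.0:
        root = 0.5 * ((a - eps) + math.sqrt(disc))
        if root > 0.0 and logsum_objective(root, a, lam, eps) < logsum_objective(0.0, a, lam, eps):
            best = root
    return math.copysign(best, y)


# The shrinkage |y| - |prox(y)| decreases as |y| grows, unlike the constant
# shrinkage of soft thresholding.
shrink_small = 3.0 - prox_logsum(3.0, lam=0.5, eps=0.1)
shrink_large = 10.0 - prox_logsum(10.0, lam=0.5, eps=0.1)
```

For a fixed `eps`, this map can jump from zero to a finite value as `|y|` crosses a threshold, which is the kind of discontinuity the adaptive smoothing in the abstract is meant to avoid.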

QUANT-PH Jan 13
Kernel Learning for Regression via Quantum Annealing Based Spectral Sampling

Yasushi Hasegawa, Masayuki Ohzeki

While quantum annealing (QA) has been developed for combinatorial optimization, practical QA devices operate at finite temperature and under noise, and their outputs can be regarded as stochastic samples close to a Gibbs--Boltzmann distribution. In this study, we propose a QA-in-the-loop kernel learning framework that integrates QA not merely as a substitute for Markov-chain Monte Carlo sampling but as a component that directly determines the learned kernel for regression. Based on Bochner's theorem, a shift-invariant kernel is represented as an expectation over a spectral distribution, and random Fourier features (RFF) approximate the kernel by sampling frequencies. We model the spectral distribution with a (multi-layer) restricted Boltzmann machine (RBM), generate discrete RBM samples using QA, and map them to continuous frequencies via a Gaussian--Bernoulli transformation. Using the resulting RFF, we construct a data-adaptive kernel and perform Nadaraya--Watson (NW) regression. Because the RFF approximation based on $\cos(\bm{\omega}^{\top}\Delta\bm{x})$ can yield small negative values and cancellation across neighbors, the Nadaraya--Watson denominator $\sum_j k_{ij}$ may become close to zero. We therefore employ nonnegative squared-kernel weights $w_{ij}=k(\bm{x}_i,\bm{x}_j)^2$, which also enhances the contrast of kernel weights. The kernel parameters are trained by minimizing the leave-one-out NW mean squared error, and we additionally evaluate local linear regression with the same squared-kernel weights at inference. Experiments on multiple benchmark regression datasets demonstrate a decrease in training loss, accompanied by structural changes in the kernel matrix, and show that the learned kernel tends to improve $R^2$ and RMSE over the baseline Gaussian-kernel NW. Increasing the number of random features at inference further enhances accuracy.
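The combination of random Fourier features and squared-kernel Nadaraya--Watson weights described above can be sketched in a few lines. This is only an illustration under stated assumptions: frequencies are drawn from a standard Gaussian (which corresponds to a Gaussian kernel) instead of from an RBM sampled by quantum annealing, and the names `rff_kernel` and `nw_predict` are hypothetical.

```python
import math
import random


def rff_kernel(x, y, freqs):
    """Monte Carlo kernel estimate: average of cos(w . (x - y)) over frequencies w."""
    total = 0.0
    for w in freqs:
        total += math.cos(sum(wi * (xi - yi) for wi, xi, yi in zip(w, x, y)))
    return total / len(freqs)


def nw_predict(x_query, xs, ys, freqs):
    """Nadaraya-Watson estimate with nonnegative squared-kernel weights."""
    weights = [rff_kernel(x_query, x, freqs) ** 2 for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


rng = random.Random(0)
dim, n_features = 2, 2000
# Assumption: Gaussian frequencies, i.e. the kernel exp(-|x - y|^2 / 2).
freqs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_features)]

x_a, x_b = (0.3, -0.2), (1.0, 0.5)
approx = rff_kernel(x_a, x_b, freqs)
exact = math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x_a, x_b)))

xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
ys = [1.0, 2.0, 3.0]
pred = nw_predict((0.0, 0.0), xs, ys, freqs)
```

Squaring the kernel values keeps every weight nonnegative, so the denominator cannot cancel to zero through sign changes and the prediction stays a weighted average of the training targets.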

SP Mar 1, 2024
Spatio-temporal reconstruction of substance dynamics using compressed sensing in multi-spectral magnetic resonance spectroscopic imaging

Utako Yamamoto, Hirohiko Imai, Kei Sano et al.

The objective of our study is to observe the dynamics of multiple substances in vivo with high temporal resolution from multi-spectral magnetic resonance spectroscopic imaging (MRSI) data. The multi-spectral MRSI can effectively separate spectral peaks of multiple substances and is useful to measure spatial distributions of substances. However, it is difficult to measure time-varying substance distributions directly by ordinary full sampling because the measurement requires a significantly long time. In this study, we propose a novel method to reconstruct the spatio-temporal distributions of substances from randomly undersampled multi-spectral MRSI data on the basis of compressed sensing (CS) and the partially separable function model with base spectra of substances. In our method, we have employed spatio-temporal sparsity and temporal smoothness of the substance distributions as prior knowledge to perform CS. The effectiveness of our method has been evaluated using phantom data sets of glass tubes filled with glucose or lactate solution in increasing amounts over time and animal data sets of a tumor-bearing mouse to observe the metabolic dynamics involved in the Warburg effect in vivo. The reconstructed results are consistent with the expected behaviors, showing that our method can reconstruct the spatio-temporal distribution of substances with a temporal resolution of four seconds, which is an extremely short time scale compared with that of full sampling. Since this method utilizes only prior knowledge naturally assumed for the spatio-temporal distributions of substances and is independent of the number of the spectral and spatial dimensions or the acquisition sequence of MRSI, it is expected to contribute to revealing the underlying substance dynamics in MRSI data already acquired or to be acquired in the future.

DIS-NN Apr 20, 2024
Solution space and storage capacity of fully connected two-layer neural networks with generic activation functions

Sota Nishiyama, Masayuki Ohzeki

The storage capacity of a binary classification model is the maximum number of random input-output pairs per parameter that the model can learn. It is one of the indicators of the expressive power of machine learning models and is important for comparing the performance of various models. In this study, we analyze the structure of the solution space and the storage capacity of fully connected two-layer neural networks with general activation functions using the replica method from statistical physics. Our results demonstrate that the storage capacity per parameter remains finite even with infinite width and that the weights of the network exhibit negative correlations, leading to a 'division of labor'. In addition, we find that increasing the dataset size triggers a phase transition at a certain transition point where the permutation symmetry of weights is broken, resulting in the solution space splitting into disjoint regions. We identify the dependence of this transition point and the storage capacity on the choice of activation function. These findings contribute to understanding the influence of activation functions and the number of parameters on the structure of the solution space, potentially offering insights for selecting appropriate architectures based on specific objectives.

LG Jan 12, 2025
Filtering out mislabeled training instances using black-box optimization and quantum annealing

Makoto Otsuka, Kento Kodama, Keisuke Morita et al.

This study proposes an approach for removing mislabeled instances from contaminated training datasets by combining surrogate model-based black-box optimization (BBO) with postprocessing and quantum annealing. Mislabeled training instances, a common issue in real-world datasets, often degrade model generalization, necessitating robust and efficient noise-removal strategies. The proposed method evaluates filtered training subsets based on validation loss, iteratively refines loss estimates through surrogate model-based BBO with postprocessing, and leverages quantum annealing to efficiently sample diverse training subsets with low validation error. Experiments on a noisy majority bit task demonstrate the method's ability to prioritize the removal of high-risk mislabeled instances. Integrating D-Wave's clique sampler running on a physical quantum annealer achieves faster optimization and higher-quality training subsets compared to OpenJij's simulated quantum annealing sampler or Neal's simulated annealing sampler, offering a scalable framework for enhancing dataset quality. This work highlights the effectiveness of the proposed method for supervised learning tasks, with future directions including its application to unsupervised learning, real-world datasets, and large-scale implementations.

LG Dec 24, 2024
Schrödinger Bridge Type Diffusion Models as an Extension of Variational Autoencoders

Kentaro Kaba, Reo Shimizu, Masayuki Ohzeki et al.

Generative diffusion models use time-forward and backward stochastic differential equations to connect the data and prior distributions. While conventional diffusion models (e.g., score-based models) only learn the backward process, more flexible frameworks have been proposed to also learn the forward process by employing the Schrödinger bridge (SB). However, due to the complexity of the mathematical structure behind SB-type models, we cannot easily give an intuitive understanding of their objective function. In this work, we propose a unified framework to construct diffusion models by reinterpreting the SB-type models as an extension of variational autoencoders. In this context, the data processing inequality plays a crucial role. As a result, we find that the objective function consists of the prior loss and drift matching parts.

QUANT-PH May 20, 2024
Application of time-series quantum generative model to financial data

Shun Okumura, Masayuki Ohzeki, Masaya Abe

Although a quantum generative model for time series that successfully learns correlated series with multiple Brownian motions has been proposed, the model has not been adapted and evaluated for financial problems. In this study, a time-series generative model was applied as a quantum generative model to actual financial data. Future data for two correlated time series were generated and compared with classical methods such as long short-term memory and vector autoregression. Furthermore, numerical experiments were performed to complete missing values. Based on the results, we evaluated the practical applications of the time-series quantum generation model. It was observed that fewer parameter values were required compared with the classical method. In addition, the quantum time-series generation model was feasible for both stationary and nonstationary data. These results suggest that several parameters can be applied to various types of time-series data.

LG Oct 15, 2025
Performance Evaluation of Ising and QUBO Variable Encodings in Boltzmann Machine Learning

Yasushi Hasegawa, Masayuki Ohzeki

We compare Ising ({-1,+1}) and QUBO ({0,1}) encodings for Boltzmann machine learning under a controlled protocol that fixes the model, sampler, and step size. Exploiting the identity that the Fisher information matrix (FIM) equals the covariance of sufficient statistics, we visualize empirical moments from model samples and reveal systematic, representation-dependent differences. QUBO induces larger cross terms between first- and second-order statistics, creating more small-eigenvalue directions in the FIM and lowering spectral entropy. This ill-conditioning explains slower convergence under stochastic gradient descent (SGD). In contrast, natural gradient descent (NGD), which rescales updates by the FIM metric, achieves similar convergence across encodings due to reparameterization invariance. Practically, for SGD-based training, the Ising encoding provides more isotropic curvature and faster convergence; for QUBO, centering/scaling or NGD-style preconditioning mitigates curvature pathologies. These results clarify how representation shapes information geometry and finite-time learning dynamics in Boltzmann machines and yield actionable guidelines for variable encoding and preprocessing.
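The relation between the two encodings compared in this abstract is the change of variables x = (s + 1) / 2. The sketch below converts QUBO coefficients to Ising fields and couplings and evaluates both energies. It assumes an upper-triangular coefficient matrix `Q`, and the helper names are illustrative rather than taken from the paper.

```python
def qubo_to_ising(Q):
    """Map E(x) = sum_{i<=j} Q[i][j] x_i x_j with x in {0,1} to
    E(s) = offset + sum_i h[i] s_i + sum_{i<j} J[(i,j)] s_i s_j with s in {-1,+1}."""
    n = len(Q)
    h = [0.0] * n
    J = {}
    offset = 0.0
    for i in range(n):
        # Diagonal term: Q_ii x_i = Q_ii (s_i + 1) / 2 because x_i^2 = x_i.
        h[i] += Q[i][i] / 2.0
        offset += Q[i][i] / 2.0
        for j in range(i + 1, n):
            # Off-diagonal term: Q_ij (s_i s_j + s_i + s_j + 1) / 4.
            J[(i, j)] = Q[i][j] / 4.0
            h[i] += Q[i][j] / 4.0
            h[j] += Q[i][j] / 4.0
            offset += Q[i][j] / 4.0
    return h, J, offset


def qubo_energy(Q, x):
    n = len(Q)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def ising_energy(h, J, offset, s):
    energy = offset + sum(hi * si for hi, si in zip(h, s))
    return energy + sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())


Q = [[1.0, -2.0, 0.5],
     [0.0, -1.0, 3.0],
     [0.0, 0.0, 2.0]]
h, J, offset = qubo_to_ising(Q)
```

Each quadratic QUBO coefficient contributes to both the couplings and the fields of the Ising form, so the same distribution is described by differently mixed first- and second-order parameters in the two encodings.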

QUANT-PH Oct 6, 2025
Quantum generative model on bicycle-sharing system and an application

Fumio Nemoto, Nobuyuki Koike, Daichi Sato et al.

Recently, bicycle-sharing systems have been implemented in numerous cities, becoming integral to daily life. However, a prevalent issue arises when intensive commuting demand leads to bicycle shortages in specific areas and at particular times. To address this challenge, we employ a novel quantum machine learning model that analyzes time series data by fitting quantum time evolution to observed sequences. This model enables us to capture actual trends in bicycle counts at individual ports and identify correlations between different ports. Utilizing the trained model, we simulate the impact of proactively adding bicycles to high-demand ports on the overall rental number across the system. Given that the core of this method lies in a Monte Carlo simulation, it is anticipated to have a wide range of industrial applications.

QUANT-PH Jan 3, 2025
Relaxation-assisted reverse annealing on nonnegative/binary matrix factorization

Renichiro Haba, Masayuki Ohzeki, Kazuyuki Tanaka

Quantum annealing has garnered significant attention as a metaheuristic inspired by quantum physics for combinatorial optimization problems. Among its many applications, nonnegative/binary matrix factorization stands out for its complexity and relevance in unsupervised machine learning. The use of reverse annealing, a derivative procedure of quantum annealing that prioritizes the search in the vicinity of a given initial state, helps improve its optimization performance in matrix factorization. This study proposes an improved strategy that integrates reverse annealing with a linear programming relaxation technique. Using relaxed solutions as the initial configuration for reverse annealing, we demonstrate improvements in optimization performance comparable to the exact optimization methods. Our experiments on facial image datasets show that our method provides better convergence than known reverse annealing methods. Furthermore, we investigate the effectiveness of relaxation-based initialization methods on randomized datasets, demonstrating a relationship between the relaxed solution and the optimal solution. This research underscores the potential of combining reverse annealing and classical optimization strategies to enhance optimization performance.

DIS-NN Mar 15, 2021
Assessment of image generation by quantum annealer

Takehito Sato, Masayuki Ohzeki, Kazuyuki Tanaka

Quantum annealing was originally proposed as an approach for solving combinatorial optimisation problems using quantum effects. D-Wave Systems has released a production model of quantum annealing hardware. However, the inherent noise and various environmental factors in the hardware hamper the determination of optimal solutions. In addition, the freezing effect in regions with weak quantum fluctuations generates outputs approximately following a Gibbs--Boltzmann distribution at an extremely low temperature. Thus, a quantum annealer may also serve as a fast sampler for the Ising spin-glass problem, and several studies have investigated Boltzmann machine learning using a quantum annealer. Previous developments have focused on comparing the performance in the standard distance of the resulting distributions between conventional methods in classical computers and sampling by a quantum annealer. In this study, we focused on the performance of a quantum annealer as a generative model. To evaluate its performance, we prepared a discriminator given by a neural network trained on an a priori dataset. The evaluation results show a higher performance of quantum annealing compared with the classical approach for Boltzmann machine learning.

CV Feb 24, 2021
Kernel-based framework to estimate deformations of pneumothorax lung using relative position of anatomical landmarks

Utako Yamamoto, Megumi Nakao, Masayuki Ohzeki et al.

In video-assisted thoracoscopic surgeries, successful procedures of nodule resection are highly dependent on the precise estimation of lung deformation between the inflated lung in the computed tomography (CT) images during preoperative planning and the deflated lung in the treatment views during surgery. Lungs in the pneumothorax state during surgery have a large volume change from normal lungs, making it difficult to build a mechanical model. The purpose of this study is to develop a deformation estimation method of the 3D surface of a deflated lung from a few partial observations. To estimate deformations for a largely deformed lung, a kernel regression-based solution was introduced. The proposed method used a few landmarks to capture the partial deformation between the 3D surface mesh obtained from preoperative CT and the intraoperative anatomical positions. The deformation for each vertex of the entire mesh model was estimated per-vertex as a relative position from the landmarks. The landmarks were placed in the anatomical position of the lung's outer contour. The method was applied on nine datasets of the left lungs of live Beagle dogs. Contrast-enhanced CT images of the lungs were acquired. The proposed method achieved a local positional error of vertices of 2.74 mm, Hausdorff distance of 6.11 mm, and Dice similarity coefficient of 0.94. Moreover, the proposed method could estimate lung deformations from a small number of training cases and a small observation area. This study contributes to the data-driven modeling of pneumothorax deformation of the lung.

DIS-NN Jan 21, 2019
Message-passing algorithm of quantum annealing with nonstoquastic Hamiltonian

Masayuki Ohzeki

Quantum annealing (QA) is a generic method for solving optimization problems using fictitious quantum fluctuation. The current device performing QA involves controlling the transverse field; it is classically simulatable by using the standard technique for mapping the quantum spin systems to the classical ones. In this sense, the current system for QA is not powerful despite utilizing quantum fluctuation. Hence, we developed a system with a time-dependent Hamiltonian consisting of a combination of the formulated Ising model and the "driver" Hamiltonian with only quantum fluctuation. In the previous study, for a fully connected spin model, quantum fluctuation can be addressed in a relatively simple way. We proved that the fully connected antiferromagnetic interaction can be transformed into a fluctuating transverse field and is thus classically simulatable at sufficiently low temperatures. Using the fluctuating transverse field, we established several ways to simulate part of the nonstoquastic Hamiltonian on classical computers. We formulated a message-passing algorithm in the present study. This algorithm is capable of assessing the performance of QA with part of the nonstoquastic Hamiltonian having a large number of spins. In other words, we developed a different approach for simulating the nonstoquastic Hamiltonian without using the quantum Monte Carlo technique. Our results were validated by comparison to the results obtained by the replica method.

QUANT-PH Dec 4, 2018
Control of automated guided vehicles without collision by quantum annealer and digital devices

Masayuki Ohzeki, Akira Miki, Masamichi J. Miyama et al.

We formulate an optimization problem to control a large number of automated guided vehicles in a plant without collision. The formulation consists of binary variables. A quadratic cost function over these variables enables us to utilize certain solvers on digital computers and recently developed purpose-specific hardware such as D-Wave 2000Q and the Fujitsu digital annealer. In the present study, we consider an actual plant in Japan, in which vehicles run, and assess efficiency of our formulation for optimizing the vehicles via several solvers. We confirm that our formulation can be a powerful approach for performing smooth control while avoiding collisions between vehicles, as compared to a conventional method. In addition, comparative experiments performed using several solvers reveal that D-Wave 2000Q can be useful as a rapid solver for generating a plan for controlling the vehicles in a short time although it deals only with a small number of vehicles, while a digital computer can rapidly solve the corresponding optimization problem even with a large number of binary variables.

DIS-NN Jul 1, 2018
Optimization of neural networks via finite-value quantum fluctuations

Masayuki Ohzeki, Shuntaro Okada, Masayoshi Terabe et al.

We numerically test an optimization method for deep neural networks (DNNs) using quantum fluctuations inspired by quantum annealing. For efficient optimization, our method utilizes the quantum tunneling effect beyond the potential barriers. The path integral formulation of the DNN optimization generates an attracting force to simulate the quantum tunneling effect. In the standard quantum annealing method, the quantum fluctuations will vanish at the last stage of optimization. In this study, we propose a learning protocol that utilizes a finite value for the strength of quantum fluctuations to obtain higher generalization performance, which is a type of robustness. We demonstrate the performance of our method using two well-known open datasets: the MNIST dataset and the Olivetti face dataset. Although computational costs prevent us from testing our method on large datasets with high-dimensional data, results show that our method can enhance generalization performance by introducing a finite value for quantum fluctuations.

ML Mar 20, 2018
Momentum-Space Renormalization Group Transformation in Bayesian Image Modeling by Gaussian Graphical Model

Kazuyuki Tanaka, Masamichi Nakamura, Shun Kataoka et al.

A new Bayesian modeling method is proposed by combining the maximization of the marginal likelihood with a momentum-space renormalization group transformation for Gaussian graphical models. Moreover, we present a scheme for computing the statistical averages of hyperparameters and mean square errors in our proposed method based on a momentum-space renormalization transformation.

STAT-MECH Dec 1, 2017
Deep Neural Network Detects Quantum Phase Transition

Shunta Arai, Masayuki Ohzeki, Kazuyuki Tanaka

We detect the quantum phase transition of a quantum many-body system by mapping the observed results of the quantum state onto a neural network. In the present study, we utilized the simplest case of a quantum many-body system, namely a one-dimensional chain of Ising spins with the transverse Ising model. We prepared several spin configurations, which were obtained using repeated observations of the model for a particular strength of the transverse field, as input data for the neural network. Although the proposed method can be employed using experimental observations of quantum many-body systems, we tested our technique with spin configurations generated by a quantum Monte Carlo simulation without initial relaxation. The neural network successfully classified the strength of transverse field only from the spin configurations, leading to consistent estimations of the critical point of our model $\Gamma_c = J$.
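For reference, the one-dimensional transverse-field Ising chain is commonly written as below. The sign convention and the symbols $\sigma_i^z$, $\sigma_i^x$ are assumptions, since the abstract does not state the Hamiltonian explicitly; with this convention the zero-temperature critical point is the value quoted in the abstract.

```latex
H = -J \sum_{i} \sigma_i^{z} \sigma_{i+1}^{z} - \Gamma \sum_{i} \sigma_i^{x},
\qquad
\Gamma_c = J .
```

For $\Gamma < \Gamma_c$ the ground state is ferromagnetically ordered, and for $\Gamma > \Gamma_c$ it is a quantum paramagnet, which is the distinction the classifier is expected to pick up from the spin configurations.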

CV Nov 28, 2017
Deformation estimation of an elastic object by partial observation using a neural network

Utako Yamamoto, Megumi Nakao, Masayuki Ohzeki et al.

Deformation estimation of an elastic object representing an internal organ is important for the computer navigation of surgery. The aim of this study is to estimate the deformation of an entire three-dimensional elastic object using displacement information of very few observation points. A learning approach with a neural network was introduced to estimate the entire deformation of an object. We applied our method to two elastic objects: a rectangular parallelepiped model and a human liver model reconstructed from computed tomography data. The average estimation error for the human liver model was 0.041 mm when the object was deformed up to 66.4 mm, from only around 3% observations. These results indicate that the deformation of an entire elastic object can be estimated with an acceptable level of error from limited observations by applying a trained neural network to a new deformation.

STR-EL Feb 10, 2017
Sparse modeling approach to analytical continuation of imaginary-time quantum Monte Carlo data

Junya Otsuki, Masayuki Ohzeki, Hiroshi Shinaoka et al.

A new approach to solving the ill-conditioned inverse problem for analytical continuation is proposed. The root of the problem lies in the fact that even tiny noise in the imaginary-time input data has a serious impact on the inferred real-frequency spectra. By means of a modern regularization technique, we eliminate redundant degrees of freedom that essentially carry the noise, leaving only relevant information unaffected by the noise. The resultant spectrum is represented with minimal bases and thus a stable analytical continuation is achieved. This framework further provides a tool for analyzing to what extent the Monte Carlo data need to be accurate to resolve details of an expected spectral function.

STR-EL Feb 10, 2017
Compressing Green's function using intermediate representation between imaginary-time and real-frequency domains

Hiroshi Shinaoka, Junya Otsuki, Masayuki Ohzeki et al.

New model-independent compact representations of imaginary-time data are presented in terms of the intermediate representation (IR) of analytical continuation. This is motivated by a recent numerical finding by the authors [J. Otsuki et al., arXiv:1702.03056]. We demonstrate the efficiency of the IR through continuous-time quantum Monte Carlo calculations of an Anderson impurity model. We find that the IR yields a significantly compact form of various types of correlation functions. The present framework will provide general ways to boost the power of cutting-edge diagrammatic/quantum Monte Carlo treatments of many-body systems.

QUANT-PH Dec 14, 2016
Quantum Monte Carlo simulation of a particular class of non-stoquastic Hamiltonians in quantum annealing

Masayuki Ohzeki

Quantum annealing is a generic solver of the optimization problem that uses fictitious quantum fluctuation. Its simulation in classical computing is often performed using the quantum Monte Carlo simulation via the Suzuki--Trotter decomposition. However, the negative sign problem sometimes emerges in the simulation of quantum annealing with an elaborate driver Hamiltonian, since it belongs to a class of non-stoquastic Hamiltonians. In the present study, we propose an alternative way to avoid the negative sign problem involved in a particular class of the non-stoquastic Hamiltonians. To check the validity of the method, we demonstrate our method by applying it to a simple problem that includes the anti-ferromagnetic XX interaction, which is a typical instance of the non-stoquastic Hamiltonians.

ML Nov 19, 2015
Stochastic gradient method with accelerated stochastic dynamics

Masayuki Ohzeki

In this paper, we propose a novel technique to implement stochastic gradient methods, which are beneficial for learning from large datasets, through accelerated stochastic dynamics. A stochastic gradient method is based on mini-batch learning for reducing the computational cost when the amount of data is large. The stochasticity of the gradient can be mitigated by the injection of Gaussian noise, which yields the stochastic Langevin gradient method; this method can be used for Bayesian posterior sampling. However, the performance of the stochastic Langevin gradient method depends on the mixing rate of the stochastic dynamics. In this study, we propose violating the detailed balance condition to enhance the mixing rate. Recent studies have revealed that violating the detailed balance condition accelerates the convergence to a stationary state and reduces the correlation time between the samplings. We implement this violation of the detailed balance condition in the stochastic gradient Langevin method and test our method for a simple model to demonstrate its performance.
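A minimal sketch of the baseline this abstract builds on, the stochastic gradient Langevin method, is given below for a toy Gaussian-mean model. The model, the prior variance, the step size, and the function name `sgld_mean_samples` are assumptions chosen for illustration, and the sketch does not include the violation of detailed balance that the paper proposes.

```python
import math
import random


def sgld_mean_samples(data, n_steps, step, batch_size, prior_var=10.0, seed=0):
    """Sample the posterior of theta for x ~ N(theta, 1), theta ~ N(0, prior_var),
    using mini-batch gradients plus injected Gaussian noise."""
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_prior = -theta / prior_var
        # Mini-batch estimate of the full-data log-likelihood gradient.
        grad_lik = (n / batch_size) * sum(x - theta for x in batch)
        noise = math.sqrt(step) * rng.gauss(0.0, 1.0)
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples


data = [2.0 + 0.5 * math.sin(i) for i in range(100)]
samples = sgld_mean_samples(data, n_steps=3000, step=1e-3, batch_size=10)
kept = samples[500:]  # discard burn-in
estimate = sum(kept) / len(kept)
posterior_mean = sum(data) / (len(data) + 1.0 / 10.0)  # exact for this model
```

How quickly `estimate` settles depends on the mixing rate of the dynamics, which is the quantity the abstract aims to improve.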

ML Mar 11, 2015
L_1-regularized Boltzmann machine learning using majorizer minimization

Masayuki Ohzeki

We propose an inference method to estimate sparse interactions and biases according to Boltzmann machine learning. The basis of this method is $L_1$ regularization, which is often used in compressed sensing, a technique for reconstructing sparse input signals from undersampled outputs. $L_1$ regularization impedes the simple application of the gradient method, which optimizes the cost function that leads to accurate estimations, owing to the cost function's lack of smoothness. In this study, we utilize the majorizer minimization method, which is a well-known technique implemented in optimization problems, to avoid the non-smoothness of the cost function. By using the majorizer minimization method, we elucidate essentially relevant biases and interactions from given data with seemingly strongly-correlated components.
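The way a majorizer sidesteps the non-smoothness of the $L_1$ term can be sketched with a quadratic majorizer of the smooth part, whose minimizer is a soft-thresholding update. This is a generic illustration, not the paper's algorithm: a simple quadratic cost stands in for the Boltzmann machine likelihood, and `mm_l1_step` and the curvature bound `lipschitz` are hypothetical names.

```python
def soft_threshold(v, t):
    """Minimizer over x of 0.5 * (x - v)^2 + t * |x|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def mm_l1_step(theta, grad, lipschitz, lam):
    """One majorizer-minimization step for f(theta) + lam * ||theta||_1.

    The smooth part f is replaced by a quadratic upper bound with curvature
    `lipschitz` around the current point; the bound plus the L1 term is then
    minimized exactly, coordinate by coordinate.
    """
    return [soft_threshold(t - g / lipschitz, lam / lipschitz)
            for t, g in zip(theta, grad)]


# Stand-in smooth cost: f(theta) = 0.5 * sum((theta_i - a_i)^2), gradient theta - a.
a = [3.0, -0.2, 0.5]
theta = [0.0, 0.0, 0.0]
grad = [t - ai for t, ai in zip(theta, a)]
theta = mm_l1_step(theta, grad, lipschitz=1.0, lam=0.5)
```

Components whose magnitude does not exceed the threshold are set exactly to zero, which is how sparse estimates of biases and interactions emerge from the iteration.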

ML Jan 19, 2015
Statistical-mechanical analysis of pre-training and fine tuning in deep learning

Masayuki Ohzeki

In this paper, we present a statistical-mechanical analysis of deep learning. We elucidate some of the essential components of deep learning: pre-training by unsupervised learning and fine tuning by supervised learning. We formulate the extraction of features from the training data as a margin criterion in a high-dimensional feature-vector space. The self-organized classifier is then supplied with small amounts of labelled data, as in deep learning. Although we employ a simple single-layer perceptron model, rather than directly analyzing a multi-layer neural network, we find a nontrivial phase transition that is dependent on the number of unlabelled data in the generalization error of the resultant classifier. In this sense, we evaluate the efficacy of the unsupervised learning component of deep learning. The analysis is performed by the replica method, which is a sophisticated tool in statistical mechanics. We validate our result in the manner of deep learning, using a simple iterative algorithm to learn the weight vector on the basis of belief propagation.

CV Jan 5, 2015
Inverse Renormalization Group Transformation in Bayesian Image Segmentations

Kazuyuki Tanaka, Shun Kataoka, Muneki Yasuda et al.

A new Bayesian image segmentation algorithm is proposed by combining loopy belief propagation with an inverse real-space renormalization group transformation to reduce the computational time. In our experiments, we observe that the proposed method can reduce the computational time to less than one-tenth of that taken by conventional Bayesian approaches.

ML Oct 14, 2014
Detection of cheating by decimation algorithm

Shogo Yamanaka, Masayuki Ohzeki, Aurelien Decelle

We expand the item response theory to study the case of "cheating students" for a set of exams, trying to detect them by applying a greedy algorithm of inference. This extended model is closely related to the Boltzmann machine learning. In this paper we aim to infer the correct biases and interactions of our model by considering a relatively small number of sets of training data. Nevertheless, the greedy algorithm that we employed in the present study exhibits good performance with a small number of training data. The key point is the sparseness of the interactions in our problem in the context of the Boltzmann machine learning: the existence of cheating students is expected to be very rare (possibly even in the real world). We compare a standard approach to infer the sparse interactions in the Boltzmann machine learning to our greedy algorithm and we find the latter to be superior in several aspects.