Tsuyoshi Okubo

LG
h-index12
3papers
16citations
Novelty65%
AI Score31

3 Papers

MLJul 5, 2023
Universal Scaling Laws of Absorbing Phase Transitions in Artificial Deep Neural Networks

Keiichi Tamai, Tsuyoshi Okubo, Truong Vinh Truong Duy et al.

We demonstrate that conventional artificial deep neural networks operating near the phase boundary of the signal propagation dynamics, also known as the edge of chaos, exhibit universal scaling laws of absorbing phase transitions in non-equilibrium statistical mechanics. We exploit the fully deterministic nature of the propagation dynamics to elucidate an analogy between a signal collapse in the neural networks and an absorbing state (a state that the system can enter but cannot escape from). Our numerical results indicate that the multilayer perceptrons and the convolutional neural networks belong to the mean-field and the directed percolation universality classes, respectively. Also, the finite-size scaling is successfully applied, suggesting a potential connection to the depth-width trade-off in deep learning. Furthermore, our analysis of the training dynamics under the gradient descent reveals that hyperparameter tuning to the phase boundary is necessary but insufficient for achieving optimal generalization in deep networks. Remarkably, nonuniversal metric factors associated with the scaling laws are shown to play a significant role in concretizing the above observations. These findings highlight the usefulness of the notion of criticality for analyzing the behavior of artificial deep neural networks and offer new insights toward a unified understanding of the essential relationship between criticality and intelligence.

LGAug 20, 2024
Tensor tree learns hidden relational structures in data to construct generative models

Kenji Harada, Tsuyoshi Okubo, Naoki Kawashima

Based on the tensor tree network with the Born machine framework, we propose a general method for constructing a generative model by expressing the target distribution function as the amplitude of the quantum wave function represented by a tensor tree. The key idea is dynamically optimizing the tree structure that minimizes the bond mutual information. The proposed method offers enhanced performance and uncovers hidden relational structures in the target data. We illustrate potential practical applications with four examples: (i) random patterns, (ii) QMNIST handwritten digits, (iii) Bayesian networks, and (iv) the pattern of stock price fluctuation pattern in S&P500. In (i) and (ii), the strongly correlated variables were concentrated near the center of the network; in (iii), the causality pattern was identified; and in (iv), a structure corresponding to the eleven sectors emerged.

LGApr 9, 2025
Plastic tensor networks for interpretable generative modeling

Katsuya O. Akamatsu, Kenji Harada, Tsuyoshi Okubo et al.

A structural optimization scheme for a single-layer nonnegative adaptive tensor tree (NATT) that models a target probability distribution is proposed as an alternative paradigm for generative modeling. The NATT scheme, by construction, automatically searches for a tree structure that best fits a given discrete dataset whose features serve as inputs, and has the advantage that it is interpretable as a probabilistic graphical model. We consider the NATT scheme and a recently proposed Born machine adaptive tensor tree (BMATT) optimization scheme and demonstrate their effectiveness on a variety of generative modeling tasks where the objective is to infer the hidden structure of a provided dataset. Our results show that in terms of minimizing the negative log-likelihood, the single-layer scheme has model performance comparable to the Born machine scheme, though not better. The tasks include deducing the structure of binary bitwise operations, learning the internal structure of random Bayesian networks given only visible sites, and a real-world example related to hierarchical clustering where a cladogram is constructed from mitochondrial DNA sequences. In doing so, we also show the importance of the choice of network topology and the versatility of a least-mutual information criterion in selecting a candidate structure for a tensor tree, as well as discuss aspects of these tensor tree generative models including their information content and interpretability.