Giovanni di Sarra

h-index: 3
2 papers

2 Papers

21.2 DIS-NN May 18
Activation Functions, Statistics and Learning of Higher-Order Interactions in Restricted Boltzmann Machines

Giovanni di Sarra, Yasser Roudi

The great success of neural networks in recognizing hidden patterns and correlations in complex data lies in the way they jointly exploit a large number of parameters and nonlinear single-unit activations. Restricted Boltzmann Machines (RBMs) provide a simple yet powerful framework for studying the impact of activation nonlinearities on performance and representation. In this work, we exploit the duality between RBMs and models of interacting binary variables to study the statistics of the interactions induced by RBM ensembles with different hidden-unit activation functions. We characterize the space of representable models analytically in terms of moments of the distribution of induced interactions for four commonly used activation functions: Linear, Step, ReLU, and Exponential. The quantitative predictions of these analytical calculations agree very well with simulations of the training process. In particular, our analysis shows that certain data structures, namely those generated by models of interacting variables with large interaction terms beyond pairwise, are difficult to represent, and thus to learn, for any RBM. Yet we find that rapidly increasing nonlinearities, such as the Exponential function, can facilitate the representation and learning of such data structures for a specific range of parameters that is determined analytically.
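The duality mentioned in this abstract can be illustrated with a small brute-force sketch. It is not the authors' method or code. It assumes binary {0,1} visible units and writes the unnormalised log-probability of a visible configuration, after summing out the hidden units, as `a.v + sum_mu gamma(b_mu + W_mu.v)`, where `gamma` is the log-partition function of a single hidden unit and its derivative plays the role of the activation function. The names `log_weight`, `induced_interactions`, `softplus`, and `quadratic`, and all parameter values, are hypothetical. Only two textbook choices are shown (binary hidden units with sigmoid activation, Gaussian hidden units with linear activation); the paper's own parametrisation of the Step, ReLU, and Exponential cases may differ.

```python
import itertools
import math

def log_weight(v, a, b, W, gamma):
    """Unnormalised log-probability of visible configuration v after
    summing out the hidden units: a.v + sum_mu gamma(b_mu + W_mu.v)."""
    total = sum(a_i * v_i for a_i, v_i in zip(a, v))
    for b_mu, w_mu in zip(b, W):
        total += gamma(b_mu + sum(w * v_i for w, v_i in zip(w_mu, v)))
    return total

def induced_interactions(a, b, W, gamma):
    """Interaction coefficient J_S for every non-empty subset S of visible
    units, obtained by Moebius inversion on the Boolean lattice:
    J_S = sum over T subset of S of (-1)^(|S|-|T|) * F(indicator of T)."""
    n = len(a)
    J = {}
    for order in range(1, n + 1):
        for S in itertools.combinations(range(n), order):
            total = 0.0
            for k in range(order + 1):
                for T in itertools.combinations(S, k):
                    v = [1 if i in T else 0 for i in range(n)]
                    total += (-1) ** (order - k) * log_weight(v, a, b, W, gamma)
            J[S] = total
    return J

def softplus(x):
    # Assumed case: binary {0,1} hidden units, i.e. sigmoid activation.
    return math.log1p(math.exp(x))

def quadratic(x):
    # Assumed case: unit-variance Gaussian hidden units, i.e. linear activation.
    return 0.5 * x * x

# Hypothetical toy parameters: 3 visible units, 2 hidden units.
a = [0.1, -0.2, 0.3]
b = [0.5, -0.4]
W = [[1.0, -0.7, 0.4], [0.3, 0.9, -1.2]]

J_linear = induced_interactions(a, b, W, quadratic)
J_sigmoid = induced_interactions(a, b, W, softplus)
print("third-order interaction, linear :", J_linear[(0, 1, 2)])
print("third-order interaction, sigmoid:", J_sigmoid[(0, 1, 2)])
```

With the quadratic `gamma`, the marginal log-probability is a quadratic form in `v`, so every interaction beyond pairwise vanishes up to floating-point error and the pairwise term reduces to `sum_mu W[mu][i] * W[mu][j]`. The softplus case produces a nonzero third-order term, which is the kind of higher-order induced interaction the abstract refers to. The enumeration is exponential in the number of visible units and is meant only for toy sizes.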

DIS-NN Jan 8, 2025
The unbearable lightness of Restricted Boltzmann Machines: Theoretical Insights and Biological Applications

Giovanni di Sarra, Barbara Bravi, Yasser Roudi

Restricted Boltzmann Machines (RBMs) are simple yet powerful neural networks. They can be used for learning structure in data and serve as building blocks of more complex neural architectures. At the same time, their simplicity makes them easy to use and amenable to theoretical analysis, and yields interpretable models in applications. Here, we review the role that activation functions, which describe the input-output relationship of single neurons in an RBM, play in the functionality of these models. We discuss recent theoretical results on the benefits and limitations of different activation functions. We also review applications to biological data analysis, ranging from neural data analysis, where RBMs are mostly taken to have binary units with sigmoid activation functions, to protein data analysis and immunology, where non-binary units and non-sigmoid activation functions have recently been shown to yield important insights into the data. Finally, we discuss open problems whose resolution can shed light on broader issues in neural network research.