NEJan 3, 2023Code
Increasing biases can be more efficient than increasing weightsCarlo Metta, Marco Fantozzi, Andrea Papini et al.
We introduce a novel computational unit for neural networks that features multiple biases, challenging the traditional perceptron structure. This unit emphasizes the importance of preserving uncorrupted information as it is passed from one unit to the next, applying activation functions later in the process with specialized biases for each unit. Through both empirical and theoretical analyses, we show that by focusing on increasing biases rather than weights, there is potential for significant enhancement in a neural network model's performance. This approach offers an alternative perspective on optimizing information flow within neural networks. See source code at https://github.com/CuriosAI/dac-dev.
LGNov 27, 2023
GloNets: Globally Connected Neural NetworksAntonio Di Cecco, Carlo Metta, Marco Fantozzi et al.
Deep learning architectures suffer from depth-related performance degradation, limiting the effective depth of neural networks. Approaches like ResNet are able to mitigate this, but they do not completely eliminate the problem. We introduce Globally Connected Neural Networks (GloNet), a novel architecture overcoming depth-related issues, designed to be superimposed on any model, enhancing its depth without increasing complexity or reducing performance. With GloNet, the network's head uniformly receives information from all parts of the network, regardless of their level of abstraction. This enables GloNet to self-regulate information flow during training, reducing the influence of less effective deeper layers, and allowing for stable training irrespective of network depth. This paper details GloNet's design, its theoretical basis, and a comparison with existing similar architectures. Experiments show GloNet's self-regulation ability and resilience to depth-related learning challenges, like performance degradation. Our findings suggest GloNet as a strong alternative to traditional architectures like ResNets.
LGSep 20, 2024
Achieving Predictive Precision: Leveraging LSTM and Pseudo Labeling for Volvo's Discovery Challenge at ECML-PKDD 2024Carlo Metta, Marco Gregnanin, Andrea Papini et al.
This paper presents the second-place methodology in the Volvo Discovery Challenge at ECML-PKDD 2024, where we used Long Short-Term Memory networks and pseudo-labeling to predict maintenance needs for a component of Volvo trucks. We processed the training data to mirror the test set structure and applied a base LSTM model to label the test data iteratively. This approach refined our model's predictive capabilities and culminated in a macro-average F1-score of 0.879, demonstrating robust performance in predictive maintenance. This work provides valuable insights for applying machine learning techniques effectively in industrial settings.
LGJun 18, 2024Code
A Systematization of the Wagner Framework: Graph Theory Conjectures and Reinforcement LearningFlora Angileri, Giulia Lombardi, Andrea Fois et al.
In 2021, Adam Zsolt Wagner proposed an approach to disprove conjectures in graph theory using Reinforcement Learning (RL). Wagner's idea can be framed as follows: consider a conjecture, such as a certain quantity f(G) < 0 for every graph G; one can then play a single-player graph-building game, where at each turn the player decides whether to add an edge or not. The game ends when all edges have been considered, resulting in a certain graph G_T, and f(G_T) is the final score of the game; RL is then used to maximize this score. This brilliant idea is as simple as innovative, and it lends itself to systematic generalization. Several different single-player graph-building games can be employed, along with various RL algorithms. Moreover, RL maximizes the cumulative reward, allowing for step-by-step rewards instead of a single final score, provided the final cumulative reward represents the quantity of interest f(G_T). In this paper, we discuss these and various other choices that can be significant in Wagner's framework. As a contribution to this systematization, we present four distinct single-player graph-building games. Each game employs both a step-by-step reward system and a single final score. We also propose a principled approach to select the most suitable neural network architecture for any given conjecture, and introduce a new dataset of graphs labeled with their Laplacian spectra. Furthermore, we provide a counterexample for a conjecture regarding the sum of the matching number and the spectral radius, which is simpler than the example provided in Wagner's original paper. The games have been implemented as environments in the Gymnasium framework, and along with the dataset, are available as open-source supplementary materials.
AIJan 31, 2022
Score vs. Winrate in Score-Based Games: which Reward for Reinforcement Learning?Luca Pasqualini, Gianluca Amato, Marco Fantozzi et al.
In the last years, the DeepMind algorithm AlphaZero has become the state of the art to efficiently tackle perfect information two-player zero-sum games with a win/lose outcome. However, when the win/lose outcome is decided by a final score difference, AlphaZero may play score-suboptimal moves because all winning final positions are equivalent from the win/lose outcome perspective. This can be an issue, for instance when used for teaching, or when trying to understand whether there is a better move. Moreover, there is the theoretical quest for the perfect game. A naive approach would be training an AlphaZero-like agent to predict score differences instead of win/lose outcomes. Since the game of Go is deterministic, this should as well produce an outcome-optimal play. However, it is a folklore belief that "this does not work". In this paper, we first provide empirical evidence for this belief. We then give a theoretical interpretation of this suboptimality in general perfect information two-player zero-sum game where the complexity of a game like Go is replaced by the randomness of the environment. We show that an outcome-optimal policy has a different preference for uncertainty when it is winning or losing. In particular, when in a losing state, an outcome-optimal agent chooses actions leading to a higher score variance. We then posit that when approximation is involved, a deterministic game behaves like a nondeterministic game, where the score variance is modeled by how uncertain the position is. We validate this hypothesis in AlphaZero-like software with a human expert.
AIMay 26, 2019
SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version)Francesco Morandin, Gianluca Amato, Marco Fantozzi et al.
We develop a new model that can be applied to any perfect information two-player zero-sum game to target a high score, and thus a perfect play. We integrate this model into the Monte Carlo tree search-policy iteration learning pipeline introduced by Google DeepMind with AlphaGo. Training this model on 9x9 Go produces a superhuman Go player, thus proving that it is stable and robust. We show that this model can be used to effectively play with both positional and score handicap, and to minimize suboptimal moves. We develop a family of agents that can target high scores against any opponent, and recover from very severe disadvantage against weak opponents. To the best of our knowledge, these are the first effective achievements in this direction.
AISep 11, 2018
SAI, a Sensible Artificial Intelligence that plays GoFrancesco Morandin, Gianluca Amato, Rosa Gini et al.
We propose a multiple-komi modification of the AlphaGo Zero/Leela Zero paradigm. The winrate as a function of the komi is modeled with a two-parameters sigmoid function, so that the neural network must predict just one more variable to assess the winrate for all komi values. A second novel feature is that training is based on self-play games that occasionally branch -- with changed komi -- when the position is uneven. With this setting, reinforcement learning is showed to work on 7x7 Go, obtaining very strong playing agents. As a useful byproduct, the sigmoid parameters given by the network allow to estimate the score difference on the board, and to evaluate how much the game is decided.