Mauro Castelli

h-index41

7papers

159citations

Novelty29%

AI Score19

Ranked #186,355 of 194,257 authors (top 96%)#960 in NE (top 90%)

7 Papers

7.3NEJun 8, 2021

GSGP-CUDA -- a CUDA framework for Geometric Semantic Genetic Programming

Leonardo Trujillo, Jose Manuel Muñoz Contreras, Daniel E Hernandez et al.

Geometric Semantic Genetic Programming (GSGP) is a state-of-the-art machine learning method based on evolutionary computation. GSGP performs search operations directly at the level of program semantics, which can be done more efficiently then operating at the syntax level like most GP systems. Efficient implementations of GSGP in C++ exploit this fact, but not to its full potential. This paper presents GSGP-CUDA, the first CUDA implementation of GSGP and the most efficient, which exploits the intrinsic parallelism of GSGP using GPUs. Results show speedups greater than 1,000X relative to the state-of-the-art sequential implementation.

10.4NEJun 3, 2021

Salp Swarm Optimization: a Critical Review

Mauro Castelli, Luca Manzoni, Luca Mariot et al.

In the crowded environment of bio-inspired population-based metaheuristics, the Salp Swarm Optimization (SSO) algorithm recently appeared and immediately gained a lot of momentum. Inspired by the peculiar spatial arrangement of salp colonies, which are displaced in long chains following a leader, this algorithm seems to provide an interesting optimization performance. However, the original work was characterized by some conceptual and mathematical flaws, which influenced all ensuing papers on the subject. In this manuscript, we perform a critical review of SSO, highlighting all the issues present in the literature and their negative effects on the optimization process carried out by this algorithm. We also propose a mathematically correct version of SSO, named Amended Salp Swarm Optimizer (ASSO) that fixes all the discussed problems. We benchmarked the performance of ASSO on a set of tailored experiments, showing that it is able to achieve better results than the original SSO. Finally, we performed an extensive study aimed at understanding whether SSO and its variants provide advantages compared to other metaheuristics. The experimental results, where SSO cannot outperform simple well-known metaheuristics, suggest that the scientific community can safely abandon SSO.

0.7CLApr 23, 2020

Towards an evolutionary-based approach for natural language processing

Luca Manzoni, Domagoj Jakobovic, Luca Mariot et al.

Tasks related to Natural Language Processing (NLP) have recently been the focus of a large research endeavor by the machine learning community. The increased interest in this area is mainly due to the success of deep learning methods. Genetic Programming (GP), however, was not under the spotlight with respect to NLP tasks. Here, we propose a first proof-of-concept that combines GP with the well established NLP tool word2vec for the next word prediction task. The main idea is that, once words have been moved into a vector space, traditional GP operators can successfully work on vectors, thus producing meaningful words as the output. To assess the suitability of this approach, we perform an experimental evaluation on a set of existing newspaper headlines. Individuals resulting from this (pre-)training phase can be employed as the initial population in other NLP tasks, like sentence generation, which will be the focus of future investigations, possibly employing adversarial co-evolutionary approaches.

2.9NEJul 3, 2017

A Distance Between Populations for n-Points Crossover in Genetic Algorithms

Mauro Castelli, Gianpiero Cattaneo, Luca Manzoni et al.

Genetic algorithms (GAs) are an optimization technique that has been successfully used on many real-world problems. There exist different approaches to their theoretical study. In this paper we complete a recently presented approach to model one-point crossover using pretopologies (or Cech topologies) in two ways. First, we extend it to the case of n-points crossover. Then, we experimentally study how the distance distribution changes when the number of crossover points increases.

4.1NEJun 19, 2017

Unsure When to Stop? Ask Your Semantic Neighbors

Ivo Gonçalves, Sara Silva, Carlos M. Fonseca et al.

In iterative supervised learning algorithms it is common to reach a point in the search where no further induction seems to be possible with the available data. If the search is continued beyond this point, the risk of overfitting increases significantly. Following the recent developments in inductive semantic stochastic methods, this paper studies the feasibility of using information gathered from the semantic neighborhood to decide when to stop the search. Two semantic stopping criteria are proposed and experimentally assessed in Geometric Semantic Genetic Programming (GSGP) and in the Semantic Learning Machine (SLM) algorithm (the equivalent algorithm for neural networks). The experiments are performed on real-world high-dimensional regression datasets. The results show that the proposed semantic stopping criteria are able to detect stopping points that result in a competitive generalization for both GSGP and SLM. This approach also yields computationally efficient algorithms as it allows the evolution of neural networks in less than 3 seconds on average, and of GP trees in at most 10 seconds. The usage of the proposed semantic stopping criteria in conjunction with the computation of optimal mutation/learning steps also results in small trees and neural networks.

2.6LGApr 27, 2017

Learning the structure of Bayesian Networks: A quantitative assessment of the effect of different algorithmic schemes

Stefano Beretta, Mauro Castelli, Ivo Goncalves et al.

One of the most challenging tasks when adopting Bayesian Networks (BNs) is the one of learning their structure from data. This task is complicated by the huge search space of possible solutions, and by the fact that the problem is NP-hard. Hence, full enumeration of all the possible solutions is not always feasible and approximations are often required. However, to the best of our knowledge, a quantitative analysis of the performance and characteristics of the different heuristics to solve this problem has never been done before. For this reason, in this work, we provide a detailed comparison of many different state-of-the-arts methods for structural learning on simulated data considering both BNs with discrete and continuous variables, and with different rates of noise in the data. In particular, we investigate the performance of different widespread scores and algorithmic approaches proposed for the inference and the statistical pitfalls within them.

4.5NEAug 12, 2012

An Efficient Genetic Programming System with Geometric Semantic Operators and its Application to Human Oral Bioavailability Prediction

Mauro Castelli, Luca Manzoni, Leonardo Vanneschi

Very recently new genetic operators, called geometric semantic operators, have been defined for genetic programming. Contrarily to standard genetic operators, which are uniquely based on the syntax of the individuals, these new operators are based on their semantics, meaning with it the set of input-output pairs on training data. Furthermore, these operators present the interesting property of inducing a unimodal fitness landscape for every problem that consists in finding a match between given input and output data (for instance regression and classification). Nevertheless, the current definition of these operators has a serious limitation: they impose an exponential growth in the size of the individuals in the population, so their use is impossible in practice. This paper is intended to overcome this limitation, presenting a new genetic programming system that implements geometric semantic operators in an extremely efficient way. To demonstrate the power of the proposed system, we use it to solve a complex real-life application in the field of pharmacokinetic: the prediction of the human oral bioavailability of potential new drugs. Besides the excellent performances on training data, which were expected because the fitness landscape is unimodal, we also report an excellent generalization ability of the proposed system, at least for the studied application. In fact, it outperforms standard genetic programming and a wide set of other well-known machine learning methods.