Stefano Nolfi

NE
h-index: 9
14 papers
137 citations
Novelty: 46%
AI Score: 41

14 Papers

AI · Aug 9, 2023
On the Unexpected Abilities of Large Language Models

Stefano Nolfi

Large Language Models (LLMs) are capable of displaying a wide range of abilities that are not directly connected with the task for which they are trained: predicting the next words of human-written texts. In this article, I review recent research investigating the cognitive abilities developed by LLMs and their relation to human cognition. I discuss the nature of the indirect process that leads to the acquisition of these cognitive abilities, their relation to other indirect processes, and the implications for the acquisition of integrated abilities. Moreover, I propose the factors that enable the development of abilities that are related only very indirectly to the proximal objective of the training task. Finally, I discuss whether the full set of capabilities that LLMs could possibly develop is predictable.

NE · Aug 4, 2022
The Role of Morphological Variation in Evolutionary Robotics: Maximizing Performance and Robustness

Jônata Tyska Carvalho, Stefano Nolfi

Exposing an evolutionary algorithm used to evolve robot controllers to variable conditions is necessary to obtain solutions that are robust and can cross the reality gap. However, we do not yet have methods for analyzing and understanding how the varying morphological conditions affect the evolutionary process, and therefore for choosing suitable variation ranges. By morphological conditions, we refer to the starting state of the robot, and to variations in its sensor readings during operation due to noise. In this article, we introduce a method that permits us to measure the impact of these morphological variations, and we analyze the relation between the amplitude of variations, the modality with which they are introduced, and the performance and robustness of evolving agents. Our results demonstrate that (i) the evolutionary algorithm can tolerate morphological variations which have a very high impact, (ii) variations affecting the actions of the agent are tolerated much better than variations affecting the initial state of the agent or of the environment, and (iii) improving the accuracy of the fitness measure through multiple evaluations is not always useful. Moreover, our results show that morphological variations make it possible to generate solutions that perform better in both varying and non-varying conditions.

AI · May 16, 2022
Qualitative Differences Between Evolutionary Strategies and Reinforcement Learning Methods for Control of Autonomous Agents

Nicola Milano, Stefano Nolfi

In this paper we analyze the qualitative differences between evolutionary strategies and reinforcement learning algorithms by focusing on two popular state-of-the-art algorithms: the OpenAI-ES evolutionary strategy and the Proximal Policy Optimization (PPO) reinforcement learning algorithm, the most similar methods of the two families. We analyze how the methods differ with respect to: (i) general efficacy, (ii) ability to cope with sparse rewards, (iii) propensity/capacity to discover minimal solutions, (iv) dependency on reward shaping, and (v) ability to cope with variations of the environmental conditions. The analysis of the performance and of the behavioral strategies displayed by the agents trained with the two methods on benchmark problems enables us to demonstrate qualitative differences which were not identified in previous studies, to identify the relative weaknesses of the two methods, and to propose ways to ameliorate some of those weaknesses. We show that the characteristics of the reward function have a strong impact which varies qualitatively not only for OpenAI-ES and PPO but also for alternative reinforcement learning algorithms, thus demonstrating the importance of tailoring the characteristics of the reward function to the algorithm used.

AI · Jan 30
Alignment among Language, Vision and Action Representations

Nicola Milano, Stefano Nolfi

A fundamental question in cognitive science and AI concerns whether different learning modalities (language, vision, and action) give rise to distinct or shared internal representations. Traditional views assume that models trained on different data types develop specialized, non-transferable representations. However, recent evidence suggests unexpected convergence: models optimized for distinct tasks may develop similar representational geometries. We investigate whether this convergence extends to embodied action learning by training a transformer-based agent to execute goal-directed behaviors in response to natural language instructions. Using behavioral cloning on the BabyAI platform, we generated action-grounded language embeddings shaped exclusively by sensorimotor control requirements. We then compared these representations with those extracted from state-of-the-art large language models (LLaMA, Qwen, DeepSeek, BERT) and vision-language models (CLIP, BLIP). Despite substantial differences in training data, modality, and objectives, we observed robust cross-modal alignment. Action representations aligned strongly with decoder-only language models and BLIP (precision@15: 0.70-0.73), approaching the alignment observed among language models themselves. Alignment with CLIP and BERT was significantly weaker. These findings indicate that linguistic, visual, and action representations converge toward partially shared semantic structures, supporting modality-independent semantic organization and highlighting potential for cross-domain transfer in embodied AI systems.
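As a rough illustration of how a precision@k alignment score between two embedding spaces might be computed, the sketch below compares, for each item, its k nearest neighbors in one space with its k nearest neighbors in the other and averages the overlap. The abstract does not define the metric, so this neighbor-overlap definition, the use of cosine similarity, and the names `cosine`, `nearest_neighbors`, and `precision_at_k` are all assumptions made for illustration.

```python
import math
from typing import Sequence, Set

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(space: Sequence[Vector], i: int, k: int) -> Set[int]:
    """Indices of the k items most similar to item i (item i itself excluded)."""
    ranked = sorted(
        (j for j in range(len(space)) if j != i),
        key=lambda j: (-cosine(space[i], space[j]), j),  # ties broken by index
    )
    return set(ranked[:k])


def precision_at_k(space_a: Sequence[Vector], space_b: Sequence[Vector], k: int) -> float:
    """Average fraction of k-nearest neighbors shared by the two spaces.

    Assumes row i of both spaces embeds the same item (hypothetical setup).
    """
    n = len(space_a)
    shared = sum(
        len(nearest_neighbors(space_a, i, k) & nearest_neighbors(space_b, i, k))
        for i in range(n)
    )
    return shared / (n * k)


# Toy data: the second space is a rescaled copy of the first, so the
# neighborhoods coincide; the third space pairs the items differently.
space_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
space_b = [[2.0, 0.0], [1.8, 0.2], [0.0, 2.0], [0.2, 1.8]]
space_c = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]

aligned = precision_at_k(space_a, space_b, k=1)
misaligned = precision_at_k(space_a, space_c, k=1)
```

Under this definition a score of 1.0 means every item has the same neighborhood in both spaces, and a score near 0 means the neighborhoods are unrelated.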

AI · Jun 5, 2025
Sensory-Motor Control with Large Language Models via Iterative Policy Refinement

Jônata Tyska Carvalho, Stefano Nolfi

We propose a method that enables large language models (LLMs) to control embodied agents through the generation of control policies that directly map continuous observation vectors to continuous action vectors. At the outset, the LLMs generate a control strategy based on a textual description of the agent, its environment, and the intended goal. This strategy is then iteratively refined through a learning process in which the LLMs are repeatedly prompted to improve the current strategy, using performance feedback and sensory-motor data collected during its evaluation. The method is validated on classic control tasks from the Gymnasium library and the inverted pendulum task from the MuJoCo library. The approach proves effective with relatively compact models such as GPT-oss:120b and Qwen2.5:72b. In most cases, it successfully identifies optimal or near-optimal solutions by integrating symbolic knowledge derived through reasoning with sub-symbolic sensory-motor data gathered as the agent interacts with its environment.

NE · Feb 17, 2021
Automated Curriculum Learning for Embodied Agents: A Neuroevolutionary Approach

Nicola Milano, Stefano Nolfi

We demonstrate how an evolutionary algorithm can be extended with a curriculum learning process that automatically selects the environmental conditions in which the evolving agents are evaluated. The environmental conditions are selected so as to adjust the level of difficulty to the ability level of the current evolving agents and to challenge their weaknesses. The method does not require domain knowledge and does not introduce additional hyperparameters. The results collected on two benchmark problems, which require solving a task in significantly varying environmental conditions, demonstrate that the proposed method outperforms conventional algorithms and generates solutions that are robust to variations.

RO · Nov 23, 2020
The Dynamic of Body and Brain Co-Evolution

Paolo Pagliuca, Stefano Nolfi

We introduce a method for co-evolving the body and the control properties of robots. It can be used to adapt the morphological traits of robots with a hand-designed morphological bauplan or to evolve the morphological bauplan as well. Our results indicate that robots with co-adapted body and control traits outperform robots with fixed hand-designed morphologies. Interestingly, the advantage is not due to the selection of better morphologies but rather to the mutual scaffolding process that results from the possibility of co-adapting the morphological traits to the control traits and vice versa. Our results also demonstrate that morphological variations do not necessarily have destructive effects on robot skills.

LG · Sep 15, 2020
Autonomous Learning of Features for Control: Experiments with Embodied and Situated Agents

Nicola Milano, Stefano Nolfi

As discussed in previous studies, the efficacy of evolutionary or reinforcement learning algorithms for continuous control optimization can be enhanced by including a neural module dedicated to feature extraction trained through self-supervised methods. In this paper we report additional experiments supporting this hypothesis and we demonstrate that the advantage provided by feature extraction is not limited to problems that benefit from dimensionality reduction or that involve agents operating on the basis of allocentric perception. We introduce a method that makes it possible to continue training the feature-extraction module during the training of the policy network and that increases the efficacy of feature extraction. Finally, we compare alternative feature-extracting methods and we show that sequence-to-sequence learning yields better results than the methods considered in previous studies.

NE · Dec 11, 2019
Efficacy of Modern Neuro-Evolutionary Strategies for Continuous Control Optimization

Paolo Pagliuca, Nicola Milano, Stefano Nolfi

We analyze the efficacy of modern neuro-evolutionary strategies for continuous control optimization. Overall, the results collected on a wide variety of qualitatively different benchmark problems indicate that these methods are generally effective and scale well with respect to the number of parameters and the complexity of the problem. Moreover, they are relatively robust with respect to the setting of hyper-parameters. The comparison of the most promising methods indicates that the OpenAI-ES algorithm outperforms or equals the other algorithms on all considered problems. Moreover, we demonstrate that the reward functions optimized for reinforcement learning methods are not necessarily effective for evolutionary strategies and vice versa. This finding can lead to a reconsideration of the relative efficacy of the two classes of algorithms, since it implies that the comparisons performed to date are biased toward one or the other class.

NE · Sep 18, 2019
Long-Term Progress and Behavior Complexification in Competitive Co-Evolution

Luca Simione, Stefano Nolfi

The use of competitive evolutionary algorithms to generate long-term progress is normally prevented by convergence on limit-cycle dynamics, in which the evolving agents keep progressing against their current competitors by periodically rediscovering previously adopted solutions. This leads to local but not to global progress, i.e. progress against all possible competitors. We propose a new competitive algorithm that produces long-term global progress by identifying and filtering out opportunistic variations, i.e. variations leading to progress against current competitors and retrogression against other competitors. The efficacy of the method is validated on the co-evolution of predator and prey robots, a classic problem that has been used in related research. The accumulation of global progress over many generations leads to effective solutions that involve the production of rather articulated behaviors. The complexity of the behavior displayed by the evolving robots increases across generations, although progress in performance is not always accompanied by behavior complexification.

NE · Oct 22, 2018
Scaling Up Cartesian Genetic Programming through Preferential Selection of Larger Solutions

Nicola Milano, Stefano Nolfi

We demonstrate how the efficiency of the Cartesian Genetic Programming method can be scaled up through the preferential selection of phenotypically larger solutions, i.e. through the preferential selection of larger solutions among equally good solutions. The advantage of the preferential selection of larger solutions is validated on the six, seven and eight-bit parity problems, on a dynamically varying problem involving the classification of binary patterns, and on the Paige regression problem. In all cases, the preferential selection of larger solutions provides an advantage in terms of the performance of the evolved solutions and in terms of speed, i.e. the number of evaluations required to evolve optimal or high-quality solutions. The advantage provided by the preferential selection of larger solutions can be further extended by self-adapting the mutation rate through the one-fifth success rule. Finally, for problems like the Paige regression in which neutrality plays a minor role, the advantage of the preferential selection of larger solutions can be extended by preferring larger solutions also among quasi-neutral alternative candidate solutions, i.e. solutions achieving slightly different performance.
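The selection rule described in this abstract, preferring the larger solution among equally good ones, can be sketched as a tie-breaking comparison. This is a minimal illustration and not the authors' implementation: the `Candidate` record, its `size` field (taken here to mean the number of expressed nodes), and the function name are hypothetical.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    name: str
    fitness: float  # higher is better
    size: int       # hypothetical size measure, e.g. number of expressed nodes


def select_preferring_larger(candidates: Sequence[Candidate]) -> Candidate:
    """Return the fittest candidate; among equally fit ones, the largest."""
    # Tuples compare element by element, so size only matters on fitness ties.
    return max(candidates, key=lambda c: (c.fitness, c.size))


pool = [
    Candidate("parent", fitness=0.75, size=12),
    Candidate("child_a", fitness=0.75, size=17),  # same fitness, larger
    Candidate("child_b", fitness=0.50, size=30),  # larger but less fit
]
chosen = select_preferring_larger(pool)
```

Here `chosen` is `child_a`: size never overrides fitness, it only decides between candidates that perform equally well.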

NE · Oct 2, 2018
Robust Optimization through Neuroevolution

Paolo Pagliuca, Stefano Nolfi

We propose a method for evolving solutions that are robust with respect to variations of the environmental conditions (i.e. that can operate effectively in new conditions immediately, without the need to adapt to variations). The results show that the proposed method is effective and computationally tractable. It improves performance on an extended version of the double-pole balancing problem, outperforms the best available human-designed controllers on a car racing problem, and generates rather effective solutions for a swarm robotic problem. The comparison of different algorithms indicates that the CMA-ES and xNES methods, which operate by optimizing a distribution of parameters, represent the best options for the evolution of robust neural network controllers.

NE · Dec 12, 2017
Robustness, Evolvability and Phenotypic Complexity: Insights from Evolving Digital Circuits

Nicola Milano, Paolo Pagliuca, Stefano Nolfi

We show how the characteristics of the evolutionary algorithm influence the evolvability of candidate solutions, i.e. the propensity of evolving individuals to generate better solutions as a result of genetic variation. More specifically, (1+λ) evolutionary strategies largely outperform (μ+1) evolutionary strategies in the context of the evolution of digital circuits, a domain characterized by a high level of neutrality. This difference is due to the fact that the competition for robustness to mutations among the circuits evolved with (μ+1) evolutionary strategies leads to the selection of phenotypically simple but poorly evolvable circuits. These circuits achieve robustness by minimizing the number of functional genes rather than by relying on redundancy or degeneracy to buffer the effects of mutations. The analysis of these factors enabled us to design a new evolutionary algorithm, named Parallel Stochastic Hill Climber (PSHC), which outperforms the other two methods considered.
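For readers unfamiliar with the notation, a generic (1+λ) evolutionary strategy can be sketched as follows. This is a textbook-style illustration on a bit-string problem, not the setup used in the paper and not the PSHC algorithm; the mutation scheme, the parameter values, and the OneMax fitness function are assumptions chosen to keep the example small.

```python
import random
from typing import Callable, List, Tuple

Genome = List[int]


def one_plus_lambda(
    fitness: Callable[[Genome], float],
    genome: Genome,
    lam: int = 4,
    mut_rate: float = 0.05,
    generations: int = 200,
    seed: int = 0,
) -> Tuple[Genome, float]:
    """Generic (1+lambda) ES: one parent, lam mutated offspring per generation."""
    rng = random.Random(seed)
    parent, parent_fit = list(genome), fitness(genome)
    for _ in range(generations):
        # Every offspring is a bit-flip mutation of the same current parent.
        offspring = [
            [1 - g if rng.random() < mut_rate else g for g in parent]
            for _ in range(lam)
        ]
        best_fit, best = max(((fitness(c), c) for c in offspring), key=lambda t: t[0])
        # ">=" lets equally fit (neutral) offspring replace the parent, which
        # allows drift across neutral regions of the search space.
        if best_fit >= parent_fit:
            parent, parent_fit = best, best_fit
    return parent, parent_fit


def one_max(genome: Genome) -> float:
    """Toy fitness (an assumption for this sketch): number of ones."""
    return float(sum(genome))


start = [0] * 20
best, best_fit = one_plus_lambda(one_max, start)
```

Because the parent is replaced only by offspring that are at least as fit, fitness never decreases across generations.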

NE · Oct 22, 2017
Moderate Environmental Variation Promotes the Evolution of Robust Solutions

Nicola Milano, Jônata Tyska Carvalho, Stefano Nolfi

Previous evolutionary studies demonstrated that evaluating evolving agents in variable environmental conditions enables them to develop solutions that are robust to environmental variation. We demonstrate how the robustness of the agents can be further improved by also exposing them to environmental variations throughout generations. These two types of environmental variation play partially distinct roles, as demonstrated by the fact that agents evolved in environments that do not vary throughout generations display lower performance than agents evolved in varying environments, independently of the amount of environmental variation experienced during evaluation. Moreover, our results demonstrate that performance increases when the amount of variation introduced during the agents' evaluation and the rate at which the environment varies throughout generations are moderate. This is explained by the fact that the probability of retaining genetic variations, including non-neutral variations that alter the behavior of the agents, increases when the environment varies throughout generations but also when new environmental conditions persist long enough to enable genetic accommodation.