João Correia

h-index: 15

14 papers

116 citations

Novelty: 43%

AI Score: 45

Ranked #66,081 of 205,806 authors (top 32%)

Ranked #153 in NE (top 13%)

14 Papers

NE · Nov 5, 2025

Evolutionary Optimization Trumps Adam Optimization on Embedding Space Exploration

Domício Pereira Neto, João Correia, Penousal Machado

Deep generative models, especially diffusion architectures, have transformed image generation; however, they are challenging to control and optimize for specific goals without expensive retraining. Embedding Space Exploration, especially with Evolutionary Algorithms (EAs), has been shown to be a promising method for optimizing image generation, particularly within Diffusion Models. Therefore, in this work, we study the performance of an evolutionary optimization method, namely Separable Covariance Matrix Adaptation Evolution Strategy (sep-CMA-ES), against the widely adopted Adaptive Moment Estimation (Adam), applied to Stable Diffusion XL Turbo's prompt embedding vector. The evaluation of images combines the LAION Aesthetic Predictor V2 with CLIPScore into a weighted fitness function, allowing flexible trade-offs between visual appeal and adherence to prompts. Experiments on a subset of the Parti Prompts (P2) dataset showcase that sep-CMA-ES consistently yields superior improvements in aesthetic and alignment metrics in comparison to Adam. Results indicate that the evolutionary method provides efficient, gradient-free optimization for diffusion models, enhancing controllability without the need for fine-tuning. This study emphasizes the potential of evolutionary methods for embedding space exploration of deep generative models and outlines future research directions.
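The weighted fitness function mentioned in this abstract can be sketched as follows. This is only an illustration: the abstract does not give the weights or any normalization, so the parameter names `w_aesthetic`, `aesthetic_max`, and `clip_max`, along with their default values, are assumptions rather than the authors' settings.

```python
def weighted_fitness(aesthetic, clip_score, w_aesthetic=0.5,
                     aesthetic_max=10.0, clip_max=1.0):
    """Blend an aesthetic score and a CLIPScore into one value in [0, 1].

    All defaults are hypothetical; they are not taken from the paper.
    """
    if not 0.0 <= w_aesthetic <= 1.0:
        raise ValueError("w_aesthetic must lie in [0, 1]")
    # Rescale each score to [0, 1] so neither dominates by scale alone.
    a = min(max(aesthetic / aesthetic_max, 0.0), 1.0)
    c = min(max(clip_score / clip_max, 0.0), 1.0)
    # A single weight controls the trade-off between the two objectives.
    return w_aesthetic * a + (1.0 - w_aesthetic) * c


# Equal weighting of a mid-range aesthetic score and a mid-range CLIPScore.
example = weighted_fitness(5.0, 0.5)
```

Raising `w_aesthetic` favours visual appeal, while lowering it favours adherence to the prompt.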

NE · May 25, 2021 · Code

Speed Benchmarking of Genetic Programming Frameworks

Francisco Baeta, João Correia, Tiago Martins et al.

Genetic Programming (GP) is known to suffer from the burden of being computationally expensive by design. While, over the years, many techniques have been developed to mitigate this issue, data vectorization, in particular, is arguably still the most attractive strategy due to the parallel nature of GP. In this work, we employ a series of benchmarks meant to compare both the performance and evolution capabilities of different vectorized and iterative implementation approaches across several existing frameworks. Namely, TensorGP, a novel open-source engine written in Python, is shown to greatly benefit from the TensorFlow library to accelerate the domain evaluation phase in GP. The presented performance benchmarks demonstrate that the TensorGP engine manages to pull ahead, with relative speedups above two orders of magnitude for problems with a higher number of fitness cases. Additionally, as a consequence of being able to compute larger domains, we argue that TensorGP performance gains aid the discovery of more accurate candidate solutions.

29.6 · AI · Apr 10

Evolutionary Token-Level Prompt Optimization for Diffusion Models

Domício Pereira Neto, João Correia, Penousal Machado

Text-to-image diffusion models exhibit strong generative performance but remain highly sensitive to prompt formulation, often requiring extensive manual trial and error to obtain satisfactory results. This motivates the development of automated, model-agnostic prompt optimization methods that can systematically explore the conditioning space beyond conventional text rewriting. This work investigates the use of a Genetic Algorithm (GA) for prompt optimization by directly evolving the token vectors employed by CLIP-based diffusion models. The GA optimizes a fitness function that combines aesthetic quality, measured by the LAION Aesthetic Predictor V2, with prompt-image alignment, assessed via CLIPScore. Experiments on 36 prompts from the Parti Prompts (P2) dataset show that the proposed approach outperforms the baseline methods, including Promptist and random search, achieving up to a 23.93% improvement in fitness. Overall, the method is adaptable to image generation models with tokenized text encoders and provides a modular framework for future extensions, the limitations and prospects of which are discussed.
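A generic Genetic Algorithm over token vectors might look like the sketch below. It is not the authors' implementation: the operators (tournament selection, token-level uniform crossover, Gaussian mutation, elitism), every hyperparameter, and the stand-in `toy_fitness` are assumptions chosen for illustration. In the setting described above, the fitness would instead score a generated image.

```python
import random


def evolve(fitness, n_tokens=4, dim=8, pop_size=20, generations=30,
           mutation_rate=0.2, sigma=0.1, seed=0):
    """Evolve a list of token vectors to maximise `fitness` (hypothetical setup)."""
    rng = random.Random(seed)

    def random_individual():
        return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]

    def tournament(pop, scores, k=3):
        # Pick k random individuals and return the fittest of them.
        picks = rng.sample(range(len(pop)), k)
        return pop[max(picks, key=lambda i: scores[i])]

    def crossover(a, b):
        # Uniform crossover at token level: each token comes from one parent.
        return [list(ta if rng.random() < 0.5 else tb) for ta, tb in zip(a, b)]

    def mutate(ind):
        # Add Gaussian noise to each component with probability mutation_rate.
        return [[v + rng.gauss(0, sigma) if rng.random() < mutation_rate else v
                 for v in tok] for tok in ind]

    pop = [random_individual() for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        best = max(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best])
        children = [pop[best]]  # elitism: keep the best individual unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, scores), tournament(pop, scores))
            children.append(mutate(child))
        pop = children
    scores = [fitness(ind) for ind in pop]
    best = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best])
    return pop[best], history


def toy_fitness(ind):
    # Stand-in objective (not an image score): prefer vectors near the origin.
    return -sum(v * v for tok in ind for v in tok)


best, history = evolve(toy_fitness)
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in `history` never decreases.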

12.3 · LG · Apr 11

Exploring the impact of fairness-aware criteria in AutoML

Joana Simões, João Correia

Machine Learning (ML) systems are increasingly used to support decision-making processes that affect individuals. However, these systems often rely on biased data, which can lead to unfair outcomes against specific groups. With the growing adoption of Automated Machine Learning (AutoML), the risk of intensifying discriminatory behaviours increases, as most frameworks primarily focus on model selection to maximise predictive performance. Previous research on fairness in AutoML had largely followed this trend, integrating fairness awareness only in the model selection or hyperparameter tuning, while neglecting other critical stages of the ML pipeline. This paper aims to study the impact of integrating fairness directly into the optimisation component of an AutoML framework that constructs complete ML pipelines, from data selection and transformations to model selection and tuning. As selecting appropriate fairness metrics remains a key challenge, our work incorporates complementary fairness metrics to capture different dimensions of fairness during the optimisation. Their integration within AutoML resulted in measurable differences compared to a baseline focused solely on predictive performance. Despite a 9.4% decrease in predictive power, the average fairness improved by 14.5%, accompanied by a 35.7% reduction in data usage. Furthermore, fairness integration produced complete yet simpler final solutions, suggesting that model complexity is not always required to achieve balanced and fair ML solutions.

LG · Mar 6, 2025

EDCA - An Evolutionary Data-Centric AutoML Framework for Efficient Pipelines

Joana Simões, João Correia

Automated Machine Learning (AutoML) has gained popularity due to the increased demand for Machine Learning (ML) specialists, allowing them to apply ML techniques effortlessly and quickly. AutoML implementations use optimisation methods to identify the most effective ML solution for a given dataset, aiming to improve one or more predefined metrics. However, most implementations focus on model selection and hyperparameter tuning. Despite being an important factor in obtaining high-performance ML systems, data quality is usually an overlooked part of AutoML and continues to be a manual and time-consuming task. This work presents EDCA, an Evolutionary Data Centric AutoML framework. In addition to the traditional tasks such as selecting the best models and hyperparameters, EDCA enhances the given data by optimising data processing tasks such as data reduction and cleaning according to the problem's needs. All these steps create an ML pipeline that is optimised by an evolutionary algorithm. To assess its effectiveness, EDCA was compared to FLAML and TPOT, two frameworks at the top of the AutoML benchmarks. The frameworks were evaluated under the same conditions using datasets from AMLB classification benchmarks. EDCA achieved statistically similar results in performance to FLAML and TPOT but used significantly less data to train the final solutions. Moreover, EDCA experimental results reveal that good performance can be achieved using less data and efficient ML algorithms, aspects that align with Green AutoML guidelines.

NE · Apr 9, 2025

Evolutionary Machine Learning meets Self-Supervised Learning: a comprehensive survey

Adriano Vinhas, João Correia, Penousal Machado

The number of studies that combine Evolutionary Machine Learning and self-supervised learning has been growing steadily in recent years. Evolutionary Machine Learning has been shown to help automate the design of machine learning algorithms and to lead to more reliable solutions. Self-supervised learning, on the other hand, has produced good results in learning useful features when labelled data is limited. This suggests that the combination of these two areas can help both in shaping evolutionary processes and in automating the design of deep neural networks, while also reducing the need for labelled data. Still, there are no detailed reviews that explain how Evolutionary Machine Learning and self-supervised learning can be used together. To help with this, we provide an overview of studies that bring these areas together. Based on this growing interest and the range of existing works, we suggest a new sub-area of research, which we call Evolutionary Self-Supervised Learning and introduce a taxonomy for it. Finally, we point out some of the main challenges and suggest directions for future research to help Evolutionary Self-Supervised Learning grow and mature as a field.

NE · Jun 20, 2024

Towards evolution of Deep Neural Networks through contrastive Self-Supervised learning

Adriano Vinhas, João Correia, Penousal Machado

Deep Neural Networks (DNNs) have been successfully applied to a wide range of problems. However, two main limitations are commonly pointed out. The first is that they take a long time to design. The other is that they rely heavily on labelled data, which can sometimes be costly and hard to obtain. To address the first problem, neuroevolution has proven to be a plausible option to automate the design of DNNs. As for the second problem, self-supervised learning has been used to leverage unlabelled data to learn representations. Our goal is to study how neuroevolution can help self-supervised learning to bridge the gap to supervised learning in terms of performance. In this work, we propose a framework that is able to evolve deep neural networks using self-supervised learning. Our results on the CIFAR-10 dataset show that it is possible to evolve adequate neural networks while reducing the reliance on labelled data. Moreover, an analysis of the structure of the evolved networks suggests that the amount of labelled data fed to them has less effect on the structure of networks that learned via self-supervised learning, when compared to individuals that relied on supervised learning.

AI · Mar 12, 2021

TensorGP -- Genetic Programming Engine in TensorFlow

Francisco Baeta, João Correia, Tiago Martins et al.

In this paper, we resort to the TensorFlow framework to investigate the benefits of applying data vectorization and fitness caching methods to domain evaluation in Genetic Programming. For this purpose, an independent engine was developed, TensorGP, along with a testing suite to extract comparative timing results across different architectures and amongst both iterative and vectorized approaches. Our performance benchmarks demonstrate that by exploiting the TensorFlow eager execution model, performance gains of up to two orders of magnitude can be achieved on a parallel approach running on dedicated hardware when compared to a standard iterative approach.

NE · Jan 31, 2021

Demonstrating the Evolution of GANs through t-SNE

Victor Costa, Nuno Lourenço, João Correia et al.

Generative Adversarial Networks (GANs) are powerful generative models that achieved strong results, mainly in the image domain. However, the training of GANs is not trivial, presenting some challenges tackled by different strategies. Evolutionary algorithms, such as COEGAN, were recently proposed as a solution to improve the GAN training, overcoming common problems that affect the model, such as vanishing gradient and mode collapse. In this work, we propose an evaluation method based on t-distributed Stochastic Neighbour Embedding (t-SNE) to assess the progress of GANs and visualize the distribution learned by generators in training. We propose the use of the feature space extracted from trained discriminators to evaluate samples produced by generators and from the input dataset. A metric based on the resulting t-SNE maps and the Jaccard index is proposed to represent the model quality. Experiments were conducted to assess the progress of GANs when trained using COEGAN. The results show both by visual inspection and metrics that the Evolutionary Algorithm gradually improves discriminators and generators through generations, avoiding problems such as mode collapse.
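One way a metric could combine a t-SNE map with the Jaccard index is sketched below. The abstract does not specify the construction, so the grid discretisation, the `cell_size` parameter, and the sample coordinates are all assumptions. The points stand in for 2D t-SNE coordinates that would be computed beforehand.

```python
import math


def occupied_cells(points, cell_size=1.0):
    """Map 2D points to the set of grid cells they fall into (assumed scheme)."""
    return {(math.floor(x / cell_size), math.floor(y / cell_size))
            for x, y in points}


def jaccard_index(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(union)


# Hypothetical 2D embeddings of real and generated samples.
real = [(0.2, 0.3), (1.5, 0.1), (2.7, 2.2)]
generated = [(0.8, 0.9), (1.1, 0.4), (5.0, 5.0)]

score = jaccard_index(occupied_cells(real), occupied_cells(generated))
```

Under this scheme, a score near 1 means the generated samples occupy the same regions of the map as the input data, and a low score points to missing or spurious regions.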

NE · Jul 13, 2020

Exploring the Evolution of GANs through Quality Diversity

Victor Costa, Nuno Lourenço, João Correia et al.

Generative adversarial networks (GANs) achieved relevant advances in the field of generative algorithms, presenting high-quality results mainly in the context of images. However, GANs are hard to train, and several aspects of the model must be designed by hand beforehand to ensure training success. In this context, evolutionary algorithms such as COEGAN were proposed to solve the challenges in GAN training. Nevertheless, a lack of diversity and premature optimization can be found in some of these solutions. We propose in this paper the application of a quality-diversity algorithm in the evolution of GANs. The solution is based on the Novelty Search with Local Competition (NSLC) algorithm, adapting the concepts used in COEGAN to this new proposal. We compare our proposal with the original COEGAN model and with an alternative version using a global competition approach. The experimental results show that our proposal increases the diversity of the discovered solutions and improves the performance of the models found by the algorithm. Furthermore, the global competition approach was able to consistently find better models for GANs.

NE · Apr 9, 2020

Using Skill Rating as Fitness on the Evolution of GANs

Victor Costa, Nuno Lourenço, João Correia et al.

Generative Adversarial Networks (GANs) are an adversarial model that achieved impressive results on generative tasks. In spite of the relevant results, GANs present some challenges regarding stability, making the training usually a hit-and-miss process. To overcome these challenges, several improvements were proposed to better handle the internal characteristics of the model, such as alternative loss functions or architectural changes to the neural networks used by the generator and the discriminator. Recent works proposed the use of evolutionary algorithms in GAN training, aiming to solve these challenges and to provide an automatic way to find good models. In this context, COEGAN proposes the use of coevolution and neuroevolution to orchestrate the training of GANs. However, previous experiments detected that some of the fitness functions used to guide the evolution are not ideal. In this work we propose the evaluation of a game-based fitness function to be used within the COEGAN method. Skill rating is a metric to quantify the skill of players in a game and has already been used to evaluate GANs. We extend this idea using the skill rating in an evolutionary algorithm to train GANs. The results show that skill rating can be used as fitness to guide the evolution in COEGAN without depending on an external evaluator.
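To illustrate what a skill rating is, the sketch below uses the classic Elo update. This is a deliberately simple stand-in: the abstract does not say which rating system the paper uses, and the starting rating of 1500 and the factor `k=32` are conventional assumptions. In a GAN setting, one could treat each generator-versus-discriminator match as a game and use the resulting rating as fitness.

```python
def expected_score(rating_a, rating_b):
    """Elo expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return both updated ratings after one game.

    score_a is 1.0 if A won, 0.0 if A lost, and 0.5 for a draw.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated players; the first one wins the match.
new_a, new_b = elo_update(1500.0, 1500.0, score_a=1.0)
```

Because the rating comes only from match outcomes between the competing models, no external evaluator is needed to compute it.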

NE · Dec 12, 2019

COEGAN: Evaluating the Coevolution Effect in Generative Adversarial Networks

Victor Costa, Nuno Lourenço, João Correia et al.

Generative adversarial networks (GANs) present state-of-the-art results in the generation of samples following the distribution of the input dataset. However, GANs are difficult to train, and several aspects of the model must be designed by hand beforehand. Neuroevolution is a well-known technique used to provide the automatic design of network architectures, which was recently expanded to deep neural networks. COEGAN is a model that uses neuroevolution and coevolution in the GAN training algorithm to provide a more stable training method and the automatic design of neural network architectures. COEGAN makes use of the adversarial aspect of the GAN components to implement coevolutionary strategies in the training algorithm. Our proposal was evaluated on the Fashion-MNIST and MNIST datasets. We compare our results with a baseline based on DCGAN and also with results from a random search algorithm. We show that our method is able to discover efficient architectures on the Fashion-MNIST and MNIST datasets. The results also suggest that COEGAN can be used as a training algorithm for GANs to avoid common issues, such as the mode collapse problem.

NE · May 9, 2019

Automatic Design of Artificial Neural Networks for Gamma-Ray Detection

Filipe Assunção, João Correia, Rúben Conceição et al.

The goal of this work is to investigate the possibility of improving current gamma/hadron discrimination based on their shower patterns recorded on the ground. To this end we propose the use of Convolutional Neural Networks (CNNs) for their ability to distinguish patterns based on automatically designed features. In order to promote the creation of CNNs that properly uncover the hidden patterns in the data, and at the same time avoid the burden of hand-crafting the topology and learning hyper-parameters, we resort to NeuroEvolution; in particular we use Fast-DENSER++, a variant of Deep Evolutionary Network Structured Representation. The results show that the best CNN generated by Fast-DENSER++ improves by a factor of 2 when compared with the results reported by classic statistical approaches. Additionally, we experiment with ensembling the 10 best generated CNNs, one from each of the evolutionary runs; the ensemble leads to an improvement by a factor of 2.3. These results show that it is possible to improve the gamma/hadron discrimination based on CNNs that are automatically generated and are trained with instances of the ground impact patterns.

NE · Jun 26, 2018

Evotype: Towards the Evolution of Type Stencils

Tiago Martins, João Correia, Ernesto Costa et al.

Typefaces are an essential resource employed by graphic designers. The increasing demand for innovative type design work increases the need for good technological means to assist the designer in the creation of a typeface. We present an evolutionary computation approach for the generation of type stencils to draw coherent glyphs for different characters. The proposed system employs a Genetic Algorithm to evolve populations of type stencils. The evaluation of each candidate stencil uses a hill climbing algorithm to search the best configurations to draw the target glyphs. We study the interplay between legibility, coherence and expressiveness, and show how our framework can be used in practice.