NE · CV · May 27

Evolving to the Aesthetics of a Vision-Language Model

arXiv:2606.00112 · 11.9 · h-index: 4
Predicted impact top 20% in NE · last 90 days
Originality Synthesis-oriented
AI Analysis

For artists and designers using evolutionary systems, this work provides two VLM-based fitness functions for aesthetic evaluation, but the contribution is incremental as it applies existing methods to a new domain.

The paper explores using Vision-Language Models (VLMs) for aesthetic evaluation in evolutionary design, comparing two approaches: per-design scoring with CLIP-IQA, and VLM-judged pairwise comparisons ranked with the Glicko rating system. Results show that pairwise comparison aligns better with the artist's ranking, though both methods have limitations.

Evolutionary systems have demonstrated remarkable results in creative domains, with recent applications in generative typography, design, and music. However, an open problem remains in designing fitness functions that effectively capture the desired aesthetics of abstract outputs. In this work, we explore two methods for evaluating the aesthetics of a population using Vision-Language Models (VLMs). The first method uses CLIP-IQA to predict an aesthetic score for each design. The second method instead pits candidates against each other, with winners determined by a VLM using a custom prompt specified by the user. The outcomes of these pairwise comparisons are then used to estimate a population ranking via the Glicko rating system. We present these methods in the context of a case study using a custom generative system and compare the resulting rankings with an artist's aesthetic ranking and those produced by other aesthetic evaluation techniques. Additionally, we document the artist's experience using these approaches to evolve designs, critically analysing the strengths and weaknesses of both methods.
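
The second method in the abstract can be illustrated with a minimal sketch. It assumes the original Glicko-1 update formulas, a single rating period covering one round-robin over the population, and the customary starting values of 1500 (rating) and 350 (rating deviation); the paper may use a different Glicko variant or match schedule. The `judge` callable is a hypothetical stand-in for the VLM call with the user's prompt, and all function names here are illustrative rather than taken from the paper.

```python
import math
from itertools import combinations

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Attenuation factor: discounts results against opponents with uncertain ratings."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected(r, r_j, rd_j):
    """Expected score of a player rated r against an opponent (r_j, rd_j)."""
    return 1 / (1 + 10 ** (-g(rd_j) * (r - r_j) / 400))


def glicko_update(r, rd, results):
    """Glicko-1 update for one rating period.

    results: list of (opponent_rating, opponent_rd, score), score in {0, 0.5, 1}.
    Returns the new (rating, rating_deviation).
    """
    if not results:
        return r, rd
    d2_inv = Q**2 * sum(
        g(rd_j) ** 2 * expected(r, r_j, rd_j) * (1 - expected(r, r_j, rd_j))
        for r_j, rd_j, _ in results
    )
    denom = 1 / rd**2 + d2_inv
    delta = sum(g(rd_j) * (s - expected(r, r_j, rd_j)) for r_j, rd_j, s in results)
    return r + (Q / denom) * delta, math.sqrt(1 / denom)


def rank_population(candidates, judge, r0=1500.0, rd0=350.0):
    """Rank hashable candidate ids from pairwise judgements.

    judge(a, b) is a hypothetical callable that returns the preferred id;
    in the paper's setting it would wrap a VLM query.
    """
    ratings = {c: (r0, rd0) for c in candidates}
    results = {c: [] for c in candidates}
    for a, b in combinations(candidates, 2):
        winner = judge(a, b)
        results[a].append((*ratings[b], 1.0 if winner == a else 0.0))
        results[b].append((*ratings[a], 1.0 if winner == b else 0.0))
    # All updates use pre-period ratings, as Glicko-1 prescribes.
    new = {c: glicko_update(*ratings[c], results[c]) for c in candidates}
    ranking = sorted(candidates, key=lambda c: new[c][0], reverse=True)
    return ranking, new


# Toy usage: a deterministic judge that prefers the alphabetically later id.
ranking, new_ratings = rank_population(["a", "b", "c"], judge=max)
```

A full round-robin needs n(n-1)/2 judge calls per generation, so a larger population would likely require sampling a subset of pairs; the rating deviation returned alongside each rating indicates how much evidence that rating rests on.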
