Tatsuya Daikoku

AI
h-index: 4
Papers: 3
Citations: 7
Novelty: 47%
AI Score: 41

3 Papers

AI · May 14
AI Outperforms Humans in Personalized Image Aesthetics Assessment via LLM-Based Interviews and Semantic Feature Extraction

Yoshia Abe, Tatsuya Daikoku, Yasuo Kuniyoshi

Accurately predicting individual aesthetic evaluation for images is a fundamental challenge for AI. Various deep learning (DL)-based models have been proposed for this task, training on image evaluation data to extract objective low-level features. However, aesthetic preferences are inherently subjective and individual-dependent. Accurate prediction thus requires the extraction of high-level semantic features of images and the active collection of preference information from the target individual. To address this issue, we focus on the utility of Large Language Models (LLMs) pretrained on vast amounts of textual data, and develop an integrated DL-LLM system. The system actively elicits aesthetic preferences through LLM-based semi-structured interviews and predicts aesthetic evaluation by leveraging both low-level and high-level features. In our experiments, we compare the proposed system against conventional systems, human predictors, and the target individual's own re-evaluations after a certain time interval. Our results show that the proposed system outperforms all of them, with particularly strong performance on highly rated images. Moreover, the prediction error of the proposed system is smaller than within-person variability, while human predictors show the largest error, likely due to the influence of their own aesthetic values. These results suggest that AI may be better positioned than other people or one's future self to capture individual aesthetic preferences at a given point. This opens a new question of whether AI could serve as a deeper interpreter of human aesthetic sensibility than humans themselves.

AI · Mar 6, 2024
Assessing the Aesthetic Evaluation Capabilities of GPT-4 with Vision: Insights from Group and Individual Assessments

Yoshia Abe, Tatsuya Daikoku, Yasuo Kuniyoshi

Recently, it has been recognized that large language models demonstrate high performance on various intellectual tasks. However, few studies have investigated alignment with humans in behaviors that involve sensibility, such as aesthetic evaluation. This study investigates the performance of GPT-4 with Vision, a state-of-the-art language model that can handle image input, on the task of aesthetic evaluation of images. We employ two tasks: predicting the average evaluation values of a group and predicting an individual's evaluation values. We investigate the performance of GPT-4 with Vision by exploring prompts and analyzing prediction behaviors. Experimental results reveal GPT-4 with Vision's superior performance in predicting aesthetic evaluations and the differing nature of its responses to beauty and ugliness. Finally, we discuss developing an AI system for aesthetic evaluation based on scientific knowledge of the human perception of beauty, employing agent technologies that integrate traditional deep learning models with large language models.

HC · Apr 5
Interoceptive Divergence in Aesthetic Evaluation and Implications for Human-AI Alignment

Yoshia Abe, Tatsuya Daikoku, Yasuo Kuniyoshi

Artificial intelligence (AI), exemplified by large language models (LLMs), is rapidly approaching and in some cases surpassing human performance across a wide range of cognitive tasks. However, human nature is not limited to intelligence alone; it also encompasses sensibility, including the capacity to perceive and experience beauty in visual scenes. This raises a fundamental question: how humans and AI systems converge or diverge in such aesthetic experiences. Aesthetic evaluation depends not only on objective properties of images but also on internal processes within the observer. As part of ongoing efforts in AI alignment, building upon prior human studies that have examined the relationship between beauty ratings, bodily sensations, and emotions, we adopt a comparable set of questionnaire items and present them to LLMs, enabling a direct comparison between human and AI responses. Our comparative analyses revealed that, while humans and AI exhibited broadly similar patterns in the correlations between beauty ratings and emotions, as well as in the image features they prioritized, notable divergences emerged in both the distribution of emotional responses and the relationship between beauty ratings and bodily sensations. These findings suggest that state-of-the-art LLMs, trained on large-scale textual data, can approximate average human tendencies in aesthetic evaluation to a certain extent. However, they also indicate limitations, particularly in relation to interoceptive aspects, which may reflect insufficient representation in training data or unintended consequences of alignment processes. These findings highlight key challenges for AI alignment and suggest important directions for developing AI systems with human-like aesthetic processing.