HCAug 24, 2023
Language as Reality: A Co-Creative Storytelling Game Experience in 1001 Nights using Generative AIYuqian Sun, Zhouyi Li, Ke Fang et al.
In this paper, we present "1001 Nights", an AI-native game that allows players lead in-game reality through co-created storytelling with the character driven by large language model. The concept is inspired by Wittgenstein's idea of the limits of one's world being determined by the bounds of their language. Using advanced AI tools like GPT-4 and Stable Diffusion, the second iteration of the game enables the protagonist, Shahrzad, to realize words and stories in her world. The player can steer the conversation with the AI King towards specific keywords, which then become battle equipment in the game. This blend of interactive narrative and text-to-image transformation challenges the conventional border between the game world and reality through a dual perspective. We focus on Shahrzad, who seeks to alter her fate compared to the original folklore, and the player, who collaborates with AI to craft narratives and shape the game world. We explore the technical and design elements of implementing such a game with an objective to enhance the narrative game genre with AI-generated content and to delve into AI-native gameplay possibilities.
CVAug 1, 2022
Computer vision-based analysis of buildings and built environments: A systematic review of current approachesMałgorzata B. Starzyńska, Robin Roussel, Sam Jacoby et al.
Analysing 88 sources published from 2011 to 2021, this paper presents a first systematic review of the computer vision-based analysis of buildings and the built environments to assess its value to architectural and urban design studies. Following a multi-stage selection process, the types of algorithms and data sources used are discussed in respect to architectural applications such as a building classification, detail classification, qualitative environmental analysis, building condition survey, and building value estimation. This reveals current research gaps and trends, and highlights two main categories of research aims. First, to use or optimise computer vision methods for architectural image data, which can then help automate time-consuming, labour-intensive, or complex tasks of visual analysis. Second, to explore the methodological benefits of machine learning approaches to investigate new questions about the built environment by finding patterns and relationships between visual, statistical, and qualitative data, which can overcome limitations of conventional manual analysis. The growing body of research offers new methods to architectural and design studies, with the paper identifying future challenges and directions of research.
CLOct 18, 2023
AI Nushu: An Exploration of Language Emergence in Sisterhood -Through the Lens of Computational LinguisticsYuqian Sun, Yuying Tang, Ze Gao et al.
This paper presents "AI Nushu," an emerging language system inspired by Nushu (women's scripts), the unique language created and used exclusively by ancient Chinese women who were thought to be illiterate under a patriarchal society. In this interactive installation, two artificial intelligence (AI) agents are trained in the Chinese dictionary and the Nushu corpus. By continually observing their environment and communicating, these agents collaborate towards creating a standard writing system to encode Chinese. It offers an artistic interpretation of the creation of a non-western script from a computational linguistics perspective, integrating AI technology with Chinese cultural heritage and a feminist viewpoint.
AISep 20, 2023
Fictional Worlds, Real Connections: Developing Community Storytelling Social Chatbots through LLMsYuqian Sun, Hanyi Wang, Pok Man Chan et al.
We address the integration of storytelling and Large Language Models (LLMs) to develop engaging and believable Social Chatbots (SCs) in community settings. Motivated by the potential of fictional characters to enhance social interactions, we introduce Storytelling Social Chatbots (SSCs) and the concept of story engineering to transform fictional game characters into "live" social entities within player communities. Our story engineering process includes three steps: (1) Character and story creation, defining the SC's personality and worldview, (2) Presenting Live Stories to the Community, allowing the SC to recount challenges and seek suggestions, and (3) Communication with community members, enabling interaction between the SC and users. We employed the LLM GPT-3 to drive our SSC prototypes, "David" and "Catherine," and evaluated their performance in an online gaming community, "DE (Alias)," on Discord. Our mixed-method analysis, based on questionnaires (N=15) and interviews (N=8) with community members, reveals that storytelling significantly enhances the engagement and believability of SCs in community settings.
IVNov 13, 2023
DeepMetricEye: Metric Depth Estimation in Periocular VR ImageryYitong Sun, Zijian Zhou, Cyriel Diels et al.
Despite the enhanced realism and immersion provided by VR headsets, users frequently encounter adverse effects such as digital eye strain (DES), dry eye, and potential long-term visual impairment due to excessive eye stimulation from VR displays and pressure from the mask. Recent VR headsets are increasingly equipped with eye-oriented monocular cameras to segment ocular feature maps. Yet, to compute the incident light stimulus and observe periocular condition alterations, it is imperative to transform these relative measurements into metric dimensions. To bridge this gap, we propose a lightweight framework derived from the U-Net 3+ deep learning backbone that we re-optimised, to estimate measurable periocular depth maps. Compatible with any VR headset equipped with an eye-oriented monocular camera, our method reconstructs three-dimensional periocular regions, providing a metric basis for related light stimulus calculation protocols and medical guidelines. Navigating the complexities of data collection, we introduce a Dynamic Periocular Data Generation (DPDG) environment based on UE MetaHuman, which synthesises thousands of training images from a small quantity of human facial scan data. Evaluated on a sample of 36 participants, our method exhibited notable efficacy in the periocular global precision evaluation experiment, and the pupil diameter measurement.
GRSep 8, 2024
Exploring Fungal Morphology Simulation and Dynamic Light Containment from a Graphics Generation PerspectiveKexin Wang, Ivy He, Jinke Li et al.
Fungal simulation and control are considered crucial techniques in Bio-Art creation. However, coding algorithms for reliable fungal simulations have posed significant challenges for artists. This study equates fungal morphology simulation to a two-dimensional graphic time-series generation problem. We propose a zero-coding, neural network-driven cellular automaton. Fungal spread patterns are learned through an image segmentation model and a time-series prediction model, which then supervise the training of neural network cells, enabling them to replicate real-world spreading behaviors. We further implemented dynamic containment of fungal boundaries with lasers. Synchronized with the automaton, the fungus successfully spreads into pre-designed complex shapes in reality.
HCDec 14, 2025
ORIBA: Exploring LLM-Driven Role-Play Chatbot as a Creativity Support Tool for Original Character ArtistsYuqian Sun, Xingyu Li, Shunyu Yao et al.
Recent advances in Generative AI (GAI) have led to new opportunities for creativity support. However, this technology has raised ethical concerns in the visual artists community. This paper explores how GAI can assist visual artists in developing original characters (OCs) while respecting their creative agency. We present ORIBA, an AI chatbot leveraging large language models (LLMs) to enable artists to role-play with their OCs, focusing on conceptualization (e.g., backstories) while leaving exposition (visual creation) to creators. Through a study with 14 artists, we found ORIBA motivated artists' imaginative engagement, developing multidimensional attributes and stronger bonds with OCs that inspire their creative process. Our contributions include design insights for AI systems that develop from artists' perspectives, demonstrating how LLMs can support cross-modal creativity while preserving creative agency in OC art. This paper highlights the potential of GAI as a neutral, non-visual support that strengthens existing creative practice, without infringing artistic exposition.
AIJul 4, 2025
Participatory Evolution of Artificial Life Systems via Semantic FeedbackShuowen Li, Kexin Wang, Minglu Fang et al.
We present a semantic feedback framework that enables natural language to guide the evolution of artificial life systems. Integrating a prompt-to-parameter encoder, a CMA-ES optimizer, and CLIP-based evaluation, the system allows user intent to modulate both visual outcomes and underlying behavioral rules. Implemented in an interactive ecosystem simulation, the framework supports prompt refinement, multi-agent interaction, and emergent rule synthesis. User studies show improved semantic alignment over manual tuning and demonstrate the system's potential as a platform for participatory generative design and open-ended evolution.
GRFeb 7, 2020
Audio-Visual-Olfactory Resource Allocation for Tri-modal Virtual EnvironmentsEfstratios Doukakis, Kurt Debattista, Thomas Bashford-Rogers et al.
Virtual Environments (VEs) provide the opportunity to simulate a wide range of applications, from training to entertainment, in a safe and controlled manner. For applications which require realistic representations of real world environments, the VEs need to provide multiple, physically accurate sensory stimuli. However, simulating all the senses that comprise the human sensory system (HSS) is a task that requires significant computational resources. Since it is intractable to deliver all senses at the highest quality, we propose a resource distribution scheme in order to achieve an optimal perceptual experience within the given computational budgets. This paper investigates resource balancing for multi-modal scenarios composed of aural, visual and olfactory stimuli. Three experimental studies were conducted. The first experiment identified perceptual boundaries for olfactory computation. In the second experiment, participants (N=25) were asked, across a fixed number of budgets (M=5), to identify what they perceived to be the best visual, acoustic and olfactory stimulus quality for a given computational budget. Results demonstrate that participants tend to prioritise visual quality compared to other sensory stimuli. However, as the budget size is increased, users prefer a balanced distribution of resources with an increased preference for having smell impulses in the VE. Based on the collected data, a quality prediction model is proposed and its accuracy is validated against previously unused budgets and an untested scenario in a third and final experiment.
HCJan 30, 2020
Visuohaptic augmented feedback for enhancing motor skills acquisitionAli Asadipour, Kurt Debattista, Alan Chalmers
Serious games are accepted as an effective approach to deliver augmented feedback in motor (re-) learning processes. The multi-modal nature of the conventional computer games (e.g. audiovisual representation) plus the ability to interact via haptic-enabled inputs provides a more immersive experience. Thus, particular disciplines such as medical education in which frequent hands on rehearsals play a key role in learning core motor skills (e.g. physical palpations) may benefit from this technique. Challenges such as the impracticality of verbalising palpation experience by tutors and ethical considerations may prevent the medical students from correctly learning core palpation skills. This work presents a new data glove, built from off-the-shelf components which captures pressure sensitivity designed to provide feedback for palpation tasks. In this work the data glove is used to control a serious game adapted from the infinite runner genre to improve motor skill acquisition. A comparative evaluation on usability and effectiveness of the method using multimodal visualisations, as part of a larger study to enhance pressure sensitivity, is presented. Thirty participants divided into a game-playing group (n = 15) and a control group (n = 15) were invited to perform a simple palpation task. The game-playing group significantly outperformed the control group in which abstract visualisation of force was provided to the users in a blind-folded transfer test. The game-based training approach was positively described by the game-playing group as enjoyable and engaging.