HCOct 4, 2024
Artificial Human Lecturers: Initial Findings From Asia's First AI Lecturers in Class to Promote Innovation in EducationChing Christie Pang, Yawei Zhao, Zhizhuo Yin et al.
In recent years, artificial intelligence (AI) has become increasingly integrated into education, reshaping traditional learning environments. Despite this, there has been limited investigation into fully operational artificial human lecturers. To the best of our knowledge, our paper presents the world's first study examining their deployment in a real-world educational setting. Specifically, we investigate the use of "digital teachers," AI-powered virtual lecturers, in a postgraduate course at the Hong Kong University of Science and Technology (HKUST). Our study explores how features such as appearance, non-verbal cues, voice, and verbal expression impact students' learning experiences. Findings suggest that students highly value naturalness, authenticity, and interactivity in digital teachers, highlighting areas for improvement, such as increased responsiveness, personalized avatars, and integration with larger learning platforms. We conclude that digital teachers have significant potential to enhance education by providing a more flexible, engaging, personalized, and accessible learning experience for students.
GRMay 13, 2025
M3G: Multi-Granular Gesture Generator for Audio-Driven Full-Body Human Motion SynthesisZhizhuo Yin, Yuk Hang Tsui, Pan Hui
Generating full-body human gestures encompassing face, body, hands, and global movements from audio is a valuable yet challenging task in virtual avatar creation. Previous systems focused on tokenizing the human gestures framewisely and predicting the tokens of each frame from the input audio. However, one observation is that the number of frames required for a complete expressive human gesture, defined as granularity, varies among different human gesture patterns. Existing systems fail to model these gesture patterns due to the fixed granularity of their gesture tokens. To solve this problem, we propose a novel framework named Multi-Granular Gesture Generator (M3G) for audio-driven holistic gesture generation. In M3G, we propose a novel Multi-Granular VQ-VAE (MGVQ-VAE) to tokenize motion patterns and reconstruct motion sequences from different temporal granularities. Subsequently, we proposed a multi-granular token predictor that extracts multi-granular information from audio and predicts the corresponding motion tokens. Then M3G reconstructs the human gestures from the predicted tokens using the MGVQ-VAE. Both objective and subjective experiments demonstrate that our proposed M3G framework outperforms the state-of-the-art methods in terms of generating natural and expressive full-body human gestures.
HCFeb 15, 2024
Pinning "Reflection" on the Agenda: Investigating Reflection in Human-LLM Co-Creation for Creative CodingAnqi Wang, Zhizhuo Yin, Yulu Hu et al.
Large language models (LLMs) are increasingly integrated into creative coding, yet how users reflect, and how different co-creation conditions influence reflective behavior, remains underexplored. This study investigates situated, moment-to-moment reflection in creative coding under two prompting strategies: the entire task invocation (T1) and decomposed subtask invocation (T2), to examine their effects on reflective behavior. Our mixed-method results reveal three distinct reflection types and show that T2 encourages more frequent, strategic, and generative reflection, fostering diagnostic reasoning and goal redefinition. These findings offer insights into how LLM-based tools foster deeper creative engagement through structured, behaviorally grounded reflection support.
CLJan 24, 2024
APT-Pipe: A Prompt-Tuning Tool for Social Data Annotation using ChatGPTYiming Zhu, Zhizhuo Yin, Gareth Tyson et al.
Recent research has highlighted the potential of LLM applications, like ChatGPT, for performing label annotation on social computing text. However, it is already well known that performance hinges on the quality of the input prompts. To address this, there has been a flurry of research into prompt tuning -- techniques and guidelines that attempt to improve the quality of prompts. Yet these largely rely on manual effort and prior knowledge of the dataset being annotated. To address this limitation, we propose APT-Pipe, an automated prompt-tuning pipeline. APT-Pipe aims to automatically tune prompts to enhance ChatGPT's text classification performance on any given dataset. We implement APT-Pipe and test it across twelve distinct text classification datasets. We find that prompts tuned by APT-Pipe help ChatGPT achieve higher weighted F1-score on nine out of twelve experimented datasets, with an improvement of 7.01% on average. We further highlight APT-Pipe's flexibility as a framework by showing how it can be extended to support additional tuning mechanisms.