CV | Aug 16, 2024
DivDiff: A Conditional Diffusion Model for Diverse Human Motion Prediction
Hua Yu, Yaqing Hou, Wenbin Pei et al.
Diverse human motion prediction (HMP) aims to predict multiple plausible future motions given an observed human motion sequence. The task is challenging because predictions must cover the diversity of potential human motions while still describing future motions accurately. Current solutions are either low-diversity or limited in expressiveness. Recent denoising diffusion probabilistic models (DDPM) show promising capabilities in generative tasks. However, introducing DDPM directly into diverse HMP incurs some issues. Although DDPM can increase the diversity of the potential patterns of human motions, the predicted human motions become implausible over time because of the significant noise disturbances in the forward process of DDPM. This makes the predicted motions hard to control, seriously degrading their quality and restricting their practical applicability in real-world scenarios. To alleviate this, we propose a novel conditional diffusion-based generative model, called DivDiff, to predict more diverse and realistic human motions. Specifically, DivDiff employs DDPM as its backbone and incorporates Discrete Cosine Transform (DCT) and transformer mechanisms to encode the observed human motion sequence as a condition that guides the reverse process of DDPM. More importantly, we design a diversified reinforcement sampling function (DRSF) to enforce human skeletal constraints on the predicted human motions. DRSF uses information acquired from the human skeleton as prior knowledge, thereby reducing the significant disturbances introduced during the forward process. Extensive experiments on two widely used datasets (Human3.6M and HumanEva-I) demonstrate that our model obtains competitive performance on both diversity and accuracy.
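To make the DCT step concrete, the sketch below applies an orthonormal DCT-II to a single joint coordinate over time and reconstructs it from a few low-frequency coefficients. It illustrates only the transform itself, not DivDiff's encoder; the trajectory `observed`, the 20-frame length, and the choice of keeping 8 coefficients are assumptions made for illustration.

```python
import math

def dct_ii(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (t + 0.5) * k / n)
                    for t, x in enumerate(signal))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs

def idct_ii(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    signal = []
    for t in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (t + 0.5) * k / n)
        signal.append(total)
    return signal

# Hypothetical observed trajectory: one joint coordinate over 20 frames.
observed = [math.sin(0.2 * t) for t in range(20)]
coeffs = dct_ii(observed)

# Keep only the 8 lowest-frequency coefficients (an arbitrary choice here)
# and zero the rest, then map back to the time domain.
truncated = coeffs[:8] + [0.0] * (len(coeffs) - 8)
reconstructed = idct_ii(truncated)
max_error = max(abs(a - b) for a, b in zip(observed, reconstructed))
print(f"max reconstruction error with 8 coefficients: {max_error:.4f}")
```

Because smooth motion concentrates its energy in the low-frequency coefficients, a truncated DCT gives a compact description of the observed sequence, which is one reason the transform is a common choice for encoding motion.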
LG | Sep 9, 2023
Learning Spiking Neural Network from Easy to Hard task
Lingling Tang, Jiangtao Hu, Hua Yu et al.
Starting with small and simple concepts and gradually introducing complex and difficult ones is the natural process of human learning. Spiking Neural Networks (SNNs) aim to mimic the way humans process information, but current SNN models treat all samples equally, which does not align with the principles of human learning and overlooks the biological plausibility of SNNs. To address this, we propose a CL-SNN model that introduces Curriculum Learning (CL) into SNNs, making SNNs learn more like humans and providing higher biological interpretability. CL is a training strategy that advocates presenting easier data to models before gradually introducing more challenging data, mimicking the human learning process. We use a confidence-aware loss to measure and process samples with different difficulty levels. By learning the confidence of different samples, the model automatically reduces the contribution of difficult samples to parameter optimization. We conducted experiments on the static image datasets MNIST, Fashion-MNIST, and CIFAR10, and on the neuromorphic datasets N-MNIST, CIFAR10-DVS, and DVS-Gesture. The results are promising. To the best of our knowledge, this is the first proposal to enhance the biological plausibility of SNNs by introducing CL.
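The abstract does not give the form of the confidence-aware loss. As a minimal sketch of the general idea, the example below assumes a hypothetical uncertainty-style weighting in which each sample has a learnable scalar `s`, and its task loss is scaled by `exp(-s)` with `s` added as a penalty. This is a stand-in chosen for simplicity, not the loss used in CL-SNN; the function names and the two loss values are illustrative.

```python
import math

def confidence_weighted_loss(task_loss, s):
    """Scale the task loss by exp(-s); the +s term keeps s from growing without bound."""
    return task_loss * math.exp(-s) + s

def fit_log_uncertainty(task_loss, steps=500, lr=0.1):
    """Gradient descent on s for a fixed task loss.

    The derivative of task_loss * exp(-s) + s with respect to s is
    -task_loss * exp(-s) + 1, so the minimiser is s = log(task_loss).
    """
    s = 0.0
    for _ in range(steps):
        grad = -task_loss * math.exp(-s) + 1.0
        s -= lr * grad
    return s

# Hypothetical per-sample task losses: one easy sample, one hard sample.
easy_loss, hard_loss = 0.2, 3.0
s_easy = fit_log_uncertainty(easy_loss)
s_hard = fit_log_uncertainty(hard_loss)

# exp(-s) is the factor applied to the sample's gradient on the model parameters.
w_easy = math.exp(-s_easy)
w_hard = math.exp(-s_hard)
print(f"weight on easy sample: {w_easy:.3f}, weight on hard sample: {w_hard:.3f}")
```

Under this assumed form, a sample with a larger task loss learns a larger `s` and therefore a smaller weight, which matches the described behaviour of reducing the contribution of difficult samples automatically.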
NE | Apr 4
TransGP: Task-Conditioned Transformer-Guided Genetic Programming for Multitask Dynamic Flexible Job Shop Scheduling
Meng Xu, Jiao Liu, Hua Yu et al.
Hyper-heuristics have become a popular approach for solving dynamic flexible job shop scheduling (DFJSS) problems. They use gradient-free optimization techniques like Genetic Programming (GP) to evolve non-differentiable heuristics. However, conventional GP methods tend to converge slowly because they rely solely on evolutionary search to find good heuristics. Existing multitask GP methods can solve multiple tasks simultaneously and speed up the search by transferring knowledge across similar tasks. But they mostly exchange heuristic building blocks without truly generating heuristics conditioned on task information. In this paper, we aim to accelerate convergence and enable task-specific heuristic generation by incorporating a task-conditioned Transformer model. The Transformer works in two ways. First, it learns the distribution of elite heuristics, biasing the search toward promising regions of the heuristic space. Second, through conditional generation, it produces heuristics tailored to specific tasks, allowing the model to handle multiple scheduling tasks at once and improving overall optimization efficiency. Based on these ideas, we propose TransGP, a Task-Conditioned Transformer-Guided GP framework. This evolutionary paradigm integrates generative modeling with GP, enabling efficient multitask heuristic learning and knowledge transfer. We evaluate TransGP on a range of DFJSS scenarios. Experimental results show that TransGP consistently outperforms multitask GP baselines, widely used handcrafted heuristics, and the pure Transformer model, achieving faster convergence, superior solution quality, and enhanced robustness.
CV | Aug 3, 2025
A Spatio-temporal Continuous Network for Stochastic 3D Human Motion Prediction
Hua Yu, Yaqing Hou, Xu Gui et al.
Stochastic Human Motion Prediction (HMP) has received increasing attention due to its wide applications. Despite rapid progress in generative modeling, existing methods often face challenges in learning continuous temporal dynamics and predicting stochastic motion sequences. They tend to overlook the flexibility inherent in complex human motions and are prone to mode collapse. To alleviate these issues, we propose a novel two-stage method called STCN for stochastic and continuous human motion prediction. In the first stage, we propose a spatio-temporal continuous network to generate smoother human motion sequences. In addition, an anchor set, which represents the potential human motion patterns, is introduced into the stochastic HMP task to prevent mode collapse. In the second stage, STCN learns a Gaussian mixture model (GMM) of observed motion sequences with the aid of the anchor set. It also models the probability associated with each anchor and samples multiple sequences from each anchor to alleviate intra-class differences in human motions. Experimental results on two widely used datasets (Human3.6M and HumanEva-I) demonstrate that our model obtains competitive performance on both diversity and accuracy.
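The difference between sampling several sequences from every anchor and ordinary mixture sampling can be sketched as follows. This is an illustration under stated assumptions, not STCN's implementation: the anchors are treated as means of isotropic Gaussian components in a hypothetical 2-D latent space, and `anchors`, `weights`, and `sigma` are made-up values.

```python
import random

def sample_per_anchor(anchors, sigma, samples_per_anchor, rng):
    """Draw the same number of latent codes around every anchor."""
    samples = []
    for anchor_id, mean in enumerate(anchors):
        for _ in range(samples_per_anchor):
            latent = [rng.gauss(m, sigma) for m in mean]
            samples.append((anchor_id, latent))
    return samples

def sample_from_mixture(anchors, weights, sigma, n, rng):
    """Ordinary mixture sampling: pick a component by weight, then sample from it."""
    ids = rng.choices(range(len(anchors)), weights=weights, k=n)
    return [(i, [rng.gauss(m, sigma) for m in anchors[i]]) for i in ids]

rng = random.Random(0)
anchors = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]  # hypothetical motion-pattern anchors
weights = [0.90, 0.05, 0.05]                     # hypothetical anchor probabilities

per_anchor = sample_per_anchor(anchors, sigma=0.3, samples_per_anchor=4, rng=rng)
mixture = sample_from_mixture(anchors, weights, sigma=0.3, n=12, rng=rng)

covered_per_anchor = {anchor_id for anchor_id, _ in per_anchor}
covered_mixture = {anchor_id for anchor_id, _ in mixture}
print("anchors covered, per-anchor sampling:", sorted(covered_per_anchor))
print("anchors covered, mixture sampling:   ", sorted(covered_mixture))
```

Per-anchor sampling covers every anchor by construction, whereas mixture sampling with skewed weights can leave low-probability anchors unsampled in a small batch; the anchor probabilities remain available for ranking or weighting the resulting sequences.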
HC | Sep 24, 2025
MazeMate: An LLM-Powered Chatbot to Support Computational Thinking in Gamified Programming Learning
Chenyu Hou, Hua Yu, Gaoxia Zhu et al.
Computational Thinking (CT) is a foundational problem-solving skill, and gamified programming environments are a widely adopted approach to cultivating it. While large language models (LLMs) provide on-demand programming support, current applications rarely foster CT development. We present MazeMate, an LLM-powered chatbot embedded in a 3D Maze programming game, designed to deliver adaptive, context-sensitive scaffolds aligned with CT processes in maze solving and maze design. We report on the first classroom implementation with 247 undergraduates. Students rated MazeMate as moderately helpful, with higher perceived usefulness for maze solving than for maze design. Thematic analysis confirmed support for CT processes such as decomposition, abstraction, and algorithmic thinking, while also revealing limitations in supporting maze design, including mismatched suggestions and fabricated algorithmic solutions. These findings demonstrate the potential of LLM-based scaffolding to support CT and underscore directions for design refinement to enhance MazeMate usability in authentic classrooms.
GR | Aug 3, 2025
A Plug-and-Play Multi-Criteria Guidance for Diverse In-Betweening Human Motion Generation
Hua Yu, Jiao Liu, Xu Gui et al.
In-betweening human motion generation aims to synthesize intermediate motions that transition between user-specified keyframes. In addition to maintaining smooth transitions, a crucial requirement of this task is to generate diverse motion sequences. Maintaining diversity remains challenging, particularly when the motions within a generated batch must differ meaningfully from one another despite complex motion dynamics. In this paper, we propose a novel method, termed the Multi-Criteria Guidance with In-Betweening Motion Model (MCG-IMM), for in-betweening human motion generation. A key strength of MCG-IMM lies in its plug-and-play nature: it enhances the diversity of motions generated by pretrained models without introducing additional parameters. This is achieved by providing the sampling process of pretrained generative models with multi-criteria guidance. Specifically, MCG-IMM reformulates the sampling process of a pretrained generative model as a multi-criteria optimization problem and introduces an optimization process to explore motion sequences that satisfy multiple criteria, e.g., diversity and smoothness. Moreover, our proposed plug-and-play multi-criteria guidance is compatible with different families of generative models, including denoising diffusion probabilistic models, variational autoencoders, and generative adversarial networks. Experiments on four popular human motion datasets demonstrate that MCG-IMM consistently outperforms state-of-the-art methods on the in-betweening motion generation task.
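To illustrate what satisfying multiple criteria can mean in practice, the sketch below applies Pareto (non-dominated) filtering, a standard multi-criteria technique, to candidates scored on two criteria. It is not claimed to be the MCG-IMM optimization procedure; the score pairs are hypothetical (diversity, smoothness) values with higher meaning better.

```python
def dominates(a, b):
    """True if a is at least as good as b on every criterion and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Indices of candidates that no other candidate dominates."""
    front = []
    for i, s in enumerate(scores):
        dominated = any(dominates(other, s)
                        for j, other in enumerate(scores) if j != i)
        if not dominated:
            front.append(i)
    return front

# Hypothetical (diversity, smoothness) scores for five candidate motion sequences.
candidates = [(0.9, 0.2), (0.6, 0.7), (0.3, 0.9), (0.5, 0.5), (0.2, 0.1)]
front = pareto_front(candidates)
print("non-dominated candidates:", front)
```

Candidates on the front represent different trade-offs between the criteria, so no single weighting of diversity against smoothness has to be fixed in advance.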