ROApr 15, 2022
Ensemble diverse hypotheses and knowledge distillation for unsupervised cross-subject adaptationKuangen Zhang, Jiahong Chen, Jing Wang et al.
Recognizing human locomotion intent and activities is important for controlling the wearable robots while walking in complex environments. However, human-robot interface signals are usually user-dependent, which causes that the classifier trained on source subjects performs poorly on new subjects. To address this issue, this paper designs the ensemble diverse hypotheses and knowledge distillation (EDHKD) method to realize unsupervised cross-subject adaptation. EDH mitigates the divergence between labeled data of source subjects and unlabeled data of target subjects to accurately classify the locomotion modes of target subjects without labeling data. Compared to previous domain adaptation methods based on the single learner, which may only learn a subset of features from input signals, EDH can learn diverse features by incorporating multiple diverse feature generators and thus increases the accuracy and decreases the variance of classifying target data, but it sacrifices the efficiency. To solve this problem, EDHKD (student) distills the knowledge from the EDH (teacher) to a single network to remain efficient and accurate. The performance of the EDHKD is theoretically proved and experimentally validated on a 2D moon dataset and two public human locomotion datasets. Experimental results show that the EDHKD outperforms all other methods. The EDHKD can classify target data with 96.9%, 94.4%, and 97.4% average accuracy on the above three datasets with a short computing time (1 ms). Compared to a benchmark (BM) method, the EDHKD increases 1.3% and 7.1% average accuracy for classifying the locomotion modes of target subjects. The EDHKD also stabilizes the learning curves. Therefore, the EDHKD is significant for increasing the generalization ability and efficiency of the human intent prediction and human activity recognition system, which will improve human-robot interactions.
SPDec 4, 2024
Multi-Branch Mutual-Distillation Transformer for EEG-Based Seizure Subtype ClassificationRuimin Peng, Zhenbang Du, Changming Zhao et al.
Cross-subject electroencephalogram (EEG) based seizure subtype classification is very important in precise epilepsy diagnostics. Deep learning is a promising solution, due to its ability to automatically extract latent patterns. However, it usually requires a large amount of training data, which may not always be available in clinical practice. This paper proposes Multi-Branch Mutual-Distillation (MBMD) Transformer for cross-subject EEG-based seizure subtype classification, which can be effectively trained from small labeled data. MBMD Transformer replaces all even-numbered encoder blocks of the vanilla Vision Transformer by our designed multi-branch encoder blocks. A mutual-distillation strategy is proposed to transfer knowledge between the raw EEG data and its wavelets of different frequency bands. Experiments on two public EEG datasets demonstrated that our proposed MBMD Transformer outperformed several traditional machine learning and state-of-the-art deep learning approaches. To our knowledge, this is the first work on knowledge distillation for EEG-based seizure subtype classification.
CVJan 8, 2024
D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose RefinementDanqi Yan, Qing Gao, Yuepeng Qian et al.
Three-dimensional (3D) human pose estimation using a monocular camera has gained increasing attention due to its ease of implementation and the abundance of data available from daily life. However, owing to the inherent depth ambiguity in images, the accuracy of existing monocular camera-based 3D pose estimation methods remains unsatisfactory, and the estimated 3D poses usually include much noise. By observing the histogram of this noise, we find each dimension of the noise follows a certain distribution, which indicates the possibility for a neural network to learn the mapping between noisy poses and ground truth poses. In this work, in order to obtain more accurate 3D poses, a Diffusion-based 3D Pose Refiner (D3PRefiner) is proposed to refine the output of any existing 3D pose estimator. We first introduce a conditional multivariate Gaussian distribution to model the distribution of noisy 3D poses, using paired 2D poses and noisy 3D poses as conditions to achieve greater accuracy. Additionally, we leverage the architecture of current diffusion models to convert the distribution of noisy 3D poses into ground truth 3D poses. To evaluate the effectiveness of the proposed method, two state-of-the-art sequence-to-sequence 3D pose estimators are used as basic 3D pose estimation models, and the proposed method is evaluated on different types of 2D poses and different lengths of the input sequence. Experimental results demonstrate the proposed architecture can significantly improve the performance of current sequence-to-sequence 3D pose estimators, with a reduction of at least 10.3% in the mean per joint position error (MPJPE) and at least 11.0% in the Procrustes MPJPE (P-MPJPE).
RODec 16, 2020
Natural grasp intention recognition based on gaze fixation in human-robot interactionBo Yang, Jian Huang, Xiaolong Li et al.
Eye movement is closely related to limb actions, so it can be used to infer movement intentions. More importantly, in some cases, eye movement is the only way for paralyzed and impaired patients with severe movement disorders to communicate and interact with the environment. Despite this, eye-tracking technology still has very limited application scenarios as an intention recognition method. The goal of this paper is to achieve a natural fixation-based grasping intention recognition method, with which a user with hand movement disorders can intuitively express what tasks he/she wants to do by directly looking at the object of interest. Toward this goal, we design experiments to study the relationships of fixations in different tasks. We propose some quantitative features from these relationships and analyze them statistically. Then we design a natural method for grasping intention recognition. The experimental results prove that the accuracy of the proposed method for the grasping intention recognition exceeds 89\% on the training objects. When this method is extendedly applied to objects not included in the training set, the average accuracy exceeds 85\%. The grasping experiment in the actual environment verifies the effectiveness of the proposed method.