Kening Zhu

13.3 HC Mar 23

AnkleType: A Hands- and Eyes-free Foot-based Text Entry Technique in Virtual Reality

Xiyun Luo, Weirong Luo, Kening Zhu et al.

Virtual Reality (VR) emphasizes immersive experiences, yet text entry often requires the hands or visual attention, which may disrupt the interaction flow in VR. We present AnkleType, a hands- and eyes-free text-entry technique that leverages ankle-based gestures in both standing and sitting postures. We began with two preliminary studies: one investigated the movement range of users' ankles, and the other elicited user-preferred ankle gestures for text-entry-related operations. The findings of these two studies guided our design of AnkleType. To optimize AnkleType's keyboard layout for eyes-free input, we conducted a user study to capture users' natural ankle spatial awareness, together with a computer-simulated language test. Through a pairwise comparison study, we designed a bipedal input strategy for sitting (BPSit) and a unipedal input strategy for standing (UPStand). Our first in-VR text-entry evaluation with 16 participants showed that first-time users reached average typing speeds of 8.99 WPM with BPSit and 9.13 WPM with UPStand. We further evaluated our design in a 7-day longitudinal study with 12 participants. Participants achieved average typing speeds of 15.05 WPM with UPStand and 16.70 WPM with BPSit in the visual condition, and 11.15 WPM and 12.87 WPM, respectively, in the eyes-free condition.
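For readers unfamiliar with the WPM figures quoted above, the following is a minimal sketch of how words per minute is commonly computed in text-entry research. It assumes the conventional definition (one "word" is five characters, and the first character is excluded because timing starts when it is entered); the abstract does not state which formula the authors used, and the function name `words_per_minute` is hypothetical.

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Conventional text-entry WPM (an assumption, not taken from the paper).

    One word is counted as five characters, including spaces. The first
    character is excluded because the timer starts when it is entered.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if not transcribed:
        return 0.0
    words = (len(transcribed) - 1) / 5.0   # characters -> five-character words
    minutes = seconds / 60.0               # seconds -> minutes
    return words / minutes


if __name__ == "__main__":
    # A 76-character phrase entered in one minute.
    print(words_per_minute("x" * 76, 60.0))
```

Under this definition, doubling the entry time for the same phrase halves the reported speed.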

CV Jul 12, 2021

Visual-Tactile Cross-Modal Data Generation using Residue-Fusion GAN with Feature-Matching and Perceptual Losses

Shaoyu Cai, Kening Zhu, Yuki Ban et al.

Existing psychophysical studies have shown that cross-modal visual-tactile perception is common when humans perform daily activities. However, it remains challenging to build an algorithmic mapping from one modality space to the other, namely cross-modal visual-tactile data translation/generation, which could be important for robotic operation. In this paper, we propose a deep-learning-based approach for cross-modal visual-tactile data generation that builds on the framework of generative adversarial networks (GANs). Our approach takes the visual image of a material surface as the visual data, and the accelerometer signal induced by a pen sliding across the surface as the tactile data. We adopt the conditional-GAN (cGAN) structure together with a residue-fusion (RF) module, and train the model with additional feature-matching (FM) and perceptual losses to achieve cross-modal data generation. The experimental results show that including the RF module and the FM and perceptual losses significantly improves cross-modal data generation performance, measured by the classification accuracy on the generated data and the visual similarity between the ground-truth and generated data.
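To illustrate the feature-matching (FM) term mentioned above, here is a minimal sketch of the formulation commonly used with GANs: the mean absolute difference between discriminator features extracted from real and generated samples, averaged over layers. This is an assumption about the general technique, not the authors' implementation; the abstract does not give the exact loss, and the function name `feature_matching_loss` and the flat-list feature representation are hypothetical simplifications.

```python
from typing import Sequence


def feature_matching_loss(
    real_feats: Sequence[Sequence[float]],
    fake_feats: Sequence[Sequence[float]],
) -> float:
    """Average per-layer L1 distance between real and generated features.

    Each element of real_feats / fake_feats is one discriminator layer's
    activations, flattened to a list of floats (a simplification).
    """
    if len(real_feats) != len(fake_feats) or not real_feats:
        raise ValueError("need the same non-zero number of layers")
    total = 0.0
    for real, fake in zip(real_feats, fake_feats):
        if len(real) != len(fake) or not real:
            raise ValueError("layer feature sizes must match and be non-empty")
        # Mean absolute difference for this layer.
        total += sum(abs(r - f) for r, f in zip(real, fake)) / len(real)
    # Average over layers so deeper discriminators do not inflate the loss.
    return total / len(real_feats)


if __name__ == "__main__":
    real = [[1.0, 2.0], [0.0, 0.0, 3.0]]
    fake = [[1.0, 4.0], [0.0, 0.0, 0.0]]
    print(feature_matching_loss(real, fake))
```

In a full training loop this term would typically be added to the adversarial loss with a weighting factor; any such weight would be a tuning choice not stated in the abstract.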

2 Papers