74.9 RO Mar 10
ReTac-ACT: A State-Gated Vision-Tactile Fusion Transformer for Precision Assembly
Minchi Ruan, LiangQing Zhou, Hongtong Li et al.
Precision assembly requires sub-millimeter corrections in contact-rich "last-millimeter" regions where visual feedback fails due to occlusion from the end-effector and workpiece. We present ReTac-ACT (Reconstruction-enhanced Tactile ACT), a vision-tactile imitation learning policy that addresses this challenge through three synergistic mechanisms: (i) bidirectional cross-attention enabling reciprocal visuo-tactile feature enhancement before fusion, (ii) a proprioception-conditioned gating network that dynamically elevates tactile reliance when visual occlusion occurs, and (iii) a tactile reconstruction objective enforcing learning of manipulation-relevant contact information rather than generic visual textures. Evaluated on the standardized NIST Assembly Task Board M1 benchmark, ReTac-ACT achieves 90% peg-in-hole success, substantially outperforming vision-only and generalist baseline methods, and maintains 80% success at industrial-grade 0.1mm clearance. Ablation studies validate that each architectural component is indispensable. The ReTac-ACT codebase and a vision-tactile demonstration dataset covering various clearance levels with both visual and tactile features will be released to support reproducible research.
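The gating idea in mechanism (ii) can be illustrated with a minimal sketch. The abstract does not specify the gating network, so everything below is an assumption: a single scalar gate computed from one hypothetical proprioceptive input (insertion depth in mm) with hand-picked weights, blending toy two-dimensional visual and tactile feature vectors.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(visual, tactile, proprio, weights, bias):
    """Blend two feature vectors with a scalar gate driven by proprioception.

    gate = sigmoid(weights . proprio + bias); a gate near 1 favors the
    tactile features, a gate near 0 favors the visual features.
    """
    logit = sum(w * p for w, p in zip(weights, proprio)) + bias
    gate = sigmoid(logit)
    fused = [gate * t + (1.0 - gate) * v for v, t in zip(visual, tactile)]
    return gate, fused


# Hypothetical parameters (not from the paper): the gate opens as the
# assumed insertion depth grows, i.e. as occlusion becomes more likely.
weights, bias = [2.0], -4.0
visual = [1.0, 0.0]   # toy visual feature vector
tactile = [0.0, 1.0]  # toy tactile feature vector

g_far, fused_far = gated_fusion(visual, tactile, [0.0], weights, bias)
g_near, fused_near = gated_fusion(visual, tactile, [4.0], weights, bias)
print(f"gate before contact: {g_far:.3f}, gate during insertion: {g_near:.3f}")
```

In a learned policy the weights would be trained end to end and the gate would likely be per-feature rather than scalar; the sketch only shows how a proprioceptive signal can shift reliance from vision to touch.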
HC Aug 2, 2021
Leveraging Multiple Legacy Wi-Fi Links for Human Behavior Sensing
Lingchao Guo, Zhaoming Lu, Xiangming Wen et al.
Taking advantage of the rich information provided by Wi-Fi measurement setups, Wi-Fi-based human behavior sensing leveraging Channel State Information (CSI) measurements has received a lot of research attention in recent years. CSI-based human sensing algorithms typically either rely on an explicit channel propagation model or, more recently, adopt machine learning to make feature extraction more robust. In most related work, the considered CSI is extracted from a single dedicated Access Point (AP) communication setup. In this paper, we consider a more realistic setting where a legacy network of multiple APs is already deployed for communication purposes and is leveraged for sensing using machine learning. The use of a legacy network presents both challenges and opportunities, as the many Wi-Fi links provide richer yet unequally useful data sets. To break the curse of dimensionality associated with training on excessively high-dimensional CSI, we propose a link selection mechanism based on Reinforcement Learning (RL) that reduces dimensionality while preserving the data most relevant for human behavior sensing. The method is based on a sequential decision-making process in which the CSI is modeled as part of the state. Experimental results show that our method performs better than state-of-the-art approaches in a scenario with multiple available Wi-Fi links.
CV Dec 28, 2020
From Point to Space: 3D Moving Human Pose Estimation Using Commodity WiFi
Yiming Wang, Lingchao Guo, Zhaoming Lu et al.
In this paper, we present Wi-Mose, the first 3D moving human pose estimation system using commodity WiFi. Previous WiFi-based works have achieved 2D and 3D pose estimation, but they either capture poses from a single perspective or construct poses of people standing at a fixed point, which prevents their wide adoption in daily scenarios. To reconstruct 3D poses of people who move throughout the space rather than stay at a fixed point, we fuse the amplitude and phase into Channel State Information (CSI) images, which provide both pose and position information. In addition, we design a neural network that extracts features associated only with poses from the CSI images and then converts these features into key-point coordinates. Experimental results show that Wi-Mose localizes key points with a Procrustes analysis Mean Per Joint Position Error (P-MPJPE) of 29.7 mm and 37.8 mm in the Line of Sight (LoS) and Non-Line of Sight (NLoS) scenarios, respectively, outperforming the state-of-the-art method. The results indicate that Wi-Mose can capture high-precision 3D human poses throughout the space.
CV Dec 22, 2020
Subject-independent Human Pose Image Construction with Commodity Wi-Fi
Shuang Zhou, Lingchao Guo, Zhaoming Lu et al.
Recently, commodity Wi-Fi devices have been shown to be able to construct human pose images, i.e., human skeletons, as fine-grained as those from cameras. Existing papers achieve good results when constructing images of subjects who appear in the training samples. However, performance drops for new subjects, i.e., subjects who are not in the training samples. This paper focuses on solving the subject-generalization problem in human pose image construction. To this end, we define the subject as the domain. We then design a Domain-Independent Neural Network (DINN) to extract subject-independent features and convert them into fine-grained human pose images. We also propose a novel training method for the DINN that has no re-training overhead compared with the domain-adversarial approach. We build a prototype system, and experimental results demonstrate that it can construct fine-grained human pose images of new subjects with commodity Wi-Fi in both visible and through-wall scenarios, which shows the effectiveness and the subject-generalization ability of our model.
SP Nov 13, 2020
CentiTrack: Towards Centimeter-Level Passive Gesture Tracking with Commodity WiFi
Zijun Han, Zhaoming Lu, Xiangming Wen et al.
Gesture awareness plays a crucial role in advancing human-computer interaction. Previous works either depend on customized hardware or need a priori learning of wireless signal patterns, and thus suffer from drawbacks in privacy, availability, and reliability. In this paper, we propose CentiTrack, the first centimeter-level passive gesture tracking system that works with only three commodity WiFi devices, without any extra hardware modifications or wearable sensors. To this end, we first identify the Channel State Information (CSI) measurement error sources in the physical layer process, and then denoise the CSI using the complex ratio between adjacent antennas. Principal Component Analysis (PCA) is further adopted to separate the reflected signals from noise. Benchmark experiments verify that the phase changes of the denoised CSI are proportional to the length changes of the dynamic path reflected off the hand. In addition, we adopt the Multiple Signal Classification (MUSIC) algorithm to estimate the Angles of Arrival (AoAs) of the dynamic paths, and then locate the initial position of the hand by triangulation. We also propose a novel static component elimination algorithm that corrects tracking by removing the components unrelated to motion. A prototype of CentiTrack is fully realized and evaluated in various real scenarios. Extensive experiments show that CentiTrack is superior to state-of-the-art systems in terms of tracking accuracy, sensing range, and device cost.
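Two steps in this abstract lend themselves to a small numeric sketch: the complex ratio between adjacent antennas, and the proportionality between phase change and path-length change. The sketch below is not the CentiTrack implementation. It assumes the measurement error acts as a random phase rotation shared by both antennas of one receiver, uses an assumed 5.32 GHz carrier, and reports only the magnitude of the path-length change (the sign depends on the phase convention). The CSI values and function names are made up for illustration.

```python
import cmath
import math
import random

SPEED_OF_LIGHT = 299_792_458.0          # m/s
CARRIER_HZ = 5.32e9                     # assumed carrier frequency
WAVELENGTH = SPEED_OF_LIGHT / CARRIER_HZ


def csi_ratio(h_a: complex, h_b: complex) -> complex:
    """Complex ratio of the CSI from two adjacent antennas."""
    return h_a / h_b


def path_length_change(phase_change_rad: float, wavelength: float = WAVELENGTH) -> float:
    """Magnitude of path-length change: a 2*pi rotation is one wavelength."""
    return abs(phase_change_rad) * wavelength / (2 * math.pi)


# Toy channels: a static component plus one hand reflection per antenna.
clean_a = 1.0 + 0.3 * cmath.exp(-2j * math.pi * 0.40 / WAVELENGTH)
clean_b = 0.8 + 0.2 * cmath.exp(-2j * math.pi * 0.43 / WAVELENGTH)

# Assumed error model: one random phase offset common to both antennas.
random.seed(0)
offset = cmath.exp(1j * random.uniform(-math.pi, math.pi))
noisy_a, noisy_b = clean_a * offset, clean_b * offset

# The shared offset cancels in the ratio.
ratio_error = abs(csi_ratio(noisy_a, noisy_b) - csi_ratio(clean_a, clean_b))

# A half-turn of phase corresponds to half a wavelength of path length.
half_turn_cm = 100 * path_length_change(math.pi)
print(f"ratio error: {ratio_error:.2e}, half-turn path change: {half_turn_cm:.2f} cm")
```

At the assumed carrier the wavelength is roughly 5.6 cm, which is why resolving fractions of a phase cycle is consistent with centimeter-level tracking.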
SP Sep 21, 2020
When Healthcare Meets Off-the-Shelf WiFi: A Non-Wearable and Low-Costs Approach for In-Home Monitoring
Lingchao Guo, Zhaoming Lu, Shuang Zhou et al.
As the elderly population grows, social and health care begin to face validation challenges, and in-home monitoring is becoming a focus for professionals in the field. Governments urgently need to improve the quality of healthcare services at lower cost while ensuring the comfort and independence of the elderly. This work presents an in-home monitoring approach based on off-the-shelf WiFi that is low-cost and non-wearable and makes all-round daily healthcare information available to caregivers. The proposed approach can capture fine-grained human pose figures, even through a wall, and simultaneously track detailed respiration status using off-the-shelf WiFi devices. From these, caregivers can directly see the behavioral data, the physiological data, and the derived information (e.g., abnormal events and underlying diseases) of the elderly. We design a series of signal processing methods and a neural network to capture human pose figures and extract respiration status curves from WiFi Channel State Information (CSI). Extensive experiments show that off-the-shelf WiFi devices are capable of capturing fine-grained human pose figures, similar to cameras, even through a wall, and of accurately tracking respiration status, thus demonstrating the effectiveness and feasibility of our approach for in-home monitoring.
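The abstract does not describe how the respiration curves are extracted, so the sketch below is only a generic illustration, not the authors' method. It estimates a breathing rate by finding the strongest spectral peak in an assumed 0.1 to 0.5 Hz band of a synthetic stand-in for a CSI amplitude stream. The function name, band limits, sampling rate, and signal are all assumptions.

```python
import math


def dominant_frequency(samples, sample_rate_hz, f_min=0.1, f_max=0.5):
    """Return the frequency (Hz) with the largest DFT power inside [f_min, f_max]."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the static (DC) component
    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate_hz / n
        if not (f_min <= freq <= f_max):
            continue
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(centered))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq


# Synthetic stand-in for a CSI amplitude stream: a 0.25 Hz breathing
# component plus a weaker 1.2 Hz disturbance, sampled at 10 Hz for 60 s.
sample_rate_hz = 10.0
num_samples = 600
signal = [
    math.sin(2 * math.pi * 0.25 * i / sample_rate_hz)
    + 0.2 * math.sin(2 * math.pi * 1.2 * i / sample_rate_hz)
    for i in range(num_samples)
]

breaths_per_minute = 60 * dominant_frequency(signal, sample_rate_hz)
print(f"estimated respiration rate: {breaths_per_minute:.1f} breaths/min")
```

Restricting the search to a plausible breathing band is what keeps the faster disturbance from being picked up; real CSI would need denoising and subcarrier selection first.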