CV | LG | ML | Mar 15, 2019

Through-Wall Pose Imaging in Real-Time with a Many-to-Many Encoder/Decoder Paradigm

arXiv:1904.00739v2
Originality: Highly original
AI Analysis

The paper tackles the problem of reconstructing human pose through walls from RF signals, producing real-time video of a 15-point skeleton whose predictions are accurate and complete despite visual occlusion.

This enables see-through vision for applications such as security or rescue, and it represents a novel approach rather than an incremental improvement.

Overcoming the visual barrier and developing "see-through vision" has been one of mankind's long-standing dreams. Unlike visible light, Radio Frequency (RF) signals penetrate opaque obstructions and reflect highly off humans. This paper establishes a deep-learning model that can be trained to reconstruct continuous video of a 15-point human skeleton even through visual occlusion. The training process adopts a student/teacher learning procedure inspired by the Feynman learning technique, in which video frames and RF data are first collected simultaneously using a co-located setup containing an optical camera and an RF antenna array transceiver. Next, the video frames are processed with a computer-vision-based gait analysis "teacher" module to generate ground-truth human skeletons for each frame. Then, the same type of skeleton is predicted from the corresponding RF data using a "student" deep-learning model consisting of a Residual Convolutional Neural Network (CNN), a Region Proposal Network (RPN), and a Recurrent Neural Network with Long Short-Term Memory (LSTM) that 1) extract spatial features from RF images, 2) detect all people present in a scene, and 3) aggregate information over many time-steps, respectively. The model is shown to both accurately and completely predict the pose of humans behind visual obstruction solely using RF signals. Primary academic contributions include the novel many-to-many imaging methodology, the unique integration of RPN and LSTM networks, and the original training pipeline.
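The student/teacher data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: every function name here is hypothetical, the residual CNN, LSTM, and output head are replaced by toy stand-ins, the RPN (multi-person detection) stage is omitted so the sketch handles one person only, and the mean squared keypoint error is an assumed loss that the abstract does not specify. What the sketch does show is the many-to-many shape of the problem: one skeleton is emitted for every RF frame, while a recurrent state carries information across time-steps.

```python
from typing import Callable, List, Sequence, Tuple

NUM_KEYPOINTS = 15  # 15-point skeleton, as stated in the abstract

Point = Tuple[float, float]
Skeleton = List[Point]
Features = List[float]


def teacher_labels(video_frames: Sequence[object],
                   pose_estimator: Callable[[object], Skeleton]) -> List[Skeleton]:
    """Teacher: label every video frame with a skeleton (hypothetical estimator)."""
    return [pose_estimator(frame) for frame in video_frames]


def student_predict(rf_frames: Sequence[object],
                    extract: Callable[[object], Features],
                    aggregate: Callable[[Features, Features], Features],
                    decode: Callable[[Features], Skeleton],
                    initial_state: Features) -> List[Skeleton]:
    """Student, many-to-many: emit one skeleton per input RF frame."""
    state = initial_state
    skeletons = []
    for rf in rf_frames:
        features = extract(rf)              # stand-in for the residual CNN
        state = aggregate(state, features)  # stand-in for the LSTM state update
        skeletons.append(decode(state))     # stand-in for the pose output head
    return skeletons


def sequence_loss(predicted: Sequence[Skeleton],
                  target: Sequence[Skeleton]) -> float:
    """Mean squared keypoint error over all frames (assumed loss, not the paper's)."""
    if len(predicted) != len(target):
        raise ValueError("student and teacher sequences must have equal length")
    total, count = 0.0, 0
    for pred, true in zip(predicted, target):
        for (px, py), (tx, ty) in zip(pred, true):
            total += (px - tx) ** 2 + (py - ty) ** 2
            count += 1
    return total / count if count else 0.0


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_extract(rf: Sequence[float]) -> Features:
    """Pass the flattened RF frame through unchanged."""
    return list(rf)


def toy_aggregate(state: Features, features: Features,
                  alpha: float = 0.5) -> Features:
    """Exponential moving average as a crude substitute for recurrent memory."""
    return [alpha * f + (1 - alpha) * s for s, f in zip(state, features)]


def toy_decode(state: Features) -> Skeleton:
    """Read consecutive value pairs of the state as (x, y) keypoints."""
    return [(state[2 * i], state[2 * i + 1]) for i in range(NUM_KEYPOINTS)]
```

In a training loop, `teacher_labels` would be applied to the camera frames and `student_predict` to the simultaneously recorded RF frames, with `sequence_loss` between the two sequences driving the update of the student's parameters. The toy stand-ins have no trainable parameters, so they only demonstrate the interfaces.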

