SPAIAug 26, 2024

A Lightweight Human Pose Estimation Approach for Edge Computing-Enabled Metaverse with Compressive Sensing

arXiv:2409.00087v1h-index: 53
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficient and accurate human pose estimation for XR and Metaverse users in wireless environments, representing an incremental improvement by combining compressive sensing with deep learning for redundancy removal.

The paper tackles the problem of transmitting IMU signals for 3D human pose estimation in edge computing-enabled Metaverse applications by proposing a compressive sensing approach that reduces redundancy and enables lightweight transmission over noisy wireless networks, achieving highly accurate poses using only 82% of the original measurements and being an order of magnitude faster than an optimization-based method.

The ability to estimate 3D movements of users over edge computing-enabled networks, such as 5G/6G networks, is a key enabler for the new era of extended reality (XR) and Metaverse applications. Recent advancements in deep learning have shown advantages over optimization techniques for estimating 3D human poses given spare measurements from sensor signals, i.e., inertial measurement unit (IMU) sensors attached to the XR devices. However, the existing works lack applicability to wireless systems, where transmitting the IMU signals over noisy wireless networks poses significant challenges. Furthermore, the potential redundancy of the IMU signals has not been considered, resulting in highly redundant transmissions. In this work, we propose a novel approach for redundancy removal and lightweight transmission of IMU signals over noisy wireless environments. Our approach utilizes a random Gaussian matrix to transform the original signal into a lower-dimensional space. By leveraging the compressive sensing theory, we have proved that the designed Gaussian matrix can project the signal into a lower-dimensional space and preserve the Set-Restricted Eigenvalue condition, subject to a power transmission constraint. Furthermore, we develop a deep generative model at the receiver to recover the original IMU signals from noisy compressed data, thus enabling the creation of 3D human body movements at the receiver for XR and Metaverse applications. Simulation results on a real-world IMU dataset show that our framework can achieve highly accurate 3D human poses of the user using only $82\%$ of the measurements from the original signals. This is comparable to an optimization-based approach, i.e., Lasso, but is an order of magnitude faster.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes