Quanli Gao

h-index5
2papers

2 Papers

CVMay 2, 2024
HandS3C: 3D Hand Mesh Reconstruction with State Space Spatial Channel Attention from RGB images

Zixun Jiao, Xihan Wang, Zhaoqiang Xia et al.

Reconstructing the hand mesh from one single RGB image is a challenging task because hands are often occluded by other objects. Most previous works attempt to explore more additional information and adopt attention mechanisms for improving 3D reconstruction performance, while it would increase computational complexity simultaneously. To achieve a performance-reserving architecture with high computational efficiency, in this work, we propose a simple but effective 3D hand mesh reconstruction network (i.e., HandS3C), which is the first time to incorporate state space model into the task of hand mesh reconstruction. In the network, we design a novel state-space spatial-channel attention module that extends the effective receptive field, extracts hand features in the spatial dimension, and enhances regional features of hands in the channel dimension. This helps to reconstruct a complete and detailed hand mesh. Extensive experiments conducted on well-known datasets facing heavy occlusions (such as FREIHAND, DEXYCB, and HO3D) demonstrate that our proposed HandS3C achieves state-of-the-art performance while maintaining a minimal parameters.

CVOct 20, 2021
ESOD:Edge-based Task Scheduling for Object Detection

Yihao Wang, Ling Gao, Jie Ren et al.

Object Detection on the mobile system is a challenge in terms of everything. Nowadays, many object detection models have been designed, and most of them concentrate on precision. However, the computation burden of those models on mobile systems is unacceptable. Researchers have designed some lightweight networks for mobiles by sacrificing precision. We present a novel edge-based task scheduling framework for object detection (termed as ESOD). In detail, we train a DNN model (termed as pre-model) to predict which object detection model to use for the coming task and offloads to which edge servers by physical characteristics of the image task (e.g., brightness, saturation). The results show that ESOD can reduce latency and energy consumption by an average of 22.13% and 29.60% and improve the mAP to 45.8(with 0.9 mAP better), respectively, compared with the SOTA DETR model.