Weibing Zhao

CV
h-index5
6papers
148citations
Novelty51%
AI Score49

6 Papers

LGDec 24, 2025
Temporal Visual Semantics-Induced Human Motion Understanding with Large Language Models

Zheng Xing, Weibing Zhao

Unsupervised human motion segmentation (HMS) can be effectively achieved using subspace clustering techniques. However, traditional methods overlook the role of temporal semantic exploration in HMS. This paper explores the use of temporal vision semantics (TVS) derived from human motion sequences, leveraging the image-to-text capabilities of a large language model (LLM) to enhance subspace clustering performance. The core idea is to extract textual motion information from consecutive frames via LLM and incorporate this learned information into the subspace clustering framework. The primary challenge lies in learning TVS from human motion sequences using LLM and integrating this information into subspace clustering. To address this, we determine whether consecutive frames depict the same motion by querying the LLM and subsequently learn temporal neighboring information based on its response. We then develop a TVS-integrated subspace clustering approach, incorporating subspace embedding with a temporal regularizer that induces each frame to share similar subspace embeddings with its temporal neighbors. Additionally, segmentation is performed based on subspace embedding with a temporal constraint that induces the grouping of each frame with its temporal neighbors. We also introduce a feedback-enabled framework that continuously optimizes subspace embedding based on the segmentation output. Experimental results demonstrate that the proposed method outperforms existing state-of-the-art approaches on four benchmark human motion datasets.

80.6ITMay 11
Survey-Free Radio Map Construction via HMM-Based Coarse-to-Fine Inference

Zheng Xing, Weibing Zhao, Guanghui Zhang et al.

Traditional radio map construction methods mandate labor-intensive data collection and precise location labeling. To address these limitations, we propose a novel survey-free approach for radio map construction that relies solely on unlabeled Received Signal Strength (RSS) measurements, thereby obviating the need for manual site surveys or auxiliary Inertial Measurement Units (IMUs). The key idea involves embedding multiple unlabeled RSS sequences into a known indoor layout, specifically targeting corridor-guided environments with a dominant unidirectional pedestrian flow. However, aligning the embedded coordinates with the RSS collection locations remains challenging due to the random fluctuations inherent in RSS data. To tackle this, we introduce a Hidden Markov Model (HMM)- based Coarse-to-Fine Inference (HCFI) framework. At the coarse level, we employ an HMM-based region label inference algorithm to partition RSS sequences and align the RSS segments with specific physical regions using graph-based inference. At the fine level, we develop an HMM-based location label inference technique to estimate RSS collection coordinates by leveraging RSS propagation principles while incorporating sequential spatio-temporal mobility probability. Empirical results from an office environment demonstrate that the proposed method achieves a radio map construction Mean Absolute Error (MAE) of 8.96 dB. Furthermore, based on the estimated radio map, k-Nearest Neighbor (KNN) localization yields an average positioning error of approximately 3.33 meters, offering a highly viable, survey-free solution for radio map construction under sequential topological assumptions.

87.8ITMay 11
Annotation-Free Indoor Radio Mapping via Physics-Informed Trajectory Inference

Zheng Xing, Mengru Wu, Yi Zhang et al.

Constructing indoor radio maps traditionally requires extensive site surveys with precise user-location labels, making the calibration process costly and time-consuming. Existing calibration-reduction methods either depend on partial location annotations or exploit inertial measurement units (IMUs) to provide relative motion cues; however, IMU-assisted solutions are constrained by hardware availability, device-level access restrictions, and accumulated sensor drift. In this paper, we study a location-label-free indoor radio mapping problem under known access-point deployment geometry and a known walkable spatial domain. We propose a physics-informed trajectory inference framework that uses only Channel State Information (CSI), without relying on user-location labels or IMU measurements. The key idea is to recover the latent spatial coordinates of CSI measurements by exploiting the local spatial continuity of multipath propagation. To this end, we construct a Power-Angle-Delay Profile (PADP) feature distance from MIMO-OFDM CSI and show that, within a local neighborhood and under quasi-static multipath conditions, this distance provides a physically meaningful proxy for small spatial displacements. We then incorporate the PADP-based continuity constraint into a spatially regularized Bayesian inference model for joint trajectory recovery and propagation-parameter estimation. Experiments on a real-world industrial CSI dataset demonstrate that the proposed framework achieves an average localization error of 0.88 m and a relative beam map construction error of 6.68%, improving upon representative channel-embedding and IMU-assisted baselines.

CVApr 30, 2021Code
PointLIE: Locally Invertible Embedding for Point Cloud Sampling and Recovery

Weibing Zhao, Xu Yan, Jiantao Gao et al.

Point Cloud Sampling and Recovery (PCSR) is critical for massive real-time point cloud collection and processing since raw data usually requires large storage and computation. In this paper, we address a fundamental problem in PCSR: How to downsample the dense point cloud with arbitrary scales while preserving the local topology of discarding points in a case-agnostic manner (i.e. without additional storage for point relationship)? We propose a novel Locally Invertible Embedding for point cloud adaptive sampling and recovery (PointLIE). Instead of learning to predict the underlying geometry details in a seemingly plausible manner, PointLIE unifies point cloud sampling and upsampling to one single framework through bi-directional learning. Specifically, PointLIE recursively samples and adjusts neighboring points on each scale. Then it encodes the neighboring offsets of sampled points to a latent space and thus decouples the sampled points and the corresponding local geometric relationship. Once the latent space is determined and that the deep model is optimized, the recovery process could be conducted by passing the recover-pleasing sampled points and a randomly-drawn embedding to the same network through an invertible operation. Such a scheme could guarantee the fidelity of dense point recovery from sampled points. Extensive experiments demonstrate that the proposed PointLIE outperforms state-of-the-arts both quantitatively and qualitatively. Our code is released through https://github.com/zwb0/PointLIE.

LGMar 31, 2024
Block-Diagonal Guided DBSCAN Clustering

Weibing Zhao

Cluster analysis plays a crucial role in database mining, and one of the most widely used algorithms in this field is DBSCAN. However, DBSCAN has several limitations, such as difficulty in handling high-dimensional large-scale data, sensitivity to input parameters, and lack of robustness in producing clustering results. This paper introduces an improved version of DBSCAN that leverages the block-diagonal property of the similarity graph to guide the clustering procedure of DBSCAN. The key idea is to construct a graph that measures the similarity between high-dimensional large-scale data points and has the potential to be transformed into a block-diagonal form through an unknown permutation, followed by a cluster-ordering procedure to generate the desired permutation. The clustering structure can be easily determined by identifying the diagonal blocks in the permuted graph. We propose a gradient descent-based method to solve the proposed problem. Additionally, we develop a DBSCAN-based points traversal algorithm that identifies clusters with high densities in the graph and generates an augmented ordering of clusters. The block-diagonal structure of the graph is then achieved through permutation based on the traversal order, providing a flexible foundation for both automatic and interactive cluster analysis. We introduce a split-and-refine algorithm to automatically search for all diagonal blocks in the permuted graph with theoretically optimal guarantees under specific cases. We extensively evaluate our proposed approach on twelve challenging real-world benchmark clustering datasets and demonstrate its superior performance compared to the state-of-the-art clustering method on every dataset.

CVAug 10, 2021
Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds

Chaoda Zheng, Xu Yan, Jiantao Gao et al.

Current 3D single object tracking approaches track the target based on a feature comparison between the target template and the search area. However, due to the common occlusion in LiDAR scans, it is non-trivial to conduct accurate feature comparisons on severe sparse and incomplete shapes. In this work, we exploit the ground truth bounding box given in the first frame as a strong cue to enhance the feature description of the target object, enabling a more accurate feature comparison in a simple yet effective way. In particular, we first propose the BoxCloud, an informative and robust representation, to depict an object using the point-to-box relation. We further design an efficient box-aware feature fusion module, which leverages the aforementioned BoxCloud for reliable feature matching and embedding. Integrating the proposed general components into an existing model P2B, we construct a superior box-aware tracker (BAT). Experiments confirm that our proposed BAT outperforms the previous state-of-the-art by a large margin on both KITTI and NuScenes benchmarks, achieving a 15.2% improvement in terms of precision while running ~20% faster.