Yongtian Wang

h-index: 21

8 papers

389 citations

Novelty: 44%

AI Score: 36

Ranked #121,481 of 201,326 authors (top 60%); #38,677 in CV (top 66%)

8 Papers

LG · Aug 15, 2022

Combining deep learning and crowdsourcing geo-images to predict housing quality in rural China

Weipan Xu, Yu Gu, Yifan Chen et al.

Housing quality is an essential proxy for regional wealth, security and health. Understanding the distribution of housing quality is crucial for unveiling rural development status and informing policy proposals. However, present rural housing quality data depends heavily on top-down, time-consuming surveys at the national or provincial level and fails to capture housing quality at the village level. To fill the gap between the need to accurately depict rural housing quality and the lack of data, we collect massive rural images and invite users to assess their housing quality at scale. Furthermore, a deep learning framework is proposed to automatically and efficiently predict housing quality based on crowdsourced rural images.

CV · Aug 24, 2025

Robust Point Cloud Registration via Geometric Overlapping Guided Rotation Search

Zhao Zheng, Jingfan Fan, Long Shao et al.

Point cloud registration based on correspondences computes the rigid transformation that maximizes the number of inliers constrained within the noise threshold. Current state-of-the-art (SOTA) methods employing spatial compatibility graphs or branch-and-bound (BnB) search mainly focus on registration under high outlier ratios. However, graph-based methods require at least quadratic space and time complexity for graph construction, while multi-stage BnB search methods often suffer from inaccuracy due to local optima between decomposed stages. This paper proposes a geometric maximum overlapping registration framework via rotation-only BnB search. The rigid transformation is decomposed using Chasles' theorem into a translation along the rotation axis and a 2D rigid transformation. The optimal rotation axis and angle are searched via BnB, with residual parameters formulated as range maximum query (RMQ) problems. Firstly, the top-k candidate rotation axes are searched within a hemisphere parameterized by cube mapping, and the translation along each axis is estimated through interval stabbing of the correspondences projected onto that axis. Secondly, the 2D registration is relaxed to a 1D rotation angle search with 2D RMQ of geometric overlapping for axis-aligned rectangles, which is solved deterministically in polynomial time using a sweep line algorithm with a segment tree. Experimental results on the 3DMatch, 3DLoMatch, and KITTI datasets demonstrate superior accuracy and efficiency over SOTA methods, while the time complexity is polynomial and the space complexity increases linearly with the number of points, even in the worst case.
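The interval stabbing step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain-tuple point format, and the noise threshold `eps` are assumptions made for illustration. The sketch relies on the fact that a rotation about an axis leaves each point's projection onto that axis unchanged, so every correspondence (p, q) votes for an axial translation near a·q − a·p, and the best translation is the value covered by the most vote intervals.

```python
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def dot(u: Point, v: Point) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def axis_translation_intervals(
    axis: Point, src: List[Point], dst: List[Point], eps: float
) -> List[Tuple[float, float]]:
    """One interval per correspondence: axial offset plus/minus eps.

    Assumes `axis` is a unit vector and src[i] corresponds to dst[i].
    """
    intervals = []
    for p, q in zip(src, dst):
        d = dot(axis, q) - dot(axis, p)  # rotation about `axis` cancels out
        intervals.append((d - eps, d + eps))
    return intervals


def max_stabbing(
    intervals: List[Tuple[float, float]]
) -> Tuple[Optional[float], int]:
    """Return a point covered by the most closed intervals, and that count."""
    events = []
    for lo, hi in intervals:
        events.append((lo, 0))  # 0 = open; sorts before a close at the same x
        events.append((hi, 1))  # 1 = close
    events.sort()
    best_point, best_count, count = None, 0, 0
    for x, kind in events:
        if kind == 0:
            count += 1
            if count > best_count:
                best_point, best_count = x, count
        else:
            count -= 1
    return best_point, best_count


# Three inliers rotated 90 degrees about z and shifted 0.5 along z, one outlier.
src = [(1, 0, 0), (0, 1, 0), (1, 1, 1), (2, 0, 3)]
dst = [(0, 1, 0.5), (-1, 0, 0.5), (-1, 1, 1.5), (5, 5, 9)]
point, count = max_stabbing(axis_translation_intervals((0, 0, 1), src, dst, 0.1))
```

Sorting the endpoints dominates the cost, so this sketch runs in O(n log n) time and O(n) space for n correspondences.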

OPTICS · Oct 7, 2021

Optical secret sharing with cascaded metasurface holography

Philip Georgi, Qunshuo Wei, Basudeb Sain et al.

Secret sharing is a well-established cryptographic primitive for storing highly sensitive information like encryption keys for encoded data. It describes the problem of splitting a secret into different shares, without revealing any information about the secret to its shareholders. Here, we demonstrate an all-optical solution for secret sharing based on metasurface holography. In our concept, metasurface holograms are used as spatially separable shares that carry an encrypted message in the form of a holographic image. Two of these shares can be recombined by bringing them close together. Light passing through this stack of metasurfaces accumulates the phase shift of both holograms and can optically reconstruct the secret with high fidelity. On the other hand, the holograms generated by the single metasurfaces can be used for identifying each shareholder. Furthermore, we demonstrate that the inherent translational alignment sensitivity between the two stacked metasurface holograms can be used for spatial multiplexing, which can be further extended to realize optical rulers.
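For readers unfamiliar with the primitive, the following is a minimal sketch of classical two-share secret sharing in its digital form. It is an analogue only, not the optical scheme of the paper: here the shares combine by XOR, whereas the abstract describes shares that combine through accumulated phase shifts. The function names `split_secret` and `combine_shares` are hypothetical.

```python
import secrets
from typing import Tuple


def split_secret(secret: bytes) -> Tuple[bytes, bytes]:
    """Split `secret` into two shares; either share alone is uniformly random."""
    share_a = secrets.token_bytes(len(secret))  # random one-time pad
    share_b = bytes(s ^ a for s, a in zip(secret, share_a))
    return share_a, share_b


def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    """Recover the secret by XOR-ing the two shares byte by byte."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))


share_a, share_b = split_secret(b"encryption key")
recovered = combine_shares(share_a, share_b)
```

Each share on its own carries no information about the secret, which mirrors the requirement stated in the abstract that shareholders learn nothing individually.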

CV · Apr 6, 2019

Deep Surface Normal Estimation with Hierarchical RGB-D Fusion

Jin Zeng, Yanfeng Tong, Yunmu Huang et al.

The growing availability of commodity RGB-D cameras has boosted applications in the field of scene understanding. However, surface normal estimation from RGB-D data, a fundamental scene understanding task, lacks thorough investigation. In this paper, a hierarchical fusion network with adaptive feature re-weighting is proposed for surface normal estimation from a single RGB-D image. Specifically, the features from the color image and the depth map are successively integrated at multiple scales to ensure global surface smoothness while preserving visually salient details. Meanwhile, the depth features are re-weighted with a confidence map estimated from depth before merging into the color branch to avoid artifacts caused by input depth corruption. Additionally, a hybrid multi-scale loss function is designed to learn accurate normal estimation given a noisy ground-truth dataset. Extensive experimental results validate the effectiveness of the fusion strategy and the loss design, outperforming state-of-the-art normal estimation schemes.

HC · Mar 7, 2019

Symmetrical Reality: Toward a Unified Framework for Physical and Virtual Reality

Zhenliang Zhang, Cong Wang, Dongdong Weng et al.

In this paper, we review the background of physical reality, virtual reality, and some traditional mixed forms of them. Based on the current knowledge, we propose a new unified concept called symmetrical reality to describe the physical and virtual world in a unified perspective. Under the framework of symmetrical reality, the traditional virtual reality, augmented reality, inverse virtual reality, and inverse augmented reality can be interpreted using a unified presentation. We analyze the characteristics of symmetrical reality from two different observation locations (i.e., from the physical world and from the virtual world), where all other forms of physical and virtual reality can be treated as special cases of symmetrical reality.

CV · Feb 17, 2019

Exploring Stereovision-Based 3-D Scene Reconstruction for Augmented Reality

Guang-Yu Nie, Yun Liu, Cong Wang et al.

Three-dimensional (3-D) scene reconstruction is one of the key techniques in Augmented Reality (AR), which is related to the integration of image processing and display systems of complex information. Stereo matching is a computer vision based approach for 3-D scene reconstruction. In this paper, we explore an improved stereo matching network, SLED-Net, in which a Single Long Encoder-Decoder is proposed to replace the stacked hourglass network in PSM-Net for better contextual information learning. We compare SLED-Net to state-of-the-art methods recently published, and demonstrate its superior performance on Scene Flow and KITTI2015 test sets.
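As background on how a stereo matching output becomes 3-D geometry, the sketch below applies the standard rectified-stereo relation depth = focal length × baseline / disparity. It is independent of SLED-Net itself, and the function name and the example camera parameters are assumptions chosen for illustration.

```python
def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Convert a disparity (pixels) to depth (meters) for a rectified stereo pair.

    Non-positive disparity is treated as a point at infinity.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera: 1000 px focal length, 0.5 m baseline.
depth_m = disparity_to_depth(50.0, 1000.0, 0.5)
```

Because depth is inversely proportional to disparity, a fixed disparity error costs more depth accuracy for distant points than for near ones.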

HC · Aug 10, 2018

Inverse Augmented Reality: A Virtual Agent's Perspective

Zhenliang Zhang, Dongdong Weng, Haiyan Jiang et al.

We propose a framework called inverse augmented reality (IAR), which describes the scenario in which a virtual agent living in the virtual world can observe both virtual objects and real objects. This is different from traditional augmented reality. Traditional virtual reality, mixed reality and augmented reality are all generated for humans, i.e., they are human-centered frameworks. In contrast, the proposed inverse augmented reality is a virtual agent-centered framework, which represents and analyzes reality from a virtual agent's perspective. In this paper, we elaborate the framework of inverse augmented reality to argue the equivalence of the virtual world and the physical world regarding the whole physical structure.

CV · May 25, 2018

Greedy Graph Searching for Vascular Tracking in Angiographic Image Sequences

Huihui Fang, Jian Yang, Jianjun Zhu et al.

Vascular tracking of angiographic image sequences is one of the most clinically important tasks in the diagnostic assessment and interventional guidance of cardiac disease. However, this task can be challenging to accomplish because of unsatisfactory angiography image quality and complex vascular structures. Thus, this study proposed a new greedy graph search-based method for vascular tracking. Each vascular branch is separated from the vasculature and is tracked independently. Then, all branches are combined using topology optimization, thereby resulting in complete vasculature tracking. A gray-based image registration method was applied to determine the tracking range, and the deformation field between two consecutive frames was calculated. The vascular branch was described using a vascular centerline extraction method with multi-probability fusion-based topology optimization. We introduce an undirected acyclic graph establishment technique. A greedy search method was proposed to acquire all possible paths in the graph that might match the tracked vascular branch. The final tracking result was selected by branch matching using dynamic time warping with a DAISY descriptor. The solution to the problem reflected both the spatial and textural information between successive frames. Experimental results demonstrated that the proposed method was effective and robust for vascular tracking, attaining an F1 score of 0.89 on a single branch dataset and 0.88 on a vessel tree dataset. This approach provided a universal solution to address the problem of filamentary structure tracking.
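The branch-matching step relies on dynamic time warping (DTW). The sketch below shows textbook DTW and how it could rank candidate paths against a reference branch. It is a simplification under stated assumptions: each point along a path is represented by a single scalar feature rather than a DAISY descriptor, and the names `dtw_distance` and `best_candidate` are hypothetical rather than taken from the paper.

```python
from typing import Callable, List, Sequence


def dtw_distance(
    a: Sequence[float],
    b: Sequence[float],
    dist: Callable[[float, float], float] = lambda x, y: abs(x - y),
) -> float:
    """Textbook DTW cost between two sequences (O(len(a) * len(b)) time)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def best_candidate(reference: Sequence[float], candidates: List[Sequence[float]]) -> int:
    """Index of the candidate path with the lowest DTW cost to the reference."""
    return min(range(len(candidates)), key=lambda k: dtw_distance(reference, candidates[k]))


reference = [1.0, 2.0, 3.0]
candidates = [[0.0, 0.0, 0.0], [1.0, 2.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
chosen = best_candidate(reference, candidates)
```

DTW tolerates paths of different lengths, which suits branches that stretch or shorten between consecutive frames.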