Junyan Yang

CV
h-index: 1
6 papers
103 citations
Novelty: 48%
AI Score: 41

6 Papers

CV · Jul 22, 2024
Enhancement of 3D Gaussian Splatting using Raw Mesh for Photorealistic Recreation of Architectures

Ruizhe Wang, Chunliang Hua, Tomakayev Shingys et al.

Photorealistic reconstruction and rendering of architectural scenes has extensive applications in industries such as film, games, and transportation. It also plays an important role in urban planning, architectural design, and city promotion, especially in protecting historical and cultural relics. 3D Gaussian Splatting, which outperforms NeRF, has become a mainstream technology in 3D reconstruction. Its only input is a set of images, but it relies heavily on geometric parameters computed by the structure-from-motion (SfM) process. At the same time, there is an abundance of raw 3D models that could inform the structural perception of certain buildings but cannot currently be applied. In this paper, we propose a straightforward method that harnesses these raw 3D models to guide 3D Gaussians in capturing the basic shape of the building and to improve the visual quality of textures and details when photos are captured non-systematically. This exploration opens up new possibilities for improving the effectiveness of 3D reconstruction techniques in the field of architectural design.
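
The abstract does not say how the raw mesh guides the Gaussians. As a minimal sketch of one plausible ingredient, the code below samples points uniformly over a triangle mesh surface (area-weighted), which could serve as initial Gaussian centers in place of or alongside SfM points. The function names and the idea of using the samples for initialization are assumptions for illustration, not the authors' method.

```python
import random

def triangle_area(a, b, c):
    # Half the magnitude of the cross product of two edge vectors.
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * (cx * cx + cy * cy + cz * cz) ** 0.5

def sample_mesh_points(vertices, faces, n, seed=0):
    """Return n points drawn uniformly from the mesh surface (hypothetical helper)."""
    rng = random.Random(seed)
    tris = [(vertices[i], vertices[j], vertices[k]) for i, j, k in faces]
    areas = [triangle_area(*t) for t in tris]
    points = []
    # Pick triangles proportionally to area, then a uniform point inside each.
    for a, b, c in rng.choices(tris, weights=areas, k=n):
        r1, r2 = rng.random(), rng.random()
        s = r1 ** 0.5  # square root keeps the barycentric sample uniform
        w0, w1, w2 = 1 - s, s * (1 - r2), s * r2
        points.append(tuple(w0 * a[i] + w1 * b[i] + w2 * c[i] for i in range(3)))
    return points

# Usage: a unit square in the z = 0 plane, split into two triangles.
square_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
square_faces = [(0, 1, 2), (0, 2, 3)]
seeds = sample_mesh_points(square_vertices, square_faces, n=100)
```

Area weighting matters because coarse architectural meshes often mix very large wall triangles with small detail triangles; sampling per face instead would over-populate the small ones.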

19.7 · CY · Apr 10
Diagnosing Urban Street Vitality via a Visual-Semantic and Spatiotemporal Framework for Street-Level Economics

Xinxin Zhuo, Mengyuan Niu, Ruizhe Wang et al.

Micro-scale street-level economic assessment is fundamental for precision spatial resource allocation. While Street View Imagery (SVI) advances urban sensing, existing approaches remain semantically superficial and overlook brand hierarchy heterogeneity and structural recession. To address this, we propose a visual-semantic and field-based spatiotemporal framework, operationalized via the Street Economic Vitality Index (SEVI). Our approach integrates physical and semantic streetscape parsing through instance segmentation of signboards, glass interfaces, and storefront closures. A dual-stage VLM-LLM pipeline standardizes signage into global hierarchies to quantify a spatially smoothed brand premium index. To overcome static SVI limitations, we introduce a temporal lag design using Location-Based Services (LBS) data to capture realized demand. Combined with a category-weighted Gaussian spillover model, we construct a three-dimensional diagnostic system covering Commercial Activity, Spatial Utilization, and Physical Environment. Experiments based on time-lagged geographically weighted regression across eight tidal periods in Nanjing reveal quasi-causal spatiotemporal heterogeneity. Street vibrancy arises from interactions between hierarchical brand clustering and mall-induced externalities. High-quality interfaces show peak attraction during midday and evening, while structural recession produces a lagged nighttime repulsion effect. The framework offers evidence-based support for precision spatial governance.
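
The abstract names a category-weighted Gaussian spillover model without giving its formula. A minimal sketch under assumed definitions: each nearby point of interest contributes its category weight, attenuated by a Gaussian kernel of its distance to the target street segment. The bandwidth `sigma`, the weight table, and the function name are hypothetical.

```python
import math

def gaussian_spillover(target, sources, category_weights, sigma=200.0):
    """Sum category-weighted Gaussian influence at `target` (assumed form).

    target: (x, y) in metres; sources: iterable of (x, y, category).
    """
    total = 0.0
    for x, y, category in sources:
        d2 = (x - target[0]) ** 2 + (y - target[1]) ** 2
        total += category_weights[category] * math.exp(-d2 / (2 * sigma ** 2))
    return total

# Usage with made-up weights: a mall at the target and a kiosk 200 m away.
weights = {"mall": 2.0, "kiosk": 0.5}
pois = [(0.0, 0.0, "mall"), (200.0, 0.0, "kiosk")]
score = gaussian_spillover((0.0, 0.0), pois, weights)
```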

CV · Sep 28, 2025
Controllable Generation of Large-Scale 3D Urban Layouts with Semantic and Structural Guidance

Mengyuan Niu, Xinxin Zhuo, Ruizhe Wang et al.

Urban modeling is essential for city planning, scene synthesis, and gaming. Existing image-based methods generate diverse layouts but often lack geometric continuity and scalability, while graph-based methods capture structural relations yet overlook parcel semantics. We present a controllable framework for large-scale 3D vector urban layout generation, conditioned on both geometry and semantics. By fusing geometric and semantic attributes, introducing edge weights, and embedding building height in the graph, our method extends 2D layouts to realistic 3D structures. It also enables users to directly control the output by modifying semantic attributes. Experiments show that it produces valid, large-scale urban models, offering an effective tool for data-driven planning and design.
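
To illustrate how a height attribute can lift a 2D layout into 3D, the following sketch extrudes a building footprint polygon into a prism. It assumes each graph node carries a footprint and a height; the function and data layout are illustrative and not taken from the paper.

```python
def extrude_footprint(footprint, height):
    """Extrude a 2D polygon (list of (x, y)) into prism vertices and wall quads."""
    if height <= 0:
        raise ValueError("height must be positive")
    n = len(footprint)
    base = [(x, y, 0.0) for x, y in footprint]
    roof = [(x, y, float(height)) for x, y in footprint]
    # Each wall is a quad joining one base edge to the matching roof edge.
    walls = [(i, (i + 1) % n, n + (i + 1) % n, n + i) for i in range(n)]
    return base + roof, walls

# Usage: a 4 m x 3 m rectangular footprint with a 10 m height attribute.
vertices, walls = extrude_footprint([(0, 0), (4, 0), (4, 3), (0, 3)], 10)
```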

CV · Apr 11, 2025
GeoTexBuild: 3D Building Model Generation from Map Footprints

Ruizhe Wang, Junyan Yang, Qiao Wang

We introduce GeoTexBuild, a modular generative framework for creating 3D building models from footprints derived from site planning or map designs. The system is designed for architects and city planners, offering a seamless solution that directly converts map features into 3D buildings. The proposed framework employs a three-stage process comprising height map generation, geometry reconstruction, and appearance stylization, culminating in building models with detailed geometry and appearance attributes. By integrating a customized ControlNet, a neural style field (NSF), and a multi-view diffusion model, we explore effective methods for controlling both geometric and visual attributes during the generation process. Our approach eliminates the structural variations that existing 3D building generation techniques exhibit when working from a single facade image. Experimental results at each stage validate the capability of GeoTexBuild to generate detailed and accurate building models from footprints.

SOC-PH · May 4, 2020
Learning Geo-Contextual Embeddings for Commuting Flow Prediction

Zhicheng Liu, Fabio Miranda, Weiting Xiong et al.

Predicting commuting flows based on infrastructure and land-use information is critical for urban planning and public policy development. However, it is a challenging task given the complex patterns of commuting flows. Conventional models, such as the gravity model, are mainly derived from physics principles and are limited in their predictive power in real-world scenarios where many factors need to be considered. Meanwhile, most existing machine learning-based methods ignore spatial correlations and fail to model the influence of nearby regions. To address these issues, we propose the Geo-contextual Multitask Embedding Learner (GMEL), a model that captures spatial correlations from geographic contextual information for commuting flow prediction. Specifically, we first construct a geo-adjacency network containing the geographic contextual information. Then, an attention mechanism based on the graph attention network (GAT) framework is proposed to capture the spatial correlations and encode geographic contextual information into an embedding space. Two separate GATs are used to model supply and demand characteristics. A multitask learning framework is used to introduce stronger restrictions and enhance the effectiveness of the embedding representation. Finally, a gradient boosting machine is trained on the learned embeddings to predict commuting flows. We evaluate our model using real-world datasets from New York City, and the experimental results demonstrate the effectiveness of our proposal against the state of the art.
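
For context, the conventional gravity model that the abstract contrasts with GMEL can be written in a few lines. This is a sketch of the textbook form, flow = k * O^alpha * D^beta / d^gamma; the parameter names and default exponents are assumptions and would normally be fitted to data.

```python
def gravity_flow(origin_mass, dest_mass, distance,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Estimate flow between two regions with a textbook gravity model.

    origin_mass / dest_mass: e.g. residents and jobs (illustrative choice).
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * origin_mass ** alpha * dest_mass ** beta / distance ** gamma

# Usage: 100 residents, 200 jobs, regions 10 distance units apart.
flow = gravity_flow(100, 200, 10)
```

The model has only a handful of global parameters and no notion of neighbouring regions, which is the limitation the abstract points to.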

CV · May 30, 2019
Unsupervised Classification of Street Architectures Based on InfoGAN

Ning Wang, Xianhan Zeng, Renjie Xie et al.

Street architecture plays an essential role in city image and streetscape analysis. However, existing approaches are all supervised and require costly labeled data. To solve this, we propose an unsupervised classification framework for street architecture based on Information Maximizing Generative Adversarial Nets (InfoGAN), in which we use the auxiliary distribution $Q$ of InfoGAN as an unsupervised classifier. Experiments on a database of real street view images from Nanjing, China validate the practicality and accuracy of our framework. Furthermore, we draw a series of heuristic conclusions from the intrinsic information hidden in the real images. These conclusions will help planners better understand the architectural categories.
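
Using $Q$ as a classifier amounts to taking the most probable categorical latent code for an image. The sketch below assumes the $Q$ network has already produced one logit per category for a given image; the function names are hypothetical and the network itself is not shown.

```python
import math

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_with_q(q_logits):
    """Return (cluster index, probability) from Q's categorical logits."""
    probs = softmax(q_logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]

# Usage: logits for three assumed architectural categories.
label, confidence = classify_with_q([0.1, 2.0, -1.0])
```

Because training is unsupervised, the returned index identifies a cluster; mapping clusters to named architectural styles still requires inspection by a person.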