Zhiling Guo

CV
5papers
65citations
Novelty19%
AI Score18

5 Papers

CVJul 9, 2023
Enhancing Building Semantic Segmentation Accuracy with Super Resolution and Deep Learning: Investigating the Impact of Spatial Resolution on Various Datasets

Zhiling Guo, Xiaodan Shi, Haoran Zhang et al.

The development of remote sensing and deep learning techniques has enabled building semantic segmentation with high accuracy and efficiency. Despite their success in different tasks, the discussions on the impact of spatial resolution on deep learning based building semantic segmentation are quite inadequate, which makes choosing a higher cost-effective data source a big challenge. To address the issue mentioned above, in this study, we create remote sensing images among three study areas into multiple spatial resolutions by super-resolution and down-sampling. After that, two representative deep learning architectures: UNet and FPN, are selected for model training and testing. The experimental results obtained from three cities with two deep learning models indicate that the spatial resolution greatly influences building segmentation results, and with a better cost-effectiveness around 0.3m, which we believe will be an important insight for data selection and preparation.

CVJun 24, 2023
Real-World Video for Zoom Enhancement based on Spatio-Temporal Coupling

Zhiling Guo, Yinqiang Zheng, Haoran Zhang et al.

In recent years, single-frame image super-resolution (SR) has become more realistic by considering the zooming effect and using real-world short- and long-focus image pairs. In this paper, we further investigate the feasibility of applying realistic multi-frame clips to enhance zoom quality via spatio-temporal information coupling. Specifically, we first built a real-world video benchmark, VideoRAW, by a synchronized co-axis optical system. The dataset contains paired short-focus raw and long-focus sRGB videos of different dynamic scenes. Based on VideoRAW, we then presented a Spatio-Temporal Coupling Loss, termed as STCL. The proposed STCL is intended for better utilization of information from paired and adjacent frames to align and fuse features both temporally and spatially at the feature level. The outperformed experimental results obtained in different zoom scenarios demonstrate the superiority of integrating real-world video dataset and STCL into existing SR models for zoom quality enhancement, and reveal that the proposed method can serve as an advanced and viable tool for video zoom.

LGSep 28, 2018
Semantic Segmentation for Urban Planning Maps based on U-Net

Zhiling Guo, Hiroaki Shengoku, Guangming Wu et al.

The automatic digitizing of paper maps is a significant and challenging task for both academia and industry. As an important procedure of map digitizing, the semantic segmentation section mainly relies on manual visual interpretation with low efficiency. In this study, we select urban planning maps as a representative sample and investigate the feasibility of utilizing U-shape fully convolutional based architecture to perform end-to-end map semantic segmentation. The experimental results obtained from the test area in Shibuya district, Tokyo, demonstrate that our proposed method could achieve a very high Jaccard similarity coefficient of 93.63% and an overall accuracy of 99.36%. For implementation on GPGPU and cuDNN, the required processing time for the whole Shibuya district can be less than three minutes. The results indicate the proposed method can serve as a viable tool for urban planning map semantic segmentation task with high accuracy and efficiency.

CVSep 10, 2018
Geoseg: A Computer Vision Package for Automatic Building Segmentation and Outline Extraction

Guangming Wu, Zhiling Guo

Recently, deep learning algorithms, especially fully convolutional network based methods, are becoming very popular in the field of remote sensing. However, these methods are implemented and evaluated through various datasets and deep learning frameworks. There has not been a package that covers these methods in a unifying manner. In this study, we introduce a computer vision package termed Geoseg that focus on building segmentation and outline extraction. Geoseg implements over nine state-of-the-art models as well as utility scripts needed to conduct model training, logging, evaluating and visualization. The implementation of Geoseg emphasizes unification, simplicity, and flexibility. The performance and computational efficiency of all implemented methods are evaluated by comparison experiment through a unified, high-quality aerial image dataset.