Fuqiang Gu

CV
h-index4
8papers
215citations
Novelty44%
AI Score39

8 Papers

CLJul 4, 2022
Location reference recognition from texts: A survey and comparison

Xuke Hu, Zhiyong Zhou, Hao Li et al.

A vast amount of location information exists in unstructured texts, such as social media posts, news stories, scientific articles, web pages, travel blogs, and historical archives. Geoparsing refers to the process of recognizing location references from texts and identifying their geospatial representations. While geoparsing can benefit many domains, a summary of the specific applications is still missing. Further, there lacks a comprehensive review and comparison of existing approaches for location reference recognition, which is the first and a core step of geoparsing. To fill these research gaps, this review first summarizes seven typical application domains of geoparsing: geographic information retrieval, disaster management, disease surveillance, traffic management, spatial humanities, tourism management, and crime management. We then review existing approaches for location reference recognition by categorizing these approaches into four groups based on their underlying functional principle: rule-based, gazetteer matching-based, statistical learning-based, and hybrid approaches. Next, we thoroughly evaluate the correctness and computational efficiency of the 27 most widely used approaches for location reference recognition based on 26 public datasets with different types of texts (e.g., social media posts and news stories) containing 39,736 location references across the world. Results from this thorough evaluation can help inform future methodological developments for location reference recognition, and can help guide the selection of proper approaches based on application needs.

CVDec 17, 2024Code
SLTNet: Efficient Event-based Semantic Segmentation with Spike-driven Lightweight Transformer-based Networks

Xianlei Long, Xiaxin Zhu, Fangming Guo et al.

Event-based semantic segmentation has great potential in autonomous driving and robotics due to the advantages of event cameras, such as high dynamic range, low latency, and low power cost. Unfortunately, current artificial neural network (ANN)-based segmentation methods suffer from high computational demands, the requirements for image frames, and massive energy consumption, limiting their efficiency and application on resource-constrained edge/mobile platforms. To address these problems, we introduce SLTNet, a spike-driven lightweight transformer-based network designed for event-based semantic segmentation. Specifically, SLTNet is built on efficient spike-driven convolution blocks (SCBs) to extract rich semantic features while reducing the model's parameters. Then, to enhance the long-range contextural feature interaction, we propose novel spike-driven transformer blocks (STBs) with binary mask operations. Based on these basic blocks, SLTNet employs a high-efficiency single-branch architecture while maintaining the low energy consumption of the Spiking Neural Network (SNN). Finally, extensive experiments on DDD17 and DSEC-Semantic datasets demonstrate that SLTNet outperforms state-of-the-art (SOTA) SNN-based methods by at most 9.06% and 9.39% mIoU, respectively, with extremely 4.58x lower energy consumption and 114 FPS inference speed. Our code is open-sourced and available at https://github.com/longxianlei/SLTNet-v1.0.

IVDec 10, 2021Code
Surrogate-based cross-correlation for particle image velocimetry

Yong Lee, Fuqiang Gu, Zeyu Gong et al.

This paper presents a novel surrogate-based cross-correlation (SBCC) framework to improve the correlation performance for practical particle image velocimetry~(PIV). The basic idea is that an optimized surrogate filter/image, replacing one raw image, will produce a more accurate and robust correlation signal. Specifically, the surrogate image is encouraged to generate perfect Gaussian-shaped correlation map to tracking particles (PIV image pair) while producing zero responses to image noise (context images). And the problem is formularized with an objective function composed of surrogate loss and consistency loss. As a result, the closed-form solution provides an efficient multivariate operator that could consider other negative context images. Compared with the state-of-the-art baseline methods (background subtraction, robust phase correlation, etc.), our SBCC method exhibits significant performance improvement (accuracy and robustness) on the synthetic dataset and several challenging experimental PIV cases. Besides, our implementation with experimental details (\url{https://github.com/yongleex/SBCC}) is also available for interested researchers.

CVDec 30, 2025
MambaSeg: Harnessing Mamba for Accurate and Efficient Image-Event Semantic Segmentation

Fuqiang Gu, Yuanke Li, Xianlei Long et al.

Semantic segmentation is a fundamental task in computer vision with wide-ranging applications, including autonomous driving and robotics. While RGB-based methods have achieved strong performance with CNNs and Transformers, their effectiveness degrades under fast motion, low-light, or high dynamic range conditions due to limitations of frame cameras. Event cameras offer complementary advantages such as high temporal resolution and low latency, yet lack color and texture, making them insufficient on their own. To address this, recent research has explored multimodal fusion of RGB and event data; however, many existing approaches are computationally expensive and focus primarily on spatial fusion, neglecting the temporal dynamics inherent in event streams. In this work, we propose MambaSeg, a novel dual-branch semantic segmentation framework that employs parallel Mamba encoders to efficiently model RGB images and event streams. To reduce cross-modal ambiguity, we introduce the Dual-Dimensional Interaction Module (DDIM), comprising a Cross-Spatial Interaction Module (CSIM) and a Cross-Temporal Interaction Module (CTIM), which jointly perform fine-grained fusion along both spatial and temporal dimensions. This design improves cross-modal alignment, reduces ambiguity, and leverages the complementary properties of each modality. Extensive experiments on the DDD17 and DSEC datasets demonstrate that MambaSeg achieves state-of-the-art segmentation performance while significantly reducing computational cost, showcasing its promise for efficient, scalable, and robust multimodal perception.

CVMay 7, 2024
A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching

Xianlei Long, Hui Zhao, Chao Chen et al.

In recent years, wide-area visual surveillance systems have been widely applied in various industrial and transportation scenarios. These systems, however, face significant challenges when implementing multi-object detection due to conflicts arising from the need for high-resolution imaging, efficient object searching, and accurate localization. To address these challenges, this paper presents a hybrid system that incorporates a wide-angle camera, a high-speed search camera, and a galvano-mirror. In this system, the wide-angle camera offers panoramic images as prior information, which helps the search camera capture detailed images of the targeted objects. This integrated approach enhances the overall efficiency and effectiveness of wide-area visual detection systems. Specifically, in this study, we introduce a wide-angle camera-based method to generate a panoramic probability map (PPM) for estimating high-probability regions of target object presence. Then, we propose a probability searching module that uses the PPM-generated prior information to dynamically adjust the sampling range and refine target coordinates based on uncertainty variance computed by the object detector. Finally, the integration of PPM and the probability searching module yields an efficient hybrid vision system capable of achieving 120 fps multi-object search and detection. Extensive experiments are conducted to verify the system's effectiveness and robustness.

LGJun 7, 2021
EventDrop: data augmentation for event-based learning

Fuqiang Gu, Weicong Sng, Xuke Hu et al.

The advantages of event-sensing over conventional sensors (e.g., higher dynamic range, lower time latency, and lower power consumption) have spurred research into machine learning for event data. Unsurprisingly, deep learning has emerged as a competitive methodology for learning with event sensors; in typical setups, discrete and asynchronous events are first converted into frame-like tensors on which standard deep networks can be applied. However, over-fitting remains a challenge, particularly since event datasets remain small relative to conventional datasets (e.g., ImageNet). In this paper, we introduce EventDrop, a new method for augmenting asynchronous event data to improve the generalization of deep models. By dropping events selected with various strategies, we are able to increase the diversity of training data (e.g., to simulate various levels of occlusion). From a practical perspective, EventDrop is simple to implement and computationally low-cost. Experiments on two event datasets (N-Caltech101 and N-Cars) demonstrate that EventDrop can significantly improve the generalization performance across a variety of deep networks.

NIAug 1, 2020
Fast and Reliable WiFi Fingerprint Collection for Indoor Localization

Fuqiang Gu, Milad Ramezani, Kourosh Khoshelham et al.

Fingerprinting is a popular indoor localization technique since it can utilize existing infrastructures (e.g., access points). However, its site survey process is a labor-intensive and time-consuming task, which limits the application of such systems in practice. In this paper, motivated by the availability of advanced sensing capabilities in smartphones, we propose a fast and reliable fingerprint collection method to reduce the time and labor required for site survey. The proposed method uses a landmark graph-based method to automatically associate the collected fingerprints, which does not require active user participation. We will show that besides fast fingerprint data collection, the proposed method results in accurate location estimate compared to the state-of-the-art methods. Experimental results show that the proposed method is an order of magnitude faster than the manual fingerprint collection method, and using the radio map generated by our method achieves a much better accuracy compared to the existing methods.

SPAug 1, 2020
TactileSGNet: A Spiking Graph Neural Network for Event-based Tactile Object Recognition

Fuqiang Gu, Weicong Sng, Tasbolat Taunyazov et al.

Tactile perception is crucial for a variety of robot tasks including grasping and in-hand manipulation. New advances in flexible, event-driven, electronic skins may soon endow robots with touch perception capabilities similar to humans. These electronic skins respond asynchronously to changes (e.g., in pressure, temperature), and can be laid out irregularly on the robot's body or end-effector. However, these unique features may render current deep learning approaches such as convolutional feature extractors unsuitable for tactile learning. In this paper, we propose a novel spiking graph neural network for event-based tactile object recognition. To make use of local connectivity of taxels, we present several methods for organizing the tactile data in a graph structure. Based on the constructed graphs, we develop a spiking graph convolutional network. The event-driven nature of spiking neural network makes it arguably more suitable for processing the event-based data. Experimental results on two tactile datasets show that the proposed method outperforms other state-of-the-art spiking methods, achieving high accuracies of approximately 90\% when classifying a variety of different household objects.