Shuang Gao

CV
14 papers
72 citations
Novelty 40%
AI Score 47

14 Papers

CV · Oct 18, 2022 · Code
A Real-Time Fusion Framework for Long-term Visual Localization

Yuchen Yang, Xudong Zhang, Shuang Gao et al.

Visual localization is a fundamental task that regresses 6 Degree-of-Freedom (6DoF) poses from image features to serve the high-precision localization requests of many robotics applications. Degenerate conditions such as motion blur, illumination changes, and environment variations pose great challenges to this task. Fusion with additional information, such as sequential information and Inertial Measurement Unit (IMU) inputs, can greatly help in these conditions. In this paper, we present an efficient client-server visual localization architecture that fuses global and local pose estimations to achieve promising precision and efficiency. We include additional geometry hints in the mapping and global pose regression modules to improve the measurement quality. A loosely coupled fusion policy is adopted to balance computational complexity and accuracy. We conduct evaluations on two typical open-source benchmarks, 4Seasons and OpenLORIS. Quantitative results show that our framework is competitive with other state-of-the-art visual localization solutions.

LG · Aug 7, 2022
Transmission Neural Networks: From Virus Spread Models to Neural Networks

Shuang Gao, Peter E. Caines

This work connects models for virus spread on networks with their equivalent neural network representations. Based on this connection, we propose a new neural network architecture, called Transmission Neural Networks (TransNNs), where activation functions are primarily associated with links and are allowed to have different activation levels. Furthermore, this connection leads to the discovery and derivation of three new activation functions with tunable or trainable parameters. Moreover, we prove that TransNNs with a single hidden layer and a fixed non-zero bias term are universal function approximators. Finally, we present new fundamental derivations of continuous-time epidemic network models based on TransNNs.
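The link between virus spread and layered networks can be illustrated with a minimal Python sketch. It assumes the standard independent-transmission update p_i(k+1) = 1 - ∏_j (1 - w_ij p_j(k)), which the abstract does not spell out; the function names `transnn_layer` and `transnn_layer_log` and the example weights are hypothetical.

```python
import math

def transnn_layer(weights, p):
    """One propagation step (assumed update rule, not quoted from the paper).

    weights[i][j]: transmission probability on the link j -> i.
    p[j]: infection (activation) probability of node j at the current step.
    """
    out = []
    for row in weights:
        # Probability that node i escapes transmission from every neighbour.
        escape = 1.0
        for w_ij, p_j in zip(row, p):
            escape *= 1.0 - w_ij * p_j
        out.append(1.0 - escape)
    return out

def transnn_layer_log(weights, p):
    """Same step written as a sum: each link contributes log(1 - w * p),
    so the nonlinearity sits on the links rather than on the nodes."""
    out = []
    for row in weights:
        s = sum(math.log(1.0 - w * q) for w, q in zip(row, p))
        out.append(1.0 - math.exp(s))
    return out

weights = [[0.5, 0.2], [0.0, 0.9]]  # hypothetical link weights
p0 = [0.4, 0.1]                     # hypothetical initial probabilities
p1 = transnn_layer(weights, p0)
p1_log = transnn_layer_log(weights, p0)
```

The two functions compute the same quantity; the second form makes it easier to see why the abstract describes activation functions as being associated with links.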

SY · Jan 4, 2018
Improving the Closed-Loop Tracking Performance Using the First-Order Hold Sensing Technique with Experiments

Chifu Yang, Shuang Gao, Zhu Xue

This paper proposes a new perspective in the enhancement of the closed-loop tracking performance by using the first-order hold (FOH) sensing technique. Firstly, the literature review and fundamentals of the FOH are outlined. Secondly, the performance of the most commonly used zero-order hold (ZOH) and that of the FOH are compared. Lastly, the detailed implementation of the FOH on a pendulum tracking setup is presented to verify the superiority of the FOH over the ZOH in terms of the steady state tracking error. The results of the simulation and the experiment are in agreement.
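The difference between the two holds can be sketched in a few lines of Python. This is only an illustration on a sampled sine wave, assuming a predictive FOH that extrapolates linearly from the last two samples; the pendulum setup and controller from the paper are not reproduced, and `hold_errors` is a hypothetical helper.

```python
import math

def hold_errors(signal, period, horizon, substeps=20):
    """Worst-case reconstruction error of ZOH and predictive FOH.

    signal: function of time being sampled every `period` seconds.
    Returns (max ZOH error, max FOH error) over the horizon.
    """
    zoh_err = foh_err = 0.0
    n = int(round(horizon / period))
    for k in range(1, n):
        x_prev = signal((k - 1) * period)
        x_k = signal(k * period)
        slope = (x_k - x_prev) / period  # slope estimated from the last two samples
        for s in range(substeps + 1):
            tau = s * period / substeps  # time elapsed since the latest sample
            truth = signal(k * period + tau)
            zoh_err = max(zoh_err, abs(truth - x_k))                  # hold the sample
            foh_err = max(foh_err, abs(truth - (x_k + slope * tau)))  # extrapolate
    return zoh_err, foh_err

zoh, foh = hold_errors(math.sin, period=0.05, horizon=10.0)
```

For a smooth signal the ZOH error scales roughly with the sampling period, while the FOH error scales with its square, which is consistent with the smaller tracking error reported for the FOH.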

CV · Apr 12, 2023
SGL: Structure Guidance Learning for Camera Localization

Xudong Zhang, Shuang Gao, Xiaohu Nan et al.

Camera localization is a classical computer vision task that serves various Artificial Intelligence and Robotics applications. With the rapid development of Deep Neural Networks (DNNs), end-to-end visual localization methods have flourished in recent years. In this work, we focus on scene coordinate prediction methods and propose a network architecture named Structure Guidance Learning (SGL), which uses a receptive branch and a structure branch to extract both high-level and low-level features for estimating 3D coordinates. We design a confidence strategy to refine and filter the predicted 3D observations, which enables us to estimate camera poses by employing Perspective-n-Point (PnP) with RANSAC. For training, we design a Bundle Adjustment trainer to help the network fit the scenes better. Comparisons with state-of-the-art (SOTA) methods and extensive ablation experiments confirm the validity of the proposed architecture.

CV · Feb 9, 2023
IH-ViT: Vision Transformer-based Integrated Circuit Appearance Defect Detection

Xiaoibin Wang, Shuang Gao, Yuntao Zou et al.

To address the low recognition rate and slow recognition speed of traditional methods in IC appearance defect detection, we propose an IC appearance defect detection algorithm, IH-ViT. The proposed model takes advantage of the respective strengths of CNN and ViT to acquire image features from both local and global aspects, and finally fuses the two features for decision making to determine the class of defects, thus obtaining better accuracy of IC defect recognition. Because IC appearance defects are mainly reflected in differences in details, which are difficult for traditional algorithms to identify, we improve the traditional ViT by performing an additional convolution operation inside the batch. To handle the information imbalance of samples caused by diverse data set sources, we adopt a dual-channel image segmentation technique to further improve the accuracy of IC appearance defect detection. In testing, the proposed hybrid IH-ViT model achieved 72.51% accuracy, which is 2.8% and 6.06% higher than the ResNet50 and ViT models alone, respectively. The proposed algorithm can quickly and accurately detect the defect status of IC appearance and effectively improve the productivity of IC packaging and testing companies.

52.1 · OC · Apr 13
A Decomposition Method for LQ Conditional McKean-Vlasov Control Problems with Random Coefficients

Onésime Hounkpe, Dena Firoozi, Shuang Gao

We propose a decomposition method for solving a general class of linear-quadratic (LQ) McKean-Vlasov control problems involving conditional expectations and random coefficients, where the system dynamics are driven by two independent Wiener processes. Unlike existing approaches in the literature for these problems, such as the extended stochastic maximum principle and the extended dynamic programming methods, which often involve additional technical complexities and sometimes impose restrictive conditions on control inputs, our approach decomposes the original McKean-Vlasov control problem into two decoupled stochastic optimal control problems, one of which has a constrained admissible control set. These auxiliary problems can be solved using classical methods. We establish an equivalence between the well-posedness and solvability of the auxiliary problems and those of the original problem, and show that the sum of the optimal controls of the auxiliary problems yields the optimal control of the original problem. Moreover, by applying a variational method, we characterize the optimal solution to the McKean-Vlasov control problem via two decoupled sets of (non-McKean-Vlasov) linear forward-backward stochastic differential equations, each corresponding to one of the auxiliary problems. Finally, we show that standard dynamic programming can also be applied to solve the resulting auxiliary problems.

60.4 · SY · Apr 7
Price-Coordinated Mean Field Games with State Augmentation for Decentralized Battery Charging

Nour Al Dandachly, Shuang Gao, Roland Malhamé

This paper addresses the decentralized coordinated charging problem for a large population of battery storage agents (e.g., residential batteries, electric vehicles, charging station batteries) using a Mean Field Game (MFG) formulation. Agents are assumed to have affine dynamics and are coupled through a price that is continuous and monotonically increasing with respect to the difference between the average charging power and the grid's desired average charging power. An important modeling feature of the proposed framework is state augmentation: the charging power is treated as a state variable and its rate of change (i.e., the ramp rate) as the control input. The resulting MFG equilibrium is characterized by two nonlinearly coupled forward-backward differential equations. The existence and uniqueness of the MFG equilibrium are established for any continuous and monotonically increasing nonlinear price function without additional restrictions on the time horizon. Moreover, in the special case where the price is affine in the average charging power, we further simplify the characterization of the MFG equilibrium strategy via two separate Riccati equations, both of which admit unique positive semi-definite solutions without additional assumptions.

27.9 · SI · Apr 5
Transmission Neural Networks: Inhibitory and Excitatory Connections

Shuang Gao, Peter E. Caines

This paper extends the Transmission Neural Network model proposed by Gao and Caines in [1]-[3] to incorporate inhibitory connections and neurotransmitter populations. The extended network model contains binary neuronal states, transmission dynamics, and inhibitory and excitatory connections. Under technical assumptions, we characterize the firing probabilities of neurons and show that this characterization, which accounts for inhibition, can be equivalently represented by a neural network where each neuron has a continuous state of dimension 2. Moreover, we incorporate neurotransmitter populations into the model and establish the limit network model when the number of neurotransmitters at all synaptic connections goes to infinity. Finally, sufficient conditions for stability and contraction properties of the limit network model are established.

2.3 · SY · Mar 14
Discrete-time linear quadratic stochastic control with equality-constrained inputs: Application to energy demand response

Leo Seugnet, Shuang Gao

We investigate the discrete-time stochastic linear quadratic control problem for a population of cooperative agents under a hard equality constraint on the total control input, motivated by demand response in renewable energy systems. We establish the optimal solution that respects the hard equality constraint for systems with additive noise in the dynamics. The optimal control law is derived using dynamic programming and Karush-Kuhn-Tucker (KKT) conditions, and the resulting control solution depends on a discrete-time Riccati-like recursive equation. We demonstrate application examples that coordinate the charging of a network of residential batteries to absorb excess solar power generation, and the proposed control is shown to achieve exact power tracking while accounting for individual State-of-Charge (SoC) objectives.
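How a hard equality constraint on the total input enters through a KKT multiplier can be seen in a single-stage, noise-free toy problem. The sketch below is not the paper's dynamic programming recursion; the quadratic cost, the function `constrained_allocation`, and the numbers are assumptions chosen for illustration.

```python
def constrained_allocation(desired, costs, total):
    """Minimise sum_i 0.5 * costs[i] * (u_i - desired[i])**2
    subject to the hard constraint sum_i u_i == total.

    KKT stationarity: costs[i] * (u_i - desired[i]) + lam = 0,
    so u_i = desired[i] - lam / costs[i]; lam follows from the constraint.
    """
    inv = [1.0 / c for c in costs]
    lam = (sum(desired) - total) / sum(inv)
    u = [d - lam * w for d, w in zip(desired, inv)]
    return u, lam

# Hypothetical example: three batteries whose individually preferred
# charging powers sum to 6, while only 4 units of power must be absorbed.
desired = [1.0, 2.0, 3.0]
costs = [1.0, 2.0, 4.0]
u, lam = constrained_allocation(desired, costs, total=4.0)
```

Agents with a larger cost weight deviate less from their preferred input, and the inputs sum exactly to the required total, which mirrors the exact power tracking described above.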

CV · Oct 8, 2021
Pose Refinement with Joint Optimization of Visual Points and Lines

Shuang Gao, Jixiang Wan, Yishan Ping et al.

High-precision camera re-localization in a pre-established 3D environment map is the basis for many tasks, such as Augmented Reality, Robotics, and Autonomous Driving. Point-based visual re-localization approaches have been well developed in recent decades, but they are insufficient in some featureless cases. In this paper, we design a complete pipeline for camera pose refinement with points and lines, which contains a newly designed line-extracting CNN named VLSE, line matching, and pose optimization. We adopt a novel line representation and customize a hybrid convolution block based on the Stacked Hourglass network to detect accurate and stable line features in images. Then we apply a geometry-based strategy to obtain precise 2D-3D line correspondences using the epipolar constraint and reprojection filtering. A point-line joint cost function is then constructed to optimize the camera pose, starting from the initial coarse pose given by pure point-based localization. Extensive experiments are conducted on open datasets, i.e., the line extractor on Wireframe and YorkUrban and localization performance on InLoc duc1 and duc2, to confirm the effectiveness of our point-line joint pose optimization method.

CV · Aug 19, 2021
Retrieval and Localization with Observation Constraints

Yuhao Zhou, Huanhuan Fan, Shuang Gao et al.

Accurate visual re-localization is critical to many artificial intelligence applications, such as augmented reality, virtual reality, robotics, and autonomous driving. To accomplish this task, we propose an integrated visual re-localization method called RLOCS that combines image retrieval, semantic consistency, and geometry verification to achieve accurate estimations. The localization pipeline is designed as a coarse-to-fine paradigm. In the retrieval part, we cascade the architecture of ResNet101-GeM-ArcFace and employ DBSCAN followed by spatial verification to obtain a better initial coarse pose. We design a module called observation constraints, which combines geometry information and semantic consistency for filtering outliers. Comprehensive experiments are conducted on open datasets, including retrieval on R-Oxford5k and R-Paris6k, semantic segmentation on Cityscapes, and localization on Aachen Day-Night and InLoc. By modifying separate modules in the overall pipeline, our method achieves many performance improvements on the challenging localization benchmarks.

LG · Nov 3, 2020
Self-semi-supervised Learning to Learn from Noisy Labeled Data

Jiacheng Wang, Yue Ma, Shuang Gao

The remarkable success of today's deep neural networks depends heavily on a massive amount of correctly labeled data. However, it is rather costly to obtain high-quality human-labeled data, which has led to the active research area of training models that are robust to noisy labels. To achieve this goal, on the one hand, many papers have been dedicated to differentiating noisy labels from clean ones to improve the generalization of DNNs. On the other hand, the increasingly prevalent methods of self-semi-supervised learning have been shown to benefit tasks in which labels are incomplete. By 'semi' we mean that the data detected as wrongly labeled are treated as unlabeled data; by 'self' we mean that a self-supervised technique is used to conduct semi-supervised learning. In this project, we designed methods to more accurately differentiate clean and noisy labels and borrowed the wisdom of self-semi-supervised learning to train on noisy labeled data.

CV · May 25, 2020
Visual Localization Using Semantic Segmentation and Depth Prediction

Huanhuan Fan, Yuhao Zhou, Ang Li et al.

In this paper, we propose a monocular visual localization pipeline leveraging semantic and depth cues. We apply semantic consistency evaluation to rank the image retrieval results and a practical clustering technique to reject estimation outliers. In addition, we demonstrate a substantial performance boost achieved with a combination of multiple feature extractors. Furthermore, by using depth prediction with a deep neural network, we show that a significant number of falsely matched keypoints are identified and eliminated. The proposed pipeline outperforms most existing approaches on the Long-Term Visual Localization benchmark 2020.

CV · Sep 10, 2019
VACL: Variance-Aware Cross-Layer Regularization for Pruning Deep Residual Networks

Shuang Gao, Xin Liu, Lung-Sheng Chien et al.

Improving weight sparsity is a common strategy for producing lightweight deep neural networks. However, pruning models with residual learning is more challenging. In this paper, we introduce Variance-Aware Cross-Layer (VACL), a novel approach to address this problem. VACL consists of two parts: Cross-Layer grouping and Variance-Aware regularization. In Cross-Layer grouping, the $i^{th}$ filters of layers connected by skip-connections are grouped into one regularization group. Then, the Variance-Aware regularization term takes into account both the first- and second-order statistics of the connected layers to constrain the variance within a group. Our approach can effectively improve the structural sparsity of residual models. For CIFAR10, the proposed method reduces a ResNet model by up to 79.5% with no accuracy drop and reduces a ResNeXt model by up to 82% with less than 1% accuracy drop. For ImageNet, it yields a pruning ratio of up to 63.3% with less than 1% top-5 accuracy drop. Our experimental results show that the proposed approach significantly outperforms other state-of-the-art methods in terms of overall model size and accuracy.
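The grouping idea can be sketched in plain Python. The cross-layer grouping follows the description above; the penalty, however, is only one plausible reading of "first and second-order statistics" (a group-lasso norm plus the variance of per-layer filter norms) and is an assumption, not the regularizer defined in the paper. The names `cross_layer_groups`, `group_penalty`, and `var_weight` are hypothetical.

```python
import math

def cross_layer_groups(layers):
    """Group the i-th filter of every skip-connected layer together.

    layers[l][i] is the flattened i-th filter of layer l; all layers are
    assumed to have the same number of filters.
    """
    n_filters = len(layers[0])
    return [[layer[i] for layer in layers] for i in range(n_filters)]

def group_penalty(group, var_weight=1.0):
    """Assumed penalty: group-lasso norm plus variance of the filter norms."""
    norms = [math.sqrt(sum(w * w for w in f)) for f in group]
    group_norm = math.sqrt(sum(n * n for n in norms))  # shrinks the whole group together
    mean = sum(norms) / len(norms)
    variance = sum((n - mean) ** 2 for n in norms) / len(norms)  # spread within the group
    return group_norm + var_weight * variance

# Two hypothetical layers joined by a skip-connection, two filters each.
layers = [
    [[3.0, 4.0], [0.0, 0.0]],
    [[0.0, 5.0], [0.0, 0.1]],
]
groups = cross_layer_groups(layers)
penalties = [group_penalty(g) for g in groups]
```

Penalizing a whole group at once encourages the same filter index to become prunable in every connected layer, which is what keeps the residual addition shape-consistent after pruning.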