Ping Guo

LG
h-index 46
11 papers
73 citations
Novelty 45%
AI Score 30

11 Papers

4.6 LG Aug 3, 2022 Code
EgPDE-Net: Building Continuous Neural Networks for Time Series Prediction with Exogenous Variables

Penglei Gao, Xi Yang, Rui Zhang et al.

While exogenous variables have a major impact on performance improvement in time series analysis, inter-series correlation and time dependence among them are rarely considered in present continuous methods. The dynamical systems of multivariate time series could be modelled with complex unknown partial differential equations (PDEs), which play a prominent role in many disciplines of science and engineering. In this paper, we propose a continuous-time model for arbitrary-step prediction to learn an unknown PDE system in multivariate time series whose governing equations are parameterised by self-attention and gated recurrent neural networks. The proposed model, Exogenous-guided Partial Differential Equation Network (EgPDE-Net), takes account of the relationships among the exogenous variables and their effects on the target series. Importantly, the model can be reduced into a regularised ordinary differential equation (ODE) problem with specially designed regularisation guidance, which makes the PDE problem tractable to obtain numerical solutions and feasible to predict multiple future values of the target series at arbitrary time points. Extensive experiments demonstrate that our proposed model could achieve competitive accuracy over strong baselines: on average, it outperforms the best baseline by reducing RMSE by 9.85% and MAE by 13.98% for arbitrary-step prediction.

2.3 CV Dec 1, 2020 Code
RaP-Net: A Region-wise and Point-wise Weighting Network to Extract Robust Features for Indoor Localization

Dongjiang Li, Jinyu Miao, Xuesong Shi et al.

Feature extraction plays an important role in visual localization. Unreliable features on dynamic objects or repetitive regions interfere with feature matching and pose a major challenge for indoor localization. To address the problem, we propose a novel network, RaP-Net, to simultaneously predict region-wise invariability and point-wise reliability, and then extract features by considering both of them. We also introduce a new dataset, named OpenLORIS-Location, to train the proposed network. The dataset contains 1553 images from 93 indoor locations. Various appearance changes between images of the same location are included and can help the model to learn the invariability in typical indoor scenes. Experimental results show that the proposed RaP-Net trained with the OpenLORIS-Location dataset achieves excellent performance in the feature matching task and significantly outperforms state-of-the-art feature algorithms in indoor localization. The RaP-Net code and dataset are available at https://github.com/ivipsourcecode/RaP-Net.

13.1 CV Dec 4, 2023 Code
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos

Min Yang, Huan Gao, Ping Guo et al.

Vision Transformer (ViT) has shown high potential in video recognition, owing to its flexible design, adaptable self-attention mechanisms, and the efficacy of masked pre-training. Yet, it remains unclear how to adapt these pre-trained short-term ViTs for temporal action detection (TAD) in untrimmed videos. Existing works treat them as off-the-shelf feature extractors for each short-trimmed snippet without capturing the fine-grained relation among different snippets in a broader temporal context. To mitigate this issue, this paper focuses on designing a new mechanism for adapting these pre-trained ViT models as a unified long-form video transformer to fully unleash its modeling power in capturing inter-snippet relation, while still keeping low computation overhead and memory consumption for efficient TAD. To this end, we design effective cross-snippet propagation modules to gradually exchange short-term video information among different snippets at two levels. For inner-backbone information propagation, we introduce a cross-snippet propagation strategy to enable multi-snippet temporal feature interaction inside the backbone. For post-backbone information propagation, we propose temporal transformer layers for further clip-level modeling. With the plain ViT-B pre-trained with VideoMAE, our end-to-end temporal action detector (ViT-TAD) yields performance highly competitive with previous temporal action detectors, reaching up to 69.5 average mAP on THUMOS14, 37.40 average mAP on ActivityNet-1.3 and 17.20 average mAP on FineAction.

0.3 CL Feb 18, 2022 Code
CLSEG: Contrastive Learning of Story Ending Generation

Yuqiang Xie, Yue Hu, Luxi Xing et al.

Story Ending Generation (SEG) is a challenging task in natural language generation. Recently, methods based on Pre-trained Language Models (PLM) have achieved great success and can produce fluent and coherent story endings. However, the pre-training objective of PLM-based methods is unable to model the consistency between story context and ending. The goal of this paper is to adopt contrastive learning to generate endings more consistent with story context, which poses two main challenges. The first is the negative sampling of wrong endings inconsistent with story contexts. The second is the adaptation of contrastive learning for SEG. To address these two issues, we propose a novel Contrastive Learning framework for Story Ending Generation (CLSEG), which has two steps: multi-aspect sampling and story-specific contrastive learning. In particular, for the first issue, we utilize novel multi-aspect sampling mechanisms to obtain wrong endings considering the consistency of order, causality, and sentiment. To solve the second issue, we design a story-specific contrastive training strategy that is adapted for SEG. Experiments show that CLSEG outperforms baselines and can produce story endings with stronger consistency and rationality.

5.5 LG Mar 10, 2021
Partial Differential Equations is All You Need for Generating Neural Architectures -- A Theory for Physical Artificial Intelligence Systems

Ping Guo, Kaizhu Huang, Zenglin Xu

In this work, we generalize the reaction-diffusion equation in statistical physics, the Schrödinger equation in quantum mechanics, and the Helmholtz equation in paraxial optics into the neural partial differential equations (NPDE), which can be considered as the fundamental equations in the field of artificial intelligence research. We use the finite difference method to discretize the NPDE and find numerical solutions, and the basic building blocks of deep neural network architectures, including the multi-layer perceptron, convolutional neural network and recurrent neural networks, are generated. The learning strategies, such as adaptive moment estimation, L-BFGS, pseudoinverse learning algorithms and partial differential equation constrained optimization, are also presented. We believe it is significant that this work presents a clear physical picture of interpretable deep neural networks, which makes it possible to apply them to analog computing device design and paves the road to physical artificial intelligence.
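As a rough illustration of how a finite-difference discretization can produce a convolution-like building block, the sketch below takes the plain 1D diffusion equation u_t = D u_xx (a simplification chosen here for illustration, not the NPDE of the paper) and shows that one explicit Euler step equals a 3-tap convolution with kernel [alpha, 1 - 2*alpha, alpha], where alpha = D*dt/dx^2. The function names and the fixed-endpoint boundary handling are assumptions of this sketch.

```python
def diffusion_step(u, alpha):
    """One explicit Euler step of u_t = D * u_xx on a 1D grid.

    alpha = D * dt / dx**2 (assumed given); the two endpoints are held fixed.
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def diffusion_kernel(alpha):
    """The 3-tap kernel that the finite-difference stencil corresponds to."""
    return [alpha, 1 - 2 * alpha, alpha]


def conv1d_interior(u, kernel):
    """Apply a 3-tap kernel to interior points; endpoints are copied unchanged."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = kernel[0] * u[i - 1] + kernel[1] * u[i] + kernel[2] * u[i + 1]
    return new


u0 = [0.0, 0.0, 1.0, 0.0, 0.0]
stepped = diffusion_step(u0, 0.25)
convolved = conv1d_interior(u0, diffusion_kernel(0.25))
```

In this toy setting the kernel weights are fixed by the physics; a learned convolutional layer can be read as the same stencil with trainable coefficients.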

7.5 NE May 31, 2020
Synergetic Learning Systems: Concept, Architecture, and Algorithms

Ping Guo, Qian Yin

Drawing on the idea that brain development is a Darwinian process of "evolution + selection" and the idea that the current state is a local equilibrium state of many bodies with self-organization and evolution processes driven by the temperature and gravity in our universe, in this work, we describe an artificial intelligence system called the "Synergetic Learning Systems". The system is composed of two or more subsystems (models, agents or virtual bodies), and it is an open complex giant system. Inspired by natural intelligence, the system achieves intelligent information processing and decision-making in a given environment through cooperative/competitive synergetic learning. The intelligence evolved by the natural law of "it is not the strongest of the species that survives, but the one most responsive to change," while an artificial intelligence system should adopt the law of "human selection" in the evolution process. Therefore, we expect that the proposed system architecture can also be adapted in human-machine synergy or multi-agent synergetic systems. It is also expected that under our design criteria, the proposed system will eventually achieve artificial general intelligence through long-term coevolution.

2.3 IM Feb 16, 2020
Two-dimensional Multi-fiber Spectrum Image Correction Based on Machine Learning Techniques

Jiali Xu, Qian Yin, Ping Guo et al.

Due to the limited size and imperfections of the optical components in a spectrometer, aberration is inevitably introduced into two-dimensional multi-fiber spectrum images in LAMOST, which leads to obvious spatial variation of the point spread functions (PSFs). Consequently, if spatially variant PSFs are estimated directly, the huge storage and intensive computation requirements make the deconvolutional spectral extraction method intractable. In this paper, we propose a novel method to solve the problem of spatially varying PSFs through image aberration correction. When the CCD image aberration is corrected, the PSF, i.e. the convolution kernel, can be approximated by a single spatially invariant PSF. Specifically, machine learning techniques are adopted to calibrate the distorted spectral image, including the Total Least Squares (TLS) algorithm, an intelligent sampling method, and multi-layer feed-forward neural networks. Calibration experiments on LAMOST CCD images show that the proposed method is effective. In addition, the spectrum extraction results before and after calibration are compared; the results show that the characteristics of the extracted one-dimensional waveform are closer to those of an ideal optical system, and that the PSF of the corrected object spectrum image estimated by the blind deconvolution method is nearly centrally symmetric, which indicates that our proposed method can significantly reduce the complexity of spectrum extraction and improve extraction accuracy.

4.1 LG Nov 5, 2018
PILAE: A Non-gradient Descent Learning Scheme for Deep Feedforward Neural Networks

P. Guo, K. Wang, X. L. Zhou

In this work, a non-gradient descent learning (NGDL) scheme is proposed for deep feedforward neural networks (DNNs). Since an autoencoder can be used as a building block of a multi-layer perceptron (MLP) DNN, the MLP is taken as an example to illustrate the proposed scheme, the pseudoinverse learning algorithm for autoencoders (PILAE). The PILAE with low rank approximation is an NGDL algorithm: the encoder weight matrix is set to be the low rank approximation of the pseudoinverse of the input matrix, while the decoder weight matrix is calculated by the pseudoinverse learning algorithm. It is worth noting that only very few network structure hyper-parameters need to be tuned compared with classical gradient descent learning algorithms. Hence, the proposed algorithm could be regarded as a quasi-automated training algorithm which could be utilized in the automated machine learning field. The experimental results show that the proposed learning scheme for DNNs could achieve better performance when considering the tradeoff between training efficiency and classification accuracy.
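The encoder/decoder construction described in this abstract can be sketched roughly as follows with NumPy. This is an illustrative reading, not the authors' implementation: the function name `pilae_layer`, the specific encoder choice (the inverse singular values times the leading left singular vectors, taken from a rank-p truncated SVD of the input), the tanh activation, and the small ridge term `reg` are all assumptions of this sketch.

```python
import numpy as np


def pilae_layer(X, p, reg=1e-3):
    """Train one autoencoder layer without gradient descent (illustrative).

    X   : input matrix of shape (d, N), one sample per column
    p   : number of hidden units; assumed <= rank(X)
    reg : ridge term for the decoder solve (an assumption of this sketch)
    """
    # Rank-p truncated SVD of the input: X ~ U_p diag(s_p) V_p^T.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    # Encoder derived from the truncated pseudoinverse: diag(1/s_p) U_p^T, shape (p, d).
    W_e = (U[:, :p] / s[:p]).T
    # Hidden representation, shape (p, N).
    H = np.tanh(W_e @ X)
    # Decoder from a regularized least-squares (pseudoinverse-style) solve:
    # W_d = X H^T (H H^T + reg * I)^{-1}, shape (d, p).
    A = H @ H.T + reg * np.eye(p)
    W_d = np.linalg.solve(A, H @ X.T).T
    return W_e, W_d, H


rng = np.random.default_rng(0)
X = rng.standard_normal((8, 50))
W_e, W_d, H = pilae_layer(X, p=4)
reconstruction_error = np.linalg.norm(X - W_d @ H)
```

Stacking such layers, with each layer's `H` used as the next layer's input, would give a deep network trained without iterative weight updates; the only choices left are structural ones such as `p`.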

3.5 LG May 20, 2018
A VEST of the Pseudoinverse Learning Algorithm

Ping Guo

In this paper, we briefly review the basic scheme of the pseudoinverse learning (PIL) algorithm and present some discussions on the PIL, as well as its variants. The PIL algorithm, first presented in 1995, is a non-gradient descent and non-iterative learning algorithm for multi-layer neural networks and has several advantages compared with gradient descent based algorithms. Some new viewpoints on the PIL algorithm are presented, and several common pitfalls in practical implementation of the neural network learning task are also addressed. In addition, we show that the so-called extreme learning machine is a Variant crEated by Simple name alTernation (VEST) of the PIL algorithm for single hidden layer feedforward neural networks.

3.3 IM Nov 27, 2017
Pulsar Candidate Identification with Artificial Intelligence Techniques

Ping Guo, Fuqing Duan, Pei Wang et al.

Discovering pulsars is a significant and meaningful research topic in the field of radio astronomy. With the advent of astronomical instruments such as the Five-hundred-meter Aperture Spherical Telescope (FAST) in China, data volumes and data rates are growing exponentially. This fact necessitates a focus on artificial intelligence (AI) technologies that can perform automatic pulsar candidate identification to mine large astronomical data sets. Automatic pulsar candidate identification can be considered as a task of determining potential candidates for further investigation and eliminating noise from radio frequency interference or other non-pulsar signals. It is very hard to raise the performance of DCNN-based pulsar identification, both because the limited training samples prevent the network from being designed deep enough to learn good features and because of the crucial class imbalance problem caused by the very limited number of real pulsar samples. To address these problems, we propose a framework which combines a deep convolution generative adversarial network (DCGAN) with a support vector machine (SVM) to deal with the class imbalance problem and to improve pulsar identification accuracy. The DCGAN is used as the sample generation and feature learning model, and the SVM is adopted as the classifier for predicting candidates' labels in the inference stage. The proposed framework is a novel technique which not only can solve the class imbalance problem but also can learn discriminative feature representations of pulsar candidates instead of computing hand-crafted features in preprocessing steps, which makes it more accurate for automatic pulsar candidate selection. Experiments on two pulsar datasets verify the effectiveness and efficiency of our proposed method.

1.5 CV Oct 1, 2012
Combined Descriptors in Spatial Pyramid Domain for Image Classification

Junlin Hu, Ping Guo

Recently, spatial pyramid matching (SPM) with the scale invariant feature transform (SIFT) descriptor has been successfully used in image classification. Unfortunately, the codebook generation and feature quantization procedures using SIFT features have high complexity in both time and space. To address this problem, in this paper, we propose an approach which combines local binary patterns (LBP) and three-patch local binary patterns (TPLBP) in the spatial pyramid domain. The proposed method does not need codebook learning or feature quantization, hence it becomes very efficient. Experiments on two popular benchmark datasets demonstrate that the proposed method always significantly outperforms the very popular SPM-based SIFT descriptor method in both time and classification accuracy.
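To make the codebook-free idea concrete, the sketch below computes basic 8-neighbour LBP codes and concatenates their histograms over a spatial pyramid of grid cells. It is a minimal illustration only: it covers plain LBP and omits TPLBP, and the function names, the bit ordering of the neighbours, and the two-level pyramid default are assumptions of this sketch rather than details taken from the paper.

```python
def lbp_code(img, r, c):
    """8-neighbour LBP code of the interior pixel (r, c); img is a list of rows."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img, r0, r1, c0, c1):
    """256-bin histogram of LBP codes over rows [r0, r1) and columns [c0, c1).

    Border pixels are skipped because they lack a full 3x3 neighbourhood.
    """
    hist = [0] * 256
    for r in range(max(r0, 1), min(r1, len(img) - 1)):
        for c in range(max(c0, 1), min(c1, len(img[0]) - 1)):
            hist[lbp_code(img, r, c)] += 1
    return hist


def spatial_pyramid_lbp(img, levels=2):
    """Concatenate LBP histograms over 1x1, 2x2, ... grids (no codebook needed)."""
    h, w = len(img), len(img[0])
    feature = []
    for level in range(levels):
        cells = 2 ** level
        for i in range(cells):
            for j in range(cells):
                feature.extend(lbp_histogram(
                    img,
                    i * h // cells, (i + 1) * h // cells,
                    j * w // cells, (j + 1) * w // cells))
    return feature


flat_image = [[7] * 6 for _ in range(6)]
feature = spatial_pyramid_lbp(flat_image, levels=2)
```

Because each pixel maps directly to one of 256 codes, the descriptor needs no clustering or quantization step, which is where the efficiency claim in the abstract comes from.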