Qiyu Kang

LG
h-index98
29papers
527citations
Novelty51%
AI Score61

29 Papers

CVApr 3, 2023Code
HypLiLoc: Towards Effective LiDAR Pose Regression with Hyperbolic Fusion

Sijie Wang, Qiyu Kang, Rui She et al.

LiDAR relocalization plays a crucial role in many fields, including robotics, autonomous driving, and computer vision. LiDAR-based retrieval from a database typically incurs high computation storage costs and can lead to globally inaccurate pose estimations if the database is too sparse. On the other hand, pose regression methods take images or point clouds as inputs and directly regress global poses in an end-to-end manner. They do not perform database matching and are more computationally efficient than retrieval techniques. We propose HypLiLoc, a new model for LiDAR pose regression. We use two branched backbones to extract 3D features and 2D projection features, respectively. We consider multi-modal feature fusion in both Euclidean and hyperbolic spaces to obtain more effective feature representations. Experimental results indicate that HypLiLoc achieves state-of-the-art performance in both outdoor and indoor datasets. We also conduct extensive ablation studies on the framework design, which demonstrate the effectiveness of multi-modal feature extraction and multi-space embedding. Our code is released at: https://github.com/sijieaaa/HypLiLoc

CVNov 21, 2022Code
RobustLoc: Robust Camera Pose Regression in Challenging Driving Environments

Sijie Wang, Qiyu Kang, Rui She et al.

Camera relocalization has various applications in autonomous driving. Previous camera pose regression models consider only ideal scenarios where there is little environmental perturbation. To deal with challenging driving environments that may have changing seasons, weather, illumination, and the presence of unstable objects, we propose RobustLoc, which derives its robustness against perturbations from neural differential equations. Our model uses a convolutional neural network to extract feature maps from multi-view images, a robust neural differential equation diffusion block module to diffuse information interactively, and a branched pose decoder with multi-layer training to estimate the vehicle poses. Experiments demonstrate that RobustLoc surpasses current state-of-the-art camera pose regression models and achieves robust performance in various environments. Our code is released at: https://github.com/sijieaaa/RobustLoc

LGOct 10, 2023Code
Adversarial Robustness in Graph Neural Networks: A Hamiltonian Approach

Kai Zhao, Qiyu Kang, Yang Song et al.

Graph neural networks (GNNs) are vulnerable to adversarial perturbations, including those that affect both node features and graph topology. This paper investigates GNNs derived from diverse neural flows, concentrating on their connection to various stability notions such as BIBO stability, Lyapunov stability, structural stability, and conservative stability. We argue that Lyapunov stability, despite its common use, does not necessarily ensure adversarial robustness. Inspired by physics principles, we advocate for the use of conservative Hamiltonian neural flows to construct GNNs that are robust to adversarial attacks. The adversarial robustness of different neural flow GNNs is empirically compared on several benchmark datasets under a variety of adversarial attacks. Extensive numerical experiments demonstrate that GNNs leveraging conservative Hamiltonian flows with Lyapunov stability substantially improve robustness against adversarial perturbations. The implementation code of experiments is available at https://github.com/zknus/NeurIPS-2023-HANG-Robustness.

CVMay 12, 2022Code
Building Facade Parsing R-CNN

Sijie Wang, Qiyu Kang, Rui She et al.

Building facade parsing, which predicts pixel-level labels for building facades, has applications in computer vision perception for autonomous vehicle (AV) driving. However, instead of a frontal view, an on-board camera of an AV captures a deformed view of the facade of the buildings on both sides of the road the AV is travelling on, due to the camera perspective. We propose Facade R-CNN, which includes a transconv module, generalized bounding box detection, and convex regularization, to perform parsing of deformed facade views. Experiments demonstrate that Facade R-CNN achieves better performance than the current state-of-the-art facade parsing models, which are primarily developed for frontal views. We also publish a new building facade parsing dataset derived from the Oxford RobotCar dataset, which we call the Oxford RobotCar Facade dataset. This dataset contains 500 street-view images from the Oxford RobotCar dataset augmented with accurate annotations of building facade objects. The published dataset is available at https://github.com/sijieaaa/Oxford-RobotCar-Facade

LGSep 16, 2022
On the Robustness of Graph Neural Diffusion to Topology Perturbations

Yang Song, Qiyu Kang, Sijie Wang et al.

Neural diffusion on graphs is a novel class of graph neural networks that has attracted increasing attention recently. The capability of graph neural partial differential equations (PDEs) in addressing common hurdles of graph neural networks (GNNs), such as the problems of over-smoothing and bottlenecks, has been investigated but not their robustness to adversarial attacks. In this work, we explore the robustness properties of graph neural PDEs. We empirically demonstrate that graph neural PDEs are intrinsically more robust against topology perturbation as compared to other GNNs. We provide insights into this phenomenon by exploiting the stability of the heat semigroup under graph topology perturbations. We discuss various graph diffusion operators and relate them to existing graph neural PDEs. Furthermore, we propose a general graph neural PDE framework based on which a new class of robust GNNs can be defined. We verify that the new model achieves comparable state-of-the-art performance on several benchmark datasets.

CVNov 7, 2023
RobustMat: Neural Diffusion for Street Landmark Patch Matching under Challenging Environments

Rui She, Qiyu Kang, Sijie Wang et al.

For autonomous vehicles (AVs), visual perception techniques based on sensors like cameras play crucial roles in information acquisition and processing. In various computer perception tasks for AVs, it may be helpful to match landmark patches taken by an onboard camera with other landmark patches captured at a different time or saved in a street scene image database. To perform matching under challenging driving environments caused by changing seasons, weather, and illumination, we utilize the spatial neighborhood information of each patch. We propose an approach, named RobustMat, which derives its robustness to perturbations from neural differential equations. A convolutional neural ODE diffusion module is used to learn the feature representation for the landmark patches. A graph neural PDE diffusion module then aggregates information from neighboring landmark patches in the street scene. Finally, feature similarity learning outputs the final matching score. Our approach is evaluated on several street scene datasets and demonstrated to achieve state-of-the-art matching results under environmental perturbations.

CVNov 8, 2023
Image Patch-Matching with Graph-Based Learning in Street Scenes

Rui She, Qiyu Kang, Sijie Wang et al.

Matching landmark patches from a real-time image captured by an on-vehicle camera with landmark patches in an image database plays an important role in various computer perception tasks for autonomous driving. Current methods focus on local matching for regions of interest and do not take into account spatial neighborhood relationships among the image patches, which typically correspond to objects in the environment. In this paper, we construct a spatial graph with the graph vertices corresponding to patches and edges capturing the spatial neighborhood information. We propose a joint feature and metric learning model with graph-based learning. We provide a theoretical basis for the graph-based loss by showing that the information distance between the distributions conditioned on matched and unmatched pairs is maximized under our framework. We evaluate our model using several street-scene datasets and demonstrate that our approach achieves state-of-the-art matching results.

LGMar 2, 2023
Node Embedding from Hamiltonian Information Propagation in Graph Neural Networks

Qiyu Kang, Kai Zhao, Yang Song et al.

Graph neural networks (GNNs) have achieved success in various inference tasks on graph-structured data. However, common challenges faced by many GNNs in the literature include the problem of graph node embedding under various geometries and the over-smoothing problem. To address these issues, we propose a novel graph information propagation strategy called Hamiltonian Dynamic GNN (HDG) that uses a Hamiltonian mechanics approach to learn node embeddings in a graph. The Hamiltonian energy function in HDG is learnable and can adapt to the underlying geometry of any given graph dataset. We demonstrate the ability of HDG to automatically learn the underlying geometry of graph datasets, even those with complex and mixed geometries, through comprehensive evaluations against state-of-the-art baselines on various downstream tasks. We also verify that HDG is stable against small perturbations and can mitigate the over-smoothing problem when stacking many layers.

LGApr 26, 2024Code
Unleashing the Potential of Fractional Calculus in Graph Neural Networks with FROND

Qiyu Kang, Kai Zhao, Qinxu Ding et al.

We introduce the FRactional-Order graph Neural Dynamical network (FROND), a new continuous graph neural network (GNN) framework. Unlike traditional continuous GNNs that rely on integer-order differential equations, FROND employs the Caputo fractional derivative to leverage the non-local properties of fractional calculus. This approach enables the capture of long-term dependencies in feature updates, moving beyond the Markovian update mechanisms in conventional integer-order models and offering enhanced capabilities in graph representation learning. We offer an interpretation of the node feature updating process in FROND from a non-Markovian random walk perspective when the feature updating is particularly governed by a diffusion process. We demonstrate analytically that oversmoothing can be mitigated in this setting. Experimentally, we validate the FROND framework by comparing the fractional adaptations of various established integer-order continuous GNNs, demonstrating their consistently improved performance and underscoring the framework's potential as an effective extension to enhance traditional continuous GNNs. The code is available at \url{https://github.com/zknus/ICLR2024-FROND}.

CVDec 17, 2023Code
DistilVPR: Cross-Modal Knowledge Distillation for Visual Place Recognition

Sijie Wang, Rui She, Qiyu Kang et al.

The utilization of multi-modal sensor data in visual place recognition (VPR) has demonstrated enhanced performance compared to single-modal counterparts. Nonetheless, integrating additional sensors comes with elevated costs and may not be feasible for systems that demand lightweight operation, thereby impacting the practical deployment of VPR. To address this issue, we resort to knowledge distillation, which empowers single-modal students to learn from cross-modal teachers without introducing additional sensors during inference. Despite the notable advancements achieved by current distillation approaches, the exploration of feature relationships remains an under-explored area. In order to tackle the challenge of cross-modal distillation in VPR, we present DistilVPR, a novel distillation pipeline for VPR. We propose leveraging feature relationships from multiple agents, including self-agents and cross-agents for teacher and student neural networks. Furthermore, we integrate various manifolds, characterized by different space curvatures for exploring feature relationships. This approach enhances the diversity of feature relationships, including Euclidean, spherical, and hyperbolic relationship modules, thereby enhancing the overall representational capacity. The experiments demonstrate that our proposed pipeline achieves state-of-the-art performance compared to other distillation baselines. We also conduct necessary ablation studies to show design effectiveness. The code is released at: https://github.com/sijieaaa/DistilVPR

LGNov 8, 2024Code
Distributed-Order Fractional Graph Operating Network

Kai Zhao, Xuhao Li, Qiyu Kang et al.

We introduce the Distributed-order fRActional Graph Operating Network (DRAGON), a novel continuous Graph Neural Network (GNN) framework that incorporates distributed-order fractional calculus. Unlike traditional continuous GNNs that utilize integer-order or single fractional-order differential equations, DRAGON uses a learnable probability distribution over a range of real numbers for the derivative orders. By allowing a flexible and learnable superposition of multiple derivative orders, our framework captures complex graph feature updating dynamics beyond the reach of conventional models. We provide a comprehensive interpretation of our framework's capability to capture intricate dynamics through the lens of a non-Markovian graph random walk with node feature updating driven by an anomalous diffusion process over the graph. Furthermore, to highlight the versatility of the DRAGON framework, we conduct empirical evaluations across a range of graph learning tasks. The results consistently demonstrate superior performance when compared to traditional continuous GNN models. The implementation code is available at \url{https://github.com/zknus/NeurIPS-2024-DRAGON}.

CVJan 6, 2024Code
PosDiffNet: Positional Neural Diffusion for Point Cloud Registration in a Large Field of View with Perturbations

Rui She, Sijie Wang, Qiyu Kang et al.

Point cloud registration is a crucial technique in 3D computer vision with a wide range of applications. However, this task can be challenging, particularly in large fields of view with dynamic objects, environmental noise, or other perturbations. To address this challenge, we propose a model called PosDiffNet. Our approach performs hierarchical registration based on window-level, patch-level, and point-level correspondence. We leverage a graph neural partial differential equation (PDE) based on Beltrami flow to obtain high-dimensional features and position embeddings for point clouds. We incorporate position embeddings into a Transformer module based on a neural ordinary differential equation (ODE) to efficiently represent patches within points. We employ the multi-level correspondence derived from the high feature similarity scores to facilitate alignment between point clouds. Subsequently, we use registration methods such as SVD-based algorithms to predict the transformation using corresponding point pairs. We evaluate PosDiffNet on several 3D point cloud datasets, verifying that it achieves state-of-the-art (SOTA) performance for point cloud registration in large fields of view with perturbations. The implementation code of experiments is available at https://github.com/AI-IT-AVs/PosDiffNet.

LGMay 30, 2023Code
Node Embedding from Neural Hamiltonian Orbits in Graph Neural Networks

Qiyu Kang, Kai Zhao, Yang Song et al.

In the graph node embedding problem, embedding spaces can vary significantly for different data types, leading to the need for different GNN model types. In this paper, we model the embedding update of a node feature as a Hamiltonian orbit over time. Since the Hamiltonian orbits generalize the exponential maps, this approach allows us to learn the underlying manifold of the graph in training, in contrast to most of the existing literature that assumes a fixed graph embedding manifold with a closed exponential map solution. Our proposed node embedding strategy can automatically learn, without extensive tuning, the underlying geometry of any given graph dataset even if it has diverse geometries. We test Hamiltonian functions of different forms and verify the performance of our approach on two graph node embedding downstream tasks: node classification and link prediction. Numerical experiments demonstrate that our approach adapts better to different types of graph datasets than popular state-of-the-art graph node embedding GNNs. The code is available at \url{https://github.com/zknus/Hamiltonian-GNN}.

LGMay 26, 2023Code
Graph Neural Convection-Diffusion with Heterophily

Kai Zhao, Qiyu Kang, Yang Song et al.

Graph neural networks (GNNs) have shown promising results across various graph learning tasks, but they often assume homophily, which can result in poor performance on heterophilic graphs. The connected nodes are likely to be from different classes or have dissimilar features on heterophilic graphs. In this paper, we propose a novel GNN that incorporates the principle of heterophily by modeling the flow of information on nodes using the convection-diffusion equation (CDE). This allows the CDE to take into account both the diffusion of information due to homophily and the ``convection'' of information due to heterophily. We conduct extensive experiments, which suggest that our framework can achieve competitive performance on node classification tasks for heterophilic graphs, compared to the state-of-the-art methods. The code is available at \url{https://github.com/zknus/Graph-Diffusion-CDE}.

12.9MAApr 10
Multi-agent Reinforcement Learning for Low-Carbon P2P Energy Trading among Self-Interested Microgrids

Junhao Ren, Honglin Gao, Lan Zhao et al.

Uncertainties in renewable generation and demand dynamics challenge day-ahead scheduling. To enhance renewable penetration and maintain intra-day balance, we develop a multi-agent reinforcement learning framework for self-interested microgrids participating in peer-to-peer (P2P) electricity trading. Each microgrid independently bids both price and quantity while optimizing its own profit via storage arbitrage under time-varying main-grid prices. A market-clearing mechanism coordinating trades and promoting incentive compatibility is proposed. Simulation results show that the learned bidding policy improves renewable utilization and reduces reliance on high-carbon electricity, while increasing community-level economic welfare, delivering a win-win situation in emission reduction and local prosperity.

CVApr 22, 2024
PointDifformer: Robust Point Cloud Registration With Neural Diffusion and Transformer

Rui She, Qiyu Kang, Sijie Wang et al.

Point cloud registration is a fundamental technique in 3-D computer vision with applications in graphics, autonomous driving, and robotics. However, registration tasks under challenging conditions, under which noise or perturbations are prevalent, can be difficult. We propose a robust point cloud registration approach that leverages graph neural partial differential equations (PDEs) and heat kernel signatures. Our method first uses graph neural PDE modules to extract high dimensional features from point clouds by aggregating information from the 3-D point neighborhood, thereby enhancing the robustness of the feature representations. Then, we incorporate heat kernel signatures into an attention mechanism to efficiently obtain corresponding keypoints. Finally, a singular value decomposition (SVD) module with learnable weights is used to predict the transformation between two point clouds. Empirical experiments on a 3-D point cloud dataset demonstrate that our approach not only achieves state-of-the-art performance for point cloud registration but also exhibits better robustness to additive noise or 3-D shape perturbations.

LGJan 9, 2024
Coupling Graph Neural Networks with Fractional Order Continuous Dynamics: A Robustness Study

Qiyu Kang, Kai Zhao, Yang Song et al.

In this work, we rigorously investigate the robustness of graph neural fractional-order differential equation (FDE) models. This framework extends beyond traditional graph neural (integer-order) ordinary differential equation (ODE) models by implementing the time-fractional Caputo derivative. Utilizing fractional calculus allows our model to consider long-term memory during the feature updating process, diverging from the memoryless Markovian updates seen in traditional graph neural ODE models. The superiority of graph neural FDE models over graph neural ODE models has been established in environments free from attacks or perturbations. While traditional graph neural ODE models have been verified to possess a degree of stability and resilience in the presence of adversarial attacks in existing literature, the robustness of graph neural FDE models, especially under adversarial conditions, remains largely unexplored. This paper undertakes a detailed assessment of the robustness of graph neural FDE models. We establish a theoretical foundation outlining the robustness characteristics of graph neural FDE models, highlighting that they maintain more stringent output perturbation bounds in the face of input and graph topology disturbances, compared to their integer-order counterparts. Our empirical evaluations further confirm the enhanced robustness of graph neural FDE models, highlighting their potential in adversarially robust applications.

CVAug 19, 2025
AIM 2025 challenge on Inverse Tone Mapping Report: Methods and Results

Chao Wang, Francesco Banterle, Bin Ren et al.

This paper presents a comprehensive review of the AIM 2025 Challenge on Inverse Tone Mapping (ITM). The challenge aimed to push forward the development of effective ITM algorithms for HDR image reconstruction from single LDR inputs, focusing on perceptual fidelity and numerical consistency. A total of \textbf{67} participants submitted \textbf{319} valid results, from which the best five teams were selected for detailed analysis. This report consolidates their methodologies and performance, with the lowest PU21-PSNR among the top entries reaching 29.22 dB. The analysis highlights innovative strategies for enhancing HDR reconstruction quality and establishes strong benchmarks to guide future research in inverse tone mapping.

LGMar 20, 2025
Neural Variable-Order Fractional Differential Equation Networks

Wenjun Cui, Qiyu Kang, Xuhao Li et al.

Neural differential equation models have garnered significant attention in recent years for their effectiveness in machine learning applications.Among these, fractional differential equations (FDEs) have emerged as a promising tool due to their ability to capture memory-dependent dynamics, which are often challenging to model with traditional integer-order approaches.While existing models have primarily focused on constant-order fractional derivatives, variable-order fractional operators offer a more flexible and expressive framework for modeling complex memory patterns. In this work, we introduce the Neural Variable-Order Fractional Differential Equation network (NvoFDE), a novel neural network framework that integrates variable-order fractional derivatives with learnable neural networks.Our framework allows for the modeling of adaptive derivative orders dependent on hidden features, capturing more complex feature-updating dynamics and providing enhanced flexibility. We conduct extensive experiments across multiple graph datasets to validate the effectiveness of our approach.Our results demonstrate that NvoFDE outperforms traditional constant-order fractional and integer models across a range of tasks, showcasing its superior adaptability and performance.

LGMar 20, 2025
Efficient Training of Neural Fractional-Order Differential Equation via Adjoint Backpropagation

Qiyu Kang, Xuhao Li, Kai Zhao et al.

Fractional-order differential equations (FDEs) enhance traditional differential equations by extending the order of differential operators from integers to real numbers, offering greater flexibility in modeling complex dynamical systems with nonlocal characteristics. Recent progress at the intersection of FDEs and deep learning has catalyzed a new wave of innovative models, demonstrating the potential to address challenges such as graph representation learning. However, training neural FDEs has primarily relied on direct differentiation through forward-pass operations in FDE numerical solvers, leading to increased memory usage and computational complexity, particularly in large-scale applications. To address these challenges, we propose a scalable adjoint backpropagation method for training neural FDEs by solving an augmented FDE backward in time, which substantially reduces memory requirements. This approach provides a practical neural FDE toolbox and holds considerable promise for diverse applications. We demonstrate the effectiveness of our method in several tasks, achieving performance comparable to baseline models while significantly reducing computational overhead.

CVSep 8, 2025
AIM 2025 Challenge on High FPS Motion Deblurring: Methods and Results

George Ciubotariu, Florin-Alexandru Vasluianu, Zhuyun Zhou et al.

This paper presents a comprehensive review of the AIM 2025 High FPS Non-Uniform Motion Deblurring Challenge, highlighting the proposed solutions and final results. The objective of this challenge is to identify effective networks capable of producing clearer and visually compelling images in diverse and challenging conditions, by learning representative visual cues for complex aggregations of motion types. A total of 68 participants registered for the competition, and 9 teams ultimately submitted valid entries. This paper thoroughly evaluates the state-of-the-art advances in high-FPS single image motion deblurring, showcasing the significant progress in the field, while leveraging samples of the novel dataset, MIORe, that introduces challenging examples of movement patterns.

16.8MAApr 3
Multi-agent Reinforcement Learning-based Joint Design of Low-Carbon P2P Market and Bidding Strategy in Microgrids

Junhao Ren, Honglin Gao, Sijie Wang et al.

The challenges of the uncertainties in renewable energy generation and the instability of the real-time market limit the effective utilization of clean energy in microgrid communities. Existing peer-to-peer (P2P) and microgrid coordination approaches typically rely on certain centralized optimization or restrictive coordination rules which are difficult to be implemented in real-life applications. To address the challenge, we propose an intraday P2P trading framework that allows self-interested microgrids to pursue their economic benefits, while allowing the market operator to maximize the social welfare, namely the low carbon emission objective, of the entire community. Specifically, the decision-making processes of the microgrids are formulated as a Decentralized Partially Observable Markov Decision Process (DEC-POMDP) and solved using a Multi-Agent Reinforcement Learning (MARL) framework. Such an approach grants each microgrid a high degree of decision-making autonomy, while a novel market clearing mechanism is introduced to provide macro-regulation, incentivizing microgrids to prioritize local renewable energy consumption and hence reduce carbon emissions. Simulation results demonstrate that the combination of the self-interested bidding strategy and the P2P market design helps significantly improve renewable energy utilization and reduce reliance on external electricity with high carbon-emissions. The framework achieves a balanced integration of local autonomy, self-interest pursuit, and improved community-level economic and environmental benefits.

58.9LGMar 13
Lyapunov Stable Graph Neural Flow

Haoyu Chu, Xiaotong Chen, Wei Zhou et al.

Graph Neural Networks (GNNs) are highly vulnerable to adversarial perturbations in both topology and features, making the learning of robust representations a critical challenge. In this work, we bridge GNNs with control theory to introduce a novel defense framework grounded in integer- and fractional-order Lyapunov stability. Unlike conventional strategies that rely on resource-heavy adversarial training or data purification, our approach fundamentally constrains the underlying feature-update dynamics of the GNN. We propose an adaptive, learnable Lyapunov function paired with a novel projection mechanism that maps the network's state into a stable space, thereby offering theoretically provable stability guarantees. Notably, this mechanism is orthogonal to existing defenses, allowing for seamless integration with techniques like adversarial training to achieve cumulative robustness. Extensive experiments demonstrate that our Lyapunov-stable graph neural flows substantially outperform base neural flows and state-of-the-art baselines across standard benchmarks and various adversarial attack scenarios.

LGApr 23, 2025
Simple Graph Contrastive Learning via Fractional-order Neural Diffusion Networks

Yanan Zhao, Feng Ji, Kai Zhao et al.

Graph Contrastive Learning (GCL) has recently made progress as an unsupervised graph representation learning paradigm. GCL approaches can be categorized into augmentation-based and augmentation-free methods. The former relies on complex data augmentations, while the latter depends on encoders that can generate distinct views of the same input. Both approaches may require negative samples for training. In this paper, we introduce a novel augmentation-free GCL framework based on graph neural diffusion models. Specifically, we utilize learnable encoders governed by Fractional Differential Equations (FDE). Each FDE is characterized by an order parameter of the differential operator. We demonstrate that varying these parameters allows us to produce learnable encoders that generate diverse views, capturing either local or global information, for contrastive learning. Our model does not require negative samples for training and is applicable to both homophilic and heterophilic datasets. We demonstrate its effectiveness across various datasets, achieving state-of-the-art performance.

LGOct 25, 2021
Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks

Qiyu Kang, Yang Song, Qinxu Ding et al.

Deep neural networks (DNNs) are well-known to be vulnerable to adversarial attacks, where malicious human-imperceptible perturbations are included in the input to the deep network to fool it into making a wrong classification. Recent studies have demonstrated that neural Ordinary Differential Equations (ODEs) are intrinsically more robust against adversarial attacks compared to vanilla DNNs. In this work, we propose a stable neural ODE with Lyapunov-stable equilibrium points for defending against adversarial attacks (SODEF). By ensuring that the equilibrium points of the ODE solution used as part of SODEF is Lyapunov-stable, the ODE solution for an input with a small perturbation converges to the same solution as the unperturbed input. We provide theoretical results that give insights into the stability of SODEF as well as the choice of regularizers to ensure its stability. Our analysis suggests that our proposed regularizers force the extracted feature points to be within a neighborhood of the Lyapunov-stable equilibrium points of the ODE. SODEF is compatible with many defense methods and can be applied to any neural network's final regressor layer to enhance its stability against adversarial attacks.

LGNov 30, 2019
Error-Correcting Output Codes with Ensemble Diversity for Robust Learning in Neural Networks

Yang Song, Qiyu Kang, Wee Peng Tay

Though deep learning has been applied successfully in many scenarios, malicious inputs with human-imperceptible perturbations can make it vulnerable in real applications. This paper proposes an error-correcting neural network (ECNN) that combines a set of binary classifiers to combat adversarial examples in the multi-class classification problem. To build an ECNN, we propose to design a code matrix so that the minimum Hamming distance between any two rows (i.e., two codewords) and the minimum shared information distance between any two columns (i.e., two partitions of class labels) are simultaneously maximized. Maximizing row distances can increase the system fault tolerance while maximizing column distances helps increase the diversity between binary classifiers. We propose an end-to-end training method for our ECNN, which allows further improvement of the diversity between binary classifiers. The end-to-end training renders our proposed ECNN different from the traditional error-correcting output code (ECOC) based methods that train binary classifiers independently. ECNN is complementary to other existing defense approaches such as adversarial training and can be applied in conjunction with them. We empirically demonstrate that our proposed ECNN is effective against the state-of-the-art white-box and black-box attacks on several datasets while maintaining good classification accuracy on normal examples.

LGJun 26, 2019
Learning Orthogonal Projections in Linear Bandits

Qiyu Kang, Wee Peng Tay

In a linear stochastic bandit model, each arm is a vector in an Euclidean space and the observed return at each time step is an unknown linear function of the chosen arm at that time step. In this paper, we investigate the problem of learning the best arm in a linear stochastic bandit model, where each arm's expected reward is an unknown linear function of the projection of the arm onto a subspace. We call this the projection reward. Unlike the classical linear bandit problem in which the observed return corresponds to the reward, the projection reward at each time step is unobservable. Such a model is useful in recommendation applications where the observed return includes corruption by each individual's biases, which we wish to exclude in the learned model. In the case where there are finitely many arms, we develop a strategy to achieve $O(|\bbD|\log n)$ regret, where $n$ is the number of time steps and $|\bbD|$ is the number of arms. In the case where each arm is chosen from an infinite compact set, our strategy achieves $O(n^{2/3}(\log{n})^{1/2})$ regret. Experiments verify the efficiency of our strategy.

HCJul 27, 2018
Task Recommendation in Crowdsourcing Based on Learning Preferences and Reliabilities

Qiyu Kang, Wee Peng Tay

Workers participating in a crowdsourcing platform can have a wide range of abilities and interests. An important problem in crowdsourcing is the task recommendation problem, in which tasks that best match a particular worker's preferences and reliabilities are recommended to that worker. A task recommendation scheme that assigns tasks more likely to be accepted by a worker who is more likely to complete it reliably results in better performance for the task requester. Without prior information about a worker, his preferences and reliabilities need to be learned over time. In this paper, we propose a multi-armed bandit (MAB) framework to learn a worker's preferences and his reliabilities for different categories of tasks. However, unlike the classical MAB problem, the reward from the worker's completion of a task is unobservable. We therefore include the use of gold tasks (i.e., tasks whose solutions are known \emph{a priori} and which do not produce any rewards) in our task recommendation procedure. Our model could be viewed as a new variant of MAB, in which the random rewards can only be observed at those time steps where gold tasks are used, and the accuracy of estimating the expected reward of recommending a task to a worker depends on the number of gold tasks used. We show that the optimal regret is $O(\sqrt{n})$, where $n$ is the number of tasks recommended to the worker. We develop three task recommendation strategies to determine the number of gold tasks for different task categories, and show that they are order optimal. Simulations verify the efficiency of our approaches.

HCNov 6, 2017
Sequential Multi-Class Labeling in Crowdsourcing

Qiyu Kang, Wee Peng Tay

We consider a crowdsourcing platform where workers' responses to questions posed by a crowdsourcer are used to determine the hidden state of a multi-class labeling problem. As workers may be unreliable, we propose to perform sequential questioning in which the questions posed to the workers are designed based on previous questions and answers. We propose a Partially-Observable Markov Decision Process (POMDP) framework to determine the best questioning strategy, subject to the crowdsourcer's budget constraint. As this POMDP formulation is in general intractable, we develop a suboptimal approach based on a $q$-ary Ulam-Rényi game. We also propose a sampling heuristic, which can be used in tandem with standard POMDP solvers, using our Ulam-Rényi strategy. We demonstrate through simulations that our approaches outperform a non-sequential strategy based on error correction coding and which does not utilize workers' previous responses.