Zhigang Li

CV
h-index 8
14 papers
493 citations
Novelty 53%
AI Score 35

14 Papers

SY · Feb 23, 2016
Data-Driven Real-Time Power Dispatch for Maximizing Variable Renewable Generation

Zhigang Li, Feng Qiu, Jianhui Wang

Traditional power dispatch methods have difficulties in accommodating large-scale variable renewable generation (VRG) and have resulted in unnecessary VRG spillage in industry practice. The recent dispatchable-interval-based methods have the potential to reduce VRG curtailment, but the dispatchable intervals are not allocated effectively because historical dispatch records of VRG units are not exploited. To bridge this gap, this paper proposes a novel data-driven real-time dispatch approach to maximize VRG utilization by using do-not-exceed (DNE) limits. This approach defines the maximum generation output ranges that the system can accommodate without compromising reliability. The DNE limits of VRG units and operating base points of conventional units are co-optimized by hybrid stochastic and robust optimization, and the decision models are formulated as mixed-integer linear programs by the sample average approximation technique exploiting historical VRG data. A strategy for selecting historical data samples is also proposed to capture the VRG uncertainty more accurately under varying prediction output levels. Computational experiments show the effectiveness of the proposed methods.

LG · Mar 4, 2023
DAG Matters! GFlowNets Enhanced Explainer For Graph Neural Networks

Wenqian Li, Yinchuan Li, Zhigang Li et al. · tsinghua

Uncovering rationales behind predictions of graph neural networks (GNNs) has received increasing attention over the years. Existing literature mainly focuses on selecting a subgraph, through combinatorial optimization, to provide faithful explanations. However, the exponential size of candidate subgraphs limits the applicability of state-of-the-art methods to large-scale GNNs. We improve on this through a different approach: by proposing a generative structure -- GFlowNets-based GNN Explainer (GFlowExplainer), we turn the optimization problem into a step-by-step generative problem. Our GFlowExplainer aims to learn a policy that generates a distribution of subgraphs for which the probability of a subgraph is proportional to its reward. The proposed approach eliminates the influence of node sequence and thus does not need any pre-training strategies. We also propose a new cut vertex matrix to efficiently explore parent states for GFlowNets structure, thus making our approach applicable in a large-scale setting. We conduct extensive experiments on both synthetic and real datasets, and both qualitative and quantitative results show the superiority of our GFlowExplainer.

LG · Apr 24, 2023
Generative Flow Networks for Precise Reward-Oriented Active Learning on Graphs

Yinchuan Li, Zhigang Li, Wenqian Li et al. · tsinghua

Many score-based active learning methods have been successfully applied to graph-structured data, aiming to reduce the number of labels and achieve better performance of graph neural networks based on predefined score functions. However, these algorithms struggle to learn policy distributions that are proportional to rewards and have limited exploration capabilities. In this paper, we innovatively formulate the graph active learning problem as a generative process, named GFlowGNN, which generates various samples through sequential actions with probabilities precisely proportional to a predefined reward function. Furthermore, we propose the concept of flow nodes and flow features to efficiently model graphs as flows based on generative flow networks, where the policy network is trained with specially designed rewards. Extensive experiments on real datasets show that the proposed approach has good exploration capability and transferability, outperforming various state-of-the-art methods.

CV · Nov 8, 2023
Enhancing Few-shot CLIP with Semantic-Aware Fine-Tuning

Yao Zhu, Yuefeng Chen, Wei Wang et al.

Learning generalized representations from limited training samples is crucial for applying deep neural networks in low-resource scenarios. Recently, methods based on Contrastive Language-Image Pre-training (CLIP) have exhibited promising performance in few-shot adaptation tasks. To avoid catastrophic forgetting and overfitting caused by few-shot fine-tuning, existing works usually freeze the parameters of CLIP pre-trained on large-scale datasets, overlooking the possibility that some parameters might not be suitable for downstream tasks. To this end, we revisit CLIP's visual encoder with a specific focus on its distinctive attention pooling layer, which performs a spatial weighted-sum of the dense feature maps. Given that dense feature maps contain meaningful semantic information, and different semantics hold varying importance for diverse downstream tasks (such as prioritizing semantics like ears and eyes in pet classification tasks rather than side mirrors), using the same weighted-sum operation for dense features across different few-shot tasks might not be appropriate. Hence, we propose fine-tuning the parameters of the attention pooling layer during the training process to encourage the model to focus on task-specific semantics. In the inference process, we perform residual blending between the features pooled by the fine-tuned and the original attention pooling layers to incorporate both the few-shot knowledge and the pre-trained CLIP's prior knowledge. We term this method as Semantic-Aware FinE-tuning (SAFE). SAFE is effective in enhancing the conventional few-shot CLIP and is compatible with the existing adapter approach (termed SAFE-A).
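The residual blending step described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the blending formula, so the convex combination used here, the function name `residual_blend`, and the coefficient `alpha` are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def residual_blend(
    original: Sequence[float],
    tuned: Sequence[float],
    alpha: float = 0.5,
) -> List[float]:
    """Blend two pooled feature vectors element by element.

    `original` stands for features from the frozen attention pooling layer,
    `tuned` for features from the fine-tuned copy. The convex combination
    and the name `alpha` are illustrative assumptions.
    """
    if len(original) != len(tuned):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # alpha = 0 keeps only the pre-trained prior; alpha = 1 keeps only
    # the few-shot (fine-tuned) features.
    return [(1.0 - alpha) * o + alpha * t for o, t in zip(original, tuned)]


if __name__ == "__main__":
    print(residual_blend([1.0, 0.0], [0.0, 1.0], alpha=0.5))
```

Under this assumption, `alpha` trades off the pre-trained prior against the task-specific features; how the coefficient is actually chosen in SAFE is not stated in the abstract.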

LG · Mar 6, 2024 · Code
Advancing Out-of-Distribution Detection through Data Purification and Dynamic Activation Function Design

Yingrui Ji, Yao Zhu, Zhigang Li et al.

In the dynamic realms of machine learning and deep learning, the robustness and reliability of models are paramount, especially in critical real-world applications. A fundamental challenge in this sphere is managing Out-of-Distribution (OOD) samples, which significantly increase the risks of model misclassification and uncertainty. Our work addresses this challenge by enhancing the detection and management of OOD samples in neural networks. We introduce OOD-R (Out-of-Distribution-Rectified), a meticulously curated collection of open-source datasets with enhanced noise reduction properties. In-Distribution (ID) noise in existing OOD datasets can lead to inaccurate evaluation of detection algorithms. Recognizing this, OOD-R incorporates noise filtering technologies to refine the datasets, ensuring a more accurate and reliable evaluation of OOD detection algorithms. This approach not only improves the overall quality of data but also aids in better distinguishing between OOD and ID samples, resulting in up to a 2.5% improvement in model accuracy and a minimum 3.2% reduction in false positives. Furthermore, we present ActFun, an innovative method that fine-tunes the model's response to diverse inputs, thereby improving the stability of feature extraction and minimizing specificity issues. ActFun addresses the common problem of model overconfidence in OOD detection by strategically reducing the influence of hidden units, which enhances the model's capability to estimate OOD uncertainty more accurately. Implementing ActFun on the OOD-R dataset has led to significant performance enhancements, including an 18.42% increase in AUROC of the GradNorm method and a 16.93% decrease in FPR95 of the Energy method. Overall, our research not only advances the methodologies in OOD detection but also emphasizes the importance of dataset integrity for accurate algorithm evaluation.

CV · Feb 24, 2021 · Code
GDRNPP: A Geometry-guided and Fully Learning-based Object Pose Estimator

Xingyu Liu, Ruida Zhang, Chenyangguang Zhang et al.

6D pose estimation of rigid objects is a long-standing and challenging task in computer vision. Recently, the emergence of deep learning reveals the potential of Convolutional Neural Networks (CNNs) to predict reliable 6D poses. Given that direct pose regression networks currently exhibit suboptimal performance, most methods still resort to traditional techniques to varying degrees. For example, top-performing methods often adopt an indirect strategy by first establishing 2D-3D or 3D-3D correspondences followed by applying the RANSAC-based PnP or Kabsch algorithms, and further employing ICP for refinement. Despite the performance enhancement, the integration of traditional techniques makes the networks time-consuming and not end-to-end trainable. Orthogonal to them, this paper introduces a fully learning-based object pose estimator. In this work, we first perform an in-depth investigation of both direct and indirect methods and propose a simple yet effective Geometry-guided Direct Regression Network (GDRN) to learn the 6D pose from monocular images in an end-to-end manner. Afterwards, we introduce a geometry-guided pose refinement module, enhancing pose accuracy when extra depth data is available. Guided by the predicted coordinate map, we build an end-to-end differentiable architecture that establishes robust and accurate 3D-3D correspondences between the observed and rendered RGB-D images to refine the pose. Our enhanced pose estimation pipeline GDRNPP (GDRN Plus Plus) conquered the leaderboard of the BOP Challenge for two consecutive years, becoming the first to surpass all prior methods that relied on traditional techniques in both accuracy and speed. The code and models are available at https://github.com/shanice-l/gdrnpp_bop2022.

CV · Dec 16, 2016 · Code
Output Constraint Transfer for Kernelized Correlation Filter in Tracking

Baochang Zhang, Zhigang Li, Xianbin Cao et al.

Kernelized Correlation Filter (KCF) is one of the state-of-the-art object trackers. However, it does not reasonably model the distribution of the correlation response during the tracking process, which might cause the drifting problem, especially when targets undergo significant appearance changes due to occlusion, camera shaking, and/or deformation. In this paper, we propose an Output Constraint Transfer (OCT) method that mitigates the drifting problem by modeling the distribution of the correlation response in a Bayesian optimization framework. OCT builds upon the reasonable assumption that the correlation response to the target image follows a Gaussian distribution, which we exploit to select training samples and reduce model uncertainty. OCT is rooted in a new theory which transfers data distribution to a constraint of the optimized variable, leading to an efficient framework to calculate correlation filters. Extensive experiments on a commonly used tracking benchmark show that the proposed method significantly improves KCF, and achieves better performance than other state-of-the-art trackers. To encourage further developments, the source code is made available at https://github.com/bczhangbczhang/OCT-KCF.

CL · Mar 21, 2025
A Language Anchor-Guided Method for Robust Noisy Domain Generalization

Zilin Dai, Lehong Wang, Fangzhou Lin et al.

Real-world machine learning applications often struggle with two major challenges: distribution shift and label noise. Models tend to overfit by focusing on redundant and uninformative features in the training data, which makes it hard for them to generalize to the target domain. Noisy data worsens this problem by causing further overfitting to the noise, meaning that existing methods often fail to tell the difference between true, invariant features and misleading, spurious ones. To tackle these issues, we introduce Anchor Alignment and Adaptive Weighting (A3W). This new algorithm uses sample reweighting guided by natural language processing (NLP) anchors to extract more representative features. In simple terms, A3W leverages semantic representations from natural language models as a source of domain-invariant prior knowledge. Additionally, it employs a weighted loss function that adjusts each sample's contribution based on its similarity to the corresponding NLP anchor. This adjustment makes the model more robust to noisy labels. Extensive experiments on standard benchmark datasets show that A3W consistently outperforms state-of-the-art domain generalization methods, offering significant improvements in both accuracy and robustness across different datasets and noise levels.
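The similarity-based sample weighting described in the abstract above can be sketched as follows. The abstract does not specify how similarity is turned into a weight, so the cosine similarity, the linear mapping to [0, 1], and the helper names `anchor_weight` and `weighted_loss` are assumptions for illustration, not the A3W algorithm itself.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def anchor_weight(feature: Sequence[float], anchor: Sequence[float]) -> float:
    """Map similarity in [-1, 1] to a weight in [0, 1] (assumed mapping)."""
    return 0.5 * (1.0 + cosine_similarity(feature, anchor))


def weighted_loss(
    losses: Sequence[float],
    features: Sequence[Sequence[float]],
    anchors: Sequence[Sequence[float]],
) -> float:
    """Weighted mean of per-sample losses.

    anchors[i] is the (hypothetical) text embedding of sample i's label.
    Samples whose features disagree with their anchor contribute less,
    which is the intuition behind down-weighting likely noisy labels.
    """
    weights = [anchor_weight(f, a) for f, a in zip(features, anchors)]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * l for w, l in zip(weights, losses)) / total


if __name__ == "__main__":
    # The second sample points away from its anchor, so its large loss
    # is down-weighted to zero under the assumed mapping.
    print(weighted_loss([2.0, 10.0], [[1, 0], [1, 0]], [[1, 0], [-1, 0]]))
```

Normalizing by the sum of weights keeps the loss scale comparable across batches; this is a design choice of the sketch rather than something the abstract states.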

CV · Feb 24, 2021
PFRL: Pose-Free Reinforcement Learning for 6D Pose Estimation

Jianzhun Shao, Yuhang Jiang, Gu Wang et al.

6D pose estimation from a single RGB image is a challenging and vital task in computer vision. The current mainstream deep model methods resort to 2D images annotated with real-world ground-truth 6D object poses, whose collection is fairly cumbersome and expensive, and even unavailable in many cases. In this work, to remove the burden of 6D annotations, we formulate 6D pose refinement as a Markov Decision Process and apply a reinforcement learning approach that uses only 2D image annotations as weakly-supervised 6D pose information, via a delicate reward definition and a composite reinforced optimization method for efficient and effective policy training. Experiments on LINEMOD and T-LESS datasets demonstrate that our Pose-Free approach is able to achieve state-of-the-art performance compared with the methods without using real-world ground-truth 6D pose labels.

CV · Aug 19, 2020
Robust RGB-based 6-DoF Pose Estimation without Real Pose Annotations

Zhigang Li, Yinlin Hu, Mathieu Salzmann et al.

While much progress has been made in 6-DoF object pose estimation from a single RGB image, the current leading approaches heavily rely on real-annotation data. As such, they remain sensitive to severe occlusions, because covering all possible occlusions with annotated data is intractable. In this paper, we introduce an approach to robustly and accurately estimate the 6-DoF pose in challenging conditions and without using any real pose annotations. To this end, we leverage the intuition that the poses predicted by a network from an image and from its counterpart synthetically altered to mimic occlusion should be consistent, and translate this to a self-supervised loss function. Our experiments on LINEMOD, Occluded-LINEMOD, YCB and a new Randomization LINEMOD dataset demonstrate the robustness of our approach. We achieve state-of-the-art performance on LINEMOD and Occluded-LINEMOD in the setting without real pose annotations, even outperforming methods that rely on real annotations during training on Occluded-LINEMOD.

LG · Aug 5, 2020
Optimizing AD Pruning of Sponsored Search with Reinforcement Learning

Yijiang Lian, Zhijie Chen, Xin Pei et al.

An industrial sponsored search system (SSS) can be logically divided into three modules: keyword matching, ad retrieval, and ranking. During ad retrieval, the ad candidates grow exponentially. A query with high commercial value might retrieve so many ad candidates that the ranking module cannot afford to process them. Due to limited latency and computing resources, the candidates have to be pruned earlier. Suppose we set a pruning line to cut the SSS into two parts: upstream and downstream. The problem we are going to address is: how to pick out the best $K$ items from $N$ candidates provided by the upstream to maximize the total system's revenue. Since the industrial downstream is very complicated and updated quickly, a crucial restriction in this problem is that the selection scheme should adapt to the downstream. In this paper, we propose a novel model-free reinforcement learning approach to solve this problem. Our approach considers the downstream as a black-box environment, and the agent sequentially selects items and finally feeds them into the downstream, where revenue is estimated and used as a reward to improve the selection policy. To the best of our knowledge, this is the first time that system optimization has been considered from a downstream adaptation view. It is also the first time that reinforcement learning techniques have been used to tackle this problem. The idea has been successfully realized in Baidu's sponsored search system, and a long-running online A/B test shows remarkable improvements in revenue.

IR · Feb 2, 2019
An end-to-end Generative Retrieval Method for Sponsored Search Engine --Decoding Efficiently into a Closed Target Domain

Yijiang Lian, Zhijie Chen, Jinlong Hu et al.

In this paper, we present a generative retrieval method for a sponsored search engine, which uses neural machine translation (NMT) to generate keywords directly from the query. This method is completely end-to-end and skips the query rewriting and relevance judging phases of traditional retrieval systems. Different from standard machine translation, the target space in the retrieval setting is a constrained closed set, where only committed keywords should be generated. We present a Trie-based pruning technique in beam search to address this problem. The biggest challenge in deploying this method into a real industrial environment is the latency impact of running the decoder. Self-normalized training coupled with Trie-based dynamic pruning dramatically reduces the inference time, yielding a speedup of more than 20 times. We also devise a mixed online-offline serving architecture to reduce the latency and CPU consumption. To encourage the NMT to generate new keywords uncovered by the existing system, training data is carefully selected. This model has been successfully applied in Baidu's commercial search engine as a supplementary retrieval branch, which has brought a remarkable revenue improvement of more than 10 percent.
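The idea of restricting beam search to a closed keyword set with a trie can be sketched in a few lines. This is a generic illustration of trie-constrained decoding, not the paper's system: the names `build_trie`, `allowed_next`, and `constrained_beam_search`, the `END` marker, and the caller-supplied `score` function (a stand-in for the NMT decoder's log-probabilities) are all assumptions.

```python
from typing import Callable, Dict, List, Sequence, Tuple

END = "</s>"  # assumed end-of-keyword marker


def build_trie(keywords: Sequence[Sequence[str]]) -> Dict:
    """Store each keyword (a token sequence) as a path in a nested dict."""
    root: Dict = {}
    for tokens in keywords:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: Dict, prefix: Sequence[str]) -> List[str]:
    """Tokens that keep `prefix` inside the closed keyword set."""
    node = trie
    for tok in prefix:
        node = node.get(tok)
        if node is None:
            return []  # prefix is not part of any committed keyword
    return sorted(node)


def constrained_beam_search(
    trie: Dict,
    score: Callable[[Tuple[str, ...], str], float],
    beam_width: int = 2,
    max_len: int = 10,
) -> List[Tuple[Tuple[str, ...], float]]:
    """Beam search that only expands tokens permitted by the trie."""
    beams: List[Tuple[Tuple[str, ...], float]] = [((), 0.0)]
    finished: List[Tuple[Tuple[str, ...], float]] = []
    for _ in range(max_len):
        candidates = []
        for prefix, logp in beams:
            for tok in allowed_next(trie, prefix):
                new_logp = logp + score(prefix, tok)
                if tok == END:
                    finished.append((prefix, new_logp))
                else:
                    candidates.append((prefix + (tok,), new_logp))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


if __name__ == "__main__":
    keywords = [["cheap", "flights"], ["cheap", "hotels"], ["car", "rental"]]
    trie = build_trie(keywords)
    # A uniform score stands in for a real decoder's log-probabilities.
    for tokens, logp in constrained_beam_search(trie, lambda p, t: -1.0, 10):
        print(" ".join(tokens), logp)
```

Because every expansion is looked up in the trie, each finished hypothesis is guaranteed to be a member of the keyword set, and tokens outside the trie are never scored, which is where the pruning saves work in this sketch.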

CV · Jun 10, 2017
Generate Identity-Preserving Faces by Generative Adversarial Networks

Zhigang Li, Yupin Luo

Generating identity-preserving faces aims to generate various face images that keep the same identity as a given target face image. Although many generative models have been developed in recent years, it is still challenging to simultaneously obtain high-quality facial images and preserve the identity. Here we propose a compelling method using generative adversarial networks (GAN). Concretely, we leverage the generator of a trained GAN to generate plausible faces and FaceNet as an identity-similarity discriminator to ensure the identity. Experimental results show that our method is able to generate both plausible and identity-preserving faces with high quality. In addition, our method provides a universal framework which can be realized in various ways by combining different face generators and identity-similarity discriminators.

SY · Aug 16, 2016
Multi-Period Do-Not-Exceed Limit for Variable Renewable Generation Dispatch Considering Discrete Recourse Controls

Zhigang Li, Feng Qiu, Jianhui Wang

The do-not-exceed (DNE) limit method was proposed to accommodate more variable renewable generation (VRG) securely. However, because it does not involve discrete recourse controls, this method cannot gain more flexibility for better VRG integration. This letter formulates a multi-period DNE limit model considering continuous and discrete recourse controls. This model is a two-stage robust optimization problem with mixed-integer recourse. A nested column-and-constraint generation approach is employed to solve this model. Case studies show the effectiveness of the proposed method.