Jiaqi Lv

LG
h-index23
23papers
586citations
Novelty57%
AI Score56

23 Papers

LGJun 1, 2022
One Positive Label is Sufficient: Single-Positive Multi-Label Learning with Label Enhancement

Ning Xu, Congyu Qiao, Jiaqi Lv et al.

Multi-label learning (MLL) learns from the examples each associated with multiple labels simultaneously, where the high cost of annotating all relevant labels for each training example is challenging for real-world applications. To cope with the challenge, we investigate single-positive multi-label learning (SPMLL) where each example is annotated with only one relevant label, and show that one can successfully learn a theoretically grounded multi-label classifier for the problem. In this paper, a novel SPMLL method named SMILE, i.e., Single-positive MultI-label learning with Label Enhancement, is proposed. Specifically, an unbiased risk estimator is derived, which could be guaranteed to approximately converge to the optimal risk minimizer of fully supervised learning and shows that one positive label of each instance is sufficient to train the predictive model. Then, the corresponding empirical risk estimator is established via recovering the latent soft label as a label enhancement process, where the posterior density of the latent soft labels is approximate to the variational Beta density parameterized by an inference model. Experiments on benchmark datasets validate the effectiveness of the proposed method.

LGJun 2, 2022
Progressive Purification for Instance-Dependent Partial Label Learning

Ning Xu, Biao Liu, Jiaqi Lv et al.

Partial label learning (PLL) aims to train multiclass classifiers from the examples each annotated with a set of candidate labels where a fixed but unknown candidate label is correct. In the last few years, the instance-independent generation process of candidate labels has been extensively studied, on the basis of which many theoretical advances have been made in PLL. Nevertheless, the candidate labels are always instance-dependent in practice and there is no theoretical guarantee that the model trained on the instance-dependent PLL examples can converge to an ideal one. In this paper, a theoretically grounded and practically effective approach named POP, i.e. PrOgressive Purification for instance-dependent partial label learning, is proposed. Specifically, POP updates the learning model and purifies each candidate label set progressively in every epoch. Theoretically, we prove that POP enlarges the region appropriately fast where the model is reliable, and eventually approximates the Bayes optimal classifier with mild assumptions. Technically, POP is flexible with arbitrary PLL losses and could improve the performance of the previous PLL losses in the instance-dependent case. Experiments on the benchmark datasets and the real-world datasets validate the effectiveness of the proposed method.

TOApr 15
A deep learning framework for glomeruli segmentation with boundary attention

Behnaz Elhaminia, Catherine King, Jiaqi Lv et al.

Accurate detection and segmentation of glomeruli in kidney tissue are essential for diagnostic applications. Traditional deep learning methods primarily rely on semantic segmentation, which often fails to precisely delineate adjacent glomeruli. To address this challenge, we propose a novel glomerulus detection and segmentation model that emphasises boundary separation. Leveraging pathology foundation models, the proposed U-Net-based architecture incorporates a specialised attention decoder designed to highlight critical regions and improve instancelevel segmentation. Experimental evaluations demonstrate that our approach surpasses state-of-the-art methods in both Dice score and Intersection over Union, indicating superior performance in glomerular delineation.

LGMar 4
Graph Negative Feedback Bias Correction Framework for Adaptive Heterophily Modeling

Jiaqi Lv, Qingfeng Du, Yu Zhang et al.

Graph Neural Networks (GNNs) have emerged as a powerful framework for processing graph-structured data. However, conventional GNNs and their variants are inherently limited by the homophily assumption, leading to degradation in performance on heterophilic graphs. Although substantial efforts have been made to mitigate this issue, they remain constrained by the message-passing paradigm, which is inherently rooted in homophily. In this paper, a detailed analysis of how the underlying label autocorrelation of the homophily assumption introduces bias into GNNs is presented. We innovatively leverage a negative feedback mechanism to correct the bias and propose Graph Negative Feedback Bias Correction (GNFBC), a simple yet effective framework that is independent of any specific aggregation strategy. Specifically, we introduce a negative feedback loss that penalizes the sensitivity of predictions to label autocorrelation. Furthermore, we incorporate the output of graph-agnostic models as a feedback term, leveraging independent node feature information to counteract correlation-induced bias guided by Dirichlet energy. GNFBC can be seamlessly integrated into existing GNN architectures, improving overall performance with comparable computational and memory overhead.

LGMar 3
Multi-Scale Adaptive Neighborhood Awareness Transformer For Graph Fraud Detection

Jiaqi Lv, Qingfeng Du, Yu Zhang et al.

Graph fraud detection (GFD) is crucial for identifying fraudulent behavior within graphs, benefiting various domains such as financial networks and social media. Existing methods based on graph neural networks (GNNs) have succeeded considerably due to their effective expressive capacity for graph-structured data. However, the inherent inductive bias of GNNs, including the homogeneity assumption and the limited global modeling ability, hinder the effectiveness of these models. To address these challenges, we propose Multi-scale Neighborhood Awareness Transformer (MANDATE), which alleviates the inherent inductive bias of GNNs. Specifically, we design a multi-scale positional encoding strategy to encode the positional information of various distances from the central node. By incorporating it with the self-attention mechanism, the global modeling ability can be enhanced significantly. Meanwhile, we design different embedding strategies for homophilic and heterophilic connections. This mitigates the homophily distribution differences between benign and fraudulent nodes. Moreover, an embedding fusion strategy is designed for multi-relation graphs, which alleviates the distribution bias caused by different relationships. Experiments on three fraud detection datasets demonstrate the superiority of MANDATE.

DCJun 12, 2025Code
HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

Jiaqi Lv, Xufeng He, Yanchen Liu et al.

The rapid growth of deep learning has driven exponential increases in model parameters and computational demands. NVIDIA GPUs and their CUDA-based software ecosystem provide robust support for parallel computing, significantly alleviating computational bottlenecks. Meanwhile, due to the cultivation of user programming habits and the high performance of GPUs, the CUDA ecosystem has established a dominant position in the field of parallel software. This dominance requires other hardware platforms to support CUDA-based software with performance portability. However, translating CUDA code to other platforms poses significant challenges due to differences in parallel programming paradigms and hardware architectures. Existing approaches rely on language extensions, domain-specific languages (DSLs), or compilers but face limitations in workload coverage and generalizability. Moreover, these methods often incur substantial development costs. Recently, LLMs have demonstrated extraordinary potential in various vertical domains, especially in code-related tasks. However, the performance of existing LLMs in CUDA transpilation, particularly for high-performance code, remains suboptimal. To address these challenges, we propose a novel framework for generating high-performance CUDA and corresponding platform code pairs, leveraging AI compiler and automatic optimization technology. We further enhance the framework with a graph-based data augmentation method and introduce HPCTransEval, a benchmark for evaluating LLM performance on CUDA transpilation. We conduct experiments using CUDA-to-CPU transpilation as a case study on leading LLMs. The speedup ratio of the CPU operators has an average improvemnet of 43.8\%, highlighting the potential of LLMs to address compatibility challenges within the CUDA ecosystem. Our code is available at https://github.com/PJLAB-CHIP/HPCTransCompile.

ARApr 8
ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis

Runkai Li, Jia Xiong, Xiuyuan He et al.

High-Level Synthesis (HLS) improves IC development productivity by enabling hardware design from C-like languages. However, strict coding constraints and design-specific optimizations limit its widespread adoption. While recent efforts employ large language models (LLMs) to assist HLS design, they often struggle with synthesizability rules and directive semantics. To this end, we introduce ChatHLS, a multi-agent HLS design framework that leverages specialized LLMs for automated debugging and directive tuning. ChatHLS incorporates an adaptive error case expansion mechanism, combined with a reasoning-to-instruction analysis method to accurately diagnose HLS errors. To optimize hardware performance, it enables QoR-aware reasoning to learn the impact of HLS directives on the quality of results (QoR). Experimental results demonstrate that ChatHLS outperforms Gemini-3-pro with a 32.6% relative improvement in debugging, while achieving significant speedups across various HLS kernels and neural network accelerators. These results underscore the potential of ChatHLS for agile hardware development.

LGMar 8
A Unified Framework for Knowledge Transfer in Bidirectional Model Scaling

Jianlu Shen, Fu Feng, Jiaze Xu et al.

Transferring pre-trained knowledge from a source model to a target model of a different architectural size is a key challenge for flexible and efficient model scaling. However, current parameter-space methods treat Small-to-Large (S2L) and Large-to-Small (L2S) scaling as separate, incompatible problems, focusing on parameter synthesis and selection, respectively. This fragmented perspective has resulted in specialized tools, hindering a unified, bidirectional framework. In this paper, we propose BoT (Bidirectional knowledge Transfer), the first size-agnostic framework to unify S2L and L2S scaling. Our core insight is to treat model weights as continuous signals, where models of different sizes represent distinct discretizations of the transferable knowledge. This multi-resolution perspective directly casts S2L and L2S scaling as the signal processing operations of upsampling and downsampling, naturally leading to the adoption of the Discrete Wavelet Transform (DWT) and its Inverse (IDWT). BoT leverages the recursive nature of wavelets, using the decomposition level as a dynamic scaling factor to bridge disparate model sizes in a parameter-free and computationally efficient manner. Extensive experiments on DeiT, BERT, and GPT demonstrate significant pre-training FLOPs savings (up to 67.1% for S2L, 52.8% for L2S) and state-of-the-art performance on benchmarks like GLUE and SQuAD.

LGApr 16, 2025
FedEPA: Enhancing Personalization and Modality Alignment in Multimodal Federated Learning

Yu Zhang, Qingfeng Du, Jiaqi Lv

Federated Learning (FL) enables decentralized model training across multiple parties while preserving privacy. However, most FL systems assume clients hold only unimodal data, limiting their real-world applicability, as institutions often possess multimodal data. Moreover, the lack of labeled data further constrains the performance of most FL methods. In this work, we propose FedEPA, a novel FL framework for multimodal learning. FedEPA employs a personalized local model aggregation strategy that leverages labeled data on clients to learn personalized aggregation weights, thereby alleviating the impact of data heterogeneity. We also propose an unsupervised modality alignment strategy that works effectively with limited labeled data. Specifically, we decompose multimodal features into aligned features and context features. We then employ contrastive learning to align the aligned features across modalities, ensure the independence between aligned features and context features within each modality, and promote the diversity of context features. A multimodal feature fusion strategy is introduced to obtain a joint embedding. The experimental results show that FedEPA significantly outperforms existing FL methods in multimodal classification tasks under limited labeled data conditions.

IVJan 21, 2025
Deep Learning Based Segmentation of Blood Vessels from H&E Stained Oesophageal Adenocarcinoma Whole-Slide Images

Jiaqi Lv, Stefan S Antonowicz, Shan E Ahmed Raza

Blood vessels (BVs) play a critical role in the Tumor Micro-Environment (TME), potentially influencing cancer progression and treatment response. However, manually quantifying BVs in Hematoxylin and Eosin (H&E) stained images is challenging and labor-intensive due to their heterogeneous appearances. We propose a novel approach of constructing guiding maps to improve the performance of state-of-the-art segmentation models for BV segmentation, the guiding maps encourage the models to learn representative features of BVs. This is particularly beneficial for computational pathology, where labeled training data is often limited and large models are prone to overfitting. We have quantitative and qualitative results to demonstrate the efficacy of our approach in improving segmentation accuracy. In future, we plan to validate this method to segment BVs across various tissue types and investigate the role of cellular structures in relation to BVs in the TME.

LGMar 8
One-for-All Model Initialization with Frequency-Domain Knowledge

Jianlu Shen, Fu Feng, Yucheng Xie et al.

Transferring knowledge by fine-tuning large-scale pre-trained networks has become a standard paradigm for downstream tasks, yet the knowledge of a pre-trained model is tightly coupled with monolithic architecture, which restricts flexible reuse across models of varying scales. In response to this challenge, recent approaches typically resort to either parameter selection, which fails to capture the interdependent structure of this knowledge, or parameter prediction using generative models that depend on impractical access to large network collections. In this paper, we empirically demonstrate that a model's foundational, task-agnostic knowledge, its "learngene", is encoded within the low-frequency components of its weights, and can be efficiently inherited by downstream models. Based on this insight, we propose FRONT (FRequency dOmain kNowledge Transfer), a novel framework that uses the Discrete Cosine Transform (DCT) to isolate the low-frequency "learngene". This learngene can be seamlessly adapted to initialize models of arbitrary size via simple truncation or padding, a process that is entirely training-free. For enhanced performance, we propose an optional low-cost refinement process that introduces a spectral regularizer to further improve the learngene's transferability. Extensive experiments demonstrate that FRONT achieves the state-of-the-art performance, accelerates convergence by up to 15 times in vision tasks, and reduces training FLOPs by an average of 40.5% in language tasks.

CVSep 29, 2025
TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models

Zhifang Zhang, Qiqi Tao, Jiaqi Lv et al.

Large vision-language models (LVLMs) have achieved impressive performance across a wide range of vision-language tasks, while they remain vulnerable to backdoor attacks. Existing backdoor attacks on LVLMs aim to force the victim model to generate a predefined target pattern, which is either inserted into or replaces the original content. We find that these fixed-pattern attacks are relatively easy to detect, because the attacked LVLM tends to memorize such frequent patterns in the training dataset, thereby exhibiting overconfidence on these targets given poisoned inputs. To address these limitations, we introduce TokenSwap, a more evasive and stealthy backdoor attack that focuses on the compositional understanding capabilities of LVLMs. Instead of enforcing a fixed targeted content, TokenSwap subtly disrupts the understanding of object relationships in text. Specifically, it causes the backdoored model to generate outputs that mention the correct objects in the image but misrepresent their relationships (i.e., bags-of-words behavior). During training, TokenSwap injects a visual trigger into selected samples and simultaneously swaps the grammatical roles of key tokens in the corresponding textual answers. However, the poisoned samples exhibit only subtle differences from the original ones, making it challenging for the model to learn the backdoor behavior. To address this, TokenSwap employs an adaptive token-weighted loss that explicitly emphasizes the learning of swapped tokens, such that the visual triggers and bags-of-words behavior are associated. Extensive experiments demonstrate that TokenSwap achieves high attack success rates while maintaining superior evasiveness and stealthiness across multiple benchmarks and various LVLM architectures.

LGSep 5, 2025
ModalSurv: A Multimodal Deep Survival Framework for Prostate and Bladder Cancer

Noorul Wahab, Ethar Alzaid, Jiaqi Lv et al.

Accurate prediction of time-to-event outcomes is a central challenge in oncology, with significant implications for treatment planning and patient management. In this work, we present ModaliSurv, a multimodal deep survival model utilising DeepHit with a projection layer and inter-modality cross-attention, which integrates heterogeneous patient data, including clinical, MRI, RNA-seq and whole-slide pathology features. The model is designed to capture complementary prognostic signals across modalities and estimate individualised time-to-biochemical recurrence in prostate cancer and time-to-cancer recurrence in bladder cancer. Our approach was evaluated in the context of the CHIMERA Grand Challenge, across two of the three provided tasks. For Task 1 (prostate cancer bio-chemical recurrence prediction), the proposed framework achieved a concordance index (C-index) of 0.843 on 5-folds cross-validation and 0.818 on CHIMERA development set, demonstrating robust discriminatory ability. For Task 3 (bladder cancer recurrence prediction), the model obtained a C-index of 0.662 on 5-folds cross-validation and 0.457 on development set, highlighting its adaptability and potential for clinical translation. These results suggest that leveraging multimodal integration with deep survival learning provides a promising pathway toward personalised risk stratification in prostate and bladder cancer. Beyond the challenge setting, our framework is broadly applicable to survival prediction tasks involving heterogeneous biomedical data.

IVAug 28, 2025
MitoDetect++: A Domain-Robust Pipeline for Mitosis Detection and Atypical Subtyping

Esha Sadia Nasir, Jiaqi Lv, Mostafa Jahanifar et al.

Automated detection and classification of mitotic figures especially distinguishing atypical from normal remain critical challenges in computational pathology. We present MitoDetect++, a unified deep learning pipeline designed for the MIDOG 2025 challenge, addressing both mitosis detection and atypical mitosis classification. For detection (Track 1), we employ a U-Net-based encoder-decoder architecture with EfficientNetV2-L as the backbone, enhanced with attention modules, and trained via combined segmentation losses. For classification (Track 2), we leverage the Virchow2 vision transformer, fine-tuned efficiently using Low-Rank Adaptation (LoRA) to minimize resource consumption. To improve generalization and mitigate domain shifts, we integrate strong augmentations, focal loss, and group-aware stratified 5-fold cross-validation. At inference, we deploy test-time augmentation (TTA) to boost robustness. Our method achieves a balanced accuracy of 0.892 across validation domains, highlighting its clinical applicability and scalability across tasks.

IVJul 18, 2025
Leveraging Pathology Foundation Models for Panoptic Segmentation of Melanoma in H&E Images

Jiaqi Lv, Yijie Zhu, Carmen Guadalupe Colin Tenorio et al.

Melanoma is an aggressive form of skin cancer with rapid progression and high metastatic potential. Accurate characterisation of tissue morphology in melanoma is crucial for prognosis and treatment planning. However, manual segmentation of tissue regions from haematoxylin and eosin (H&E) stained whole-slide images (WSIs) is labour-intensive and prone to inter-observer variability, this motivates the need for reliable automated tissue segmentation methods. In this study, we propose a novel deep learning network for the segmentation of five tissue classes in melanoma H&E images. Our approach leverages Virchow2, a pathology foundation model trained on 3.1 million histopathology images as a feature extractor. These features are fused with the original RGB images and subsequently processed by an encoder-decoder segmentation network (Efficient-UNet) to produce accurate segmentation maps. The proposed model achieved first place in the tissue segmentation task of the PUMA Grand Challenge, demonstrating robust performance and generalizability. Our results show the potential and efficacy of incorporating pathology foundation models into segmentation networks to accelerate computational pathology workflows.

LGApr 27, 2025
Harmonizing Generalization and Personalization in Ring-topology Decentralized Federated Learning

Shunxin Guo, Jiaqi Lv, Xin Geng

We introduce Ring-topology Decentralized Federated Learning (RDFL) for distributed model training, aiming to avoid the inherent risks of centralized failure in server-based FL. However, RDFL faces the challenge of low information-sharing efficiency due to the point-to-point communication manner when handling inherent data heterogeneity. Existing studies to mitigate data heterogeneity focus on personalized optimization of models, ignoring that the lack of shared information constraints can lead to large differences among models, weakening the benefits of collaborative learning. To tackle these challenges, we propose a Divide-and-conquer RDFL framework (DRDFL) that uses a feature generation model to extract personalized information and invariant shared knowledge from the underlying data distribution, ensuring both effective personalization and strong generalization. Specifically, we design a \textit{PersonaNet} module that encourages class-specific feature representations to follow a Gaussian mixture distribution, facilitating the learning of discriminative latent representations tailored to local data distributions. Meanwhile, the \textit{Learngene} module is introduced to encapsulate shared knowledge through an adversarial classifier to align latent representations and extract globally invariant information. Extensive experiments demonstrate that DRDFL outperforms state-of-the-art methods in various data heterogeneity settings.

DCApr 20, 2025
GENE-FL: Gene-Driven Parameter-Efficient Dynamic Federated Learning

Shunxin Guo, Jiaqi Lv, Qiufeng Wang et al.

Real-world \underline{F}ederated \underline{L}earning systems often encounter \underline{D}ynamic clients with \underline{A}gnostic and highly heterogeneous data distributions (DAFL), which pose challenges for efficient communication and model initialization. To address these challenges, we draw inspiration from the recently proposed Learngene paradigm, which compresses the large-scale model into lightweight, cross-task meta-information fragments. Learngene effectively encapsulates and communicates core knowledge, making it particularly well-suited for DAFL, where dynamic client participation requires communication efficiency and rapid adaptation to new data distributions. Based on this insight, we propose a Gene-driven parameter-efficient dynamic Federated Learning (GENE-FL) framework. First, local models perform quadratic constraints based on parameters with high Fisher values in the global model, as these parameters are considered to encapsulate generalizable knowledge. Second, we apply the strategy of parameter sensitivity analysis in local model parameters to condense the \textit{learnGene} for interaction. Finally, the server aggregates these small-scale trained \textit{learnGene}s into a robust \textit{learnGene} with cross-task generalization capability, facilitating the rapid initialization of dynamic agnostic client models. Extensive experimental results demonstrate that GENE-FL reduces \textbf{4 $\times$} communication costs compared to FEDAVG and effectively initializes agnostic client models with only about \textbf{9.04} MB.

CVMay 10, 2023
Towards Effective Visual Representations for Partial-Label Learning

Shiyu Xia, Jiaqi Lv, Ning Xu et al.

Under partial-label learning (PLL) where, for each training instance, only a set of ambiguous candidate labels containing the unknown true label is accessible, contrastive learning has recently boosted the performance of PLL on vision tasks, attributed to representations learned by contrasting the same/different classes of entities. Without access to true labels, positive points are predicted using pseudo-labels that are inherently noisy, and negative points often require large batches or momentum encoders, resulting in unreliable similarity information and a high computational overhead. In this paper, we rethink a state-of-the-art contrastive PLL method PiCO[24], inspiring the design of a simple framework termed PaPi (Partial-label learning with a guided Prototypical classifier), which demonstrates significant scope for improvement in representation learning, thus contributing to label disambiguation. PaPi guides the optimization of a prototypical classifier by a linear classifier with which they share the same feature encoder, thus explicitly encouraging the representation to reflect visual similarity between categories. It is also technically appealing, as PaPi requires only a few components in PiCO with the opposite direction of guidance, and directly eliminates the contrastive learning module that would introduce noise and consume computational resources. We empirically demonstrate that PaPi significantly outperforms other PLL methods on various image classification tasks.

LGDec 23, 2021
Learning with Proper Partial Labels

Zhenguo Wu, Jiaqi Lv, Masashi Sugiyama

Partial-label learning is a kind of weakly-supervised learning with inexact labels, where for each training example, we are given a set of candidate labels instead of only one true label. Recently, various approaches on partial-label learning have been proposed under different generation models of candidate label sets. However, these methods require relatively strong distributional assumptions on the generation models. When the assumptions do not hold, the performance of the methods is not guaranteed theoretically. In this paper, we propose the notion of properness on partial labels. We show that this proper partial-label learning framework requires a weaker distributional assumption and includes many previous partial-label learning settings as special cases. We then derive a unified unbiased estimator of the classification risk. We prove that our estimator is risk-consistent, and we also establish an estimation error bound. Finally, we validate the effectiveness of our algorithm through experiments.

LGJun 11, 2021
On the Robustness of Average Losses for Partial-Label Learning

Jiaqi Lv, Biao Liu, Lei Feng et al.

Partial-label learning (PLL) utilizes instances with PLs, where a PL includes several candidate labels but only one is the true label (TL). In PLL, identification-based strategy (IBS) purifies each PL on the fly to select the (most likely) TL for training; average-based strategy (ABS) treats all candidate labels equally for training and let trained models be able to predict TL. Although PLL research has focused on IBS for better performance, ABS is also worthy of study since modern IBS behaves like ABS in the beginning of training to prepare for PL purification and TL selection. In this paper, we analyze why ABS was unsatisfactory and propose how to improve it. Theoretically, we formalize five problem settings of PLL and prove that average PL losses (APLLs) with bounded multi-class losses are always robust, while APLLs with unbounded losses may be non-robust, which is the first robustness analysis for PLL. Experimentally, we have two promising findings: ABS using bounded losses can match/exceed state-of-the-art performance of IBS using unbounded losses; after using robust APLLs to warm start, IBS can further improve upon itself. Our work draws attention to ABS research, which can in turn boost IBS and push forward the whole PLL.

LGSep 18, 2020
Compact Learning for Multi-Label Classification

Jiaqi Lv, Tianran Wu, Chenglun Peng et al.

Multi-label classification (MLC) studies the problem where each instance is associated with multiple relevant labels, which leads to the exponential growth of output space. MLC encourages a popular framework named label compression (LC) for capturing label dependency with dimension reduction. Nevertheless, most existing LC methods failed to consider the influence of the feature space or misguided by original problematic features, so that may result in performance degeneration. In this paper, we present a compact learning (CL) framework to embed the features and labels simultaneously and with mutual guidance. The proposal is a versatile concept, hence the embedding way is arbitrary and independent of the subsequent learning process. Following its spirit, a simple yet effective implementation called compact multi-label learning (CMLL) is proposed to learn a compact low-dimensional representation for both spaces. CMLL maximizes the dependence between the embedded spaces of the labels and features, and minimizes the loss of label space recovery concurrently. Theoretically, we provide a general analysis for different embedding methods. Practically, we conduct extensive experiments to validate the effectiveness of the proposed method.

LGJul 17, 2020
Provably Consistent Partial-Label Learning

Lei Feng, Jiaqi Lv, Bo Han et al.

Partial-label learning (PLL) is a multi-class classification problem, where each training example is associated with a set of candidate labels. Even though many practical PLL methods have been proposed in the last two decades, there lacks a theoretical understanding of the consistency of those methods-none of the PLL methods hitherto possesses a generation process of candidate label sets, and then it is still unclear why such a method works on a specific dataset and when it may fail given a different dataset. In this paper, we propose the first generation model of candidate label sets, and develop two novel PLL methods that are guaranteed to be provably consistent, i.e., one is risk-consistent and the other is classifier-consistent. Our methods are advantageous, since they are compatible with any deep network or stochastic optimizer. Furthermore, thanks to the generation model, we would be able to answer the two questions above by testing if the generation model matches given candidate label sets. Experiments on benchmark and real-world datasets validate the effectiveness of the proposed generation model and two PLL methods.

LGFeb 19, 2020
Progressive Identification of True Labels for Partial-Label Learning

Jiaqi Lv, Miao Xu, Lei Feng et al.

Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label. Most existing methods elaborately designed learning objectives as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data. The goal of this paper is to propose a novel framework of PLL with flexibility on the model and optimization algorithm. More specifically, we propose a novel estimator of the classification risk, theoretically analyze the classifier-consistency, and establish an estimation error bound. Then we propose a progressive identification algorithm for approximately minimizing the proposed risk estimator, where the update of the model and identification of true labels are conducted in a seamless manner. The resulting algorithm is model-independent and loss-independent, and compatible with stochastic optimization. Thorough experiments demonstrate it sets the new state of the art.