CV Jun 16, 2022
Adversarial Patch Attacks and Defences in Vision-Based Tasks: A Survey
Abhijith Sharma, Yijun Bian, Phil Munz et al.
Adversarial attacks on deep learning models, especially in safety-critical systems, have gained increasing attention in recent years because of the lack of trust in the security and robustness of AI models. Yet the more primitive adversarial attacks might be physically infeasible or require resources that are hard to access, such as the training data, which motivated the emergence of patch attacks. In this survey, we provide a comprehensive overview of existing adversarial patch attack techniques, aiming to help interested researchers quickly catch up with progress in this field. We also discuss existing techniques for detecting and defending against adversarial patches, aiming to help the community better understand this field and its real-world applications.
LG Jan 25, 2023
Increasing Fairness via Combination with Learning Guarantees
Yijun Bian, Kun Zhang
Concern about hidden discrimination in ML models is growing as their widespread real-world application increasingly impacts human lives. Various techniques, including commonly used group fairness measures and several fairness-aware ensemble-based methods, have been developed to enhance fairness. However, existing fairness measures typically focus on only one aspect, either group or individual fairness, and the difficulty of satisfying them simultaneously means that biases may remain even when one of them is satisfied. Moreover, existing mechanisms to boost fairness usually present empirical results to show validity, yet few of them discuss whether fairness can be boosted with theoretical guarantees. To address these issues, we propose a fairness quality measure named 'discriminative risk (DR)' to reflect both individual and group fairness aspects. Furthermore, we investigate its properties and establish first- and second-order oracle bounds to show that fairness can be boosted via ensemble combination with theoretical learning guarantees. The analysis is suitable for both binary and multi-class classification. We also propose a pruning method that utilises the proposed measure and conduct comprehensive experiments to evaluate the effectiveness of the proposed methods.
LG Aug 12, 2024
Approximating Discrimination Within Models When Faced With Several Non-Binary Sensitive Attributes
Yijun Bian, Yujie Luo, Ping Xu
Discrimination mitigation within machine learning (ML) models can be complicated because multiple factors may be interwoven hierarchically and historically. Yet few existing fairness measures can capture the discrimination level within ML models in the face of multiple sensitive attributes (SAs). To bridge this gap, we propose a fairness measure based on distances between sets from a manifold perspective, named 'Harmonic Fairness measure via Manifolds (HFM)', with two optional versions, which supports fine-grained discrimination evaluation for several SAs with multiple values. Because directly computing HFM may be costly, we further propose two approximation algorithms to accelerate its subprocedure, the computation of distances between sets: 'Approximation of distance between sets for one sensitive attribute with multiple values (ApproxDist)' and 'Approximation of extended distance between sets for several sensitive attributes with multiple values (ExtendDist)', which respectively handle bias evaluation for a single SA with multiple values and for several SAs with multiple values. Moreover, we provide an algorithmic effectiveness analysis for ApproxDist under certain assumptions to explain how well it can work. The empirical results demonstrate that the proposed fairness measure HFM is valid and that the approximation algorithms (i.e. ApproxDist and ExtendDist) are effective and efficient.
LG May 15, 2024
Does Machine Bring in Extra Bias in Learning? Approximating Fairness in Models Promptly
Yijun Bian, Yujie Luo
As machine learning (ML) applications spread in the real world, concerns about discrimination hidden in ML models are growing, particularly in high-stakes domains. Existing techniques for assessing the discrimination level of ML models include commonly used group and individual fairness measures. However, these two types of fairness measures are usually hard to reconcile with each other, and even two different group fairness measures might be incompatible. To address this issue, we evaluate the discrimination level of classifiers from a manifold perspective and propose a "harmonic fairness measure via manifolds (HFM)" based on distances between sets. Yet the direct calculation of distances might be too expensive, reducing its practical applicability. Therefore, we devise an approximation algorithm named "Approximation of distance between sets (ApproxDist)" to facilitate accurate estimation of distances, and we further demonstrate its algorithmic effectiveness under certain reasonable assumptions. Empirical results indicate that the proposed fairness measure HFM is valid and that the proposed ApproxDist is effective and efficient.
LG Jun 14, 2025
Algorithmic Fairness: Not a Purely Technical but Socio-Technical Property
Yijun Bian, Lei You, Yuya Sasaki et al.
The rapid deployment of artificial intelligence (AI) and machine learning (ML) systems in socially consequential domains has raised growing concerns about their trustworthiness, including potential discriminatory behaviours. Research in algorithmic fairness has generated a proliferation of mathematical definitions and metrics, yet persistent misconceptions and limitations, both within and beyond the fairness community, restrict their effectiveness. These include the lack of consensus on how fairness should be understood, prevailing measures that are primarily tailored to binary group settings, and superficial handling of intersectional contexts. Here we critically remark on these misconceptions and argue that fairness cannot be reduced to purely technical constraints on models. We also examine the limitations of existing fairness measures through conceptual analysis and empirical illustrations, showing their limited applicability in complex real-world scenarios, challenging prevailing views on the incompatibility between accuracy and fairness as well as among fairness measures themselves, and outlining three principles worth considering in the design of fairness measures. We believe these findings will help bridge the gap between technical formalisation and social realities and meet the challenges of real-world AI/ML deployment.
LG Mar 5, 2025
Towards Trustworthy Federated Learning
Alina Basharat, Yijun Bian, Ping Xu et al.
This paper develops a comprehensive framework to address three critical trustworthiness challenges in federated learning (FL): robustness against Byzantine attacks, fairness, and privacy preservation. To improve the system's defense against Byzantine attacks that send malicious information to bias the system's performance, we develop a Two-sided Norm Based Screening (TNBS) mechanism, which allows the central server to crop the gradients that have the l lowest norms and the h highest norms. TNBS functions as a screening tool to filter out potentially malicious participants whose gradients are far from the honest ones. To promote egalitarian fairness, we adopt q-fair federated learning (q-FFL). Furthermore, we adopt a differential privacy-based scheme to prevent raw data at local clients from being inferred by curious parties. Convergence guarantees are provided for the proposed framework under different scenarios. Experimental results on real datasets demonstrate that the proposed framework effectively improves robustness and fairness while managing the trade-off between privacy and accuracy. This work appears to be the first study that experimentally and theoretically addresses fairness, privacy, and robustness in trustworthy FL.
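The two-sided screening step described in this abstract can be illustrated with a minimal sketch. It only shows the idea of cropping the l lowest-norm and h highest-norm gradients; the function name `tnbs_aggregate`, the use of the Euclidean norm, and the plain averaging of the retained gradients are assumptions made for illustration, not details taken from the paper.

```python
import math


def tnbs_aggregate(gradients, l, h):
    """Crop the l lowest-norm and h highest-norm gradients, then average the rest.

    Hypothetical helper: the Euclidean norm and the simple mean over the
    retained gradients are illustrative assumptions.
    """
    n = len(gradients)
    if l < 0 or h < 0 or l + h >= n:
        raise ValueError("l and h must be non-negative and leave at least one gradient")

    # Euclidean norm of each client's gradient vector.
    norms = [math.sqrt(sum(v * v for v in g)) for g in gradients]

    # Client indices ordered from smallest to largest norm.
    order = sorted(range(n), key=lambda i: norms[i])

    # Two-sided screening: drop the l smallest and the h largest norms.
    kept = sorted(order[l:n - h])

    # Coordinate-wise mean over the retained gradients.
    dim = len(gradients[0])
    mean = [sum(gradients[i][d] for i in kept) / len(kept) for d in range(dim)]
    return kept, mean


# Toy example: client 2 sends a zero gradient, client 3 sends a huge one.
client_gradients = [
    [1.0, 1.0],
    [1.2, 0.8],
    [0.0, 0.0],
    [50.0, -40.0],
    [0.9, 1.1],
]
kept_clients, aggregated = tnbs_aggregate(client_gradients, l=1, h=1)
print(kept_clients)  # indices of the clients that survive screening
print(aggregated)    # aggregated update built from the retained gradients
```

In this toy run the zero gradient and the very large gradient are screened out, so the aggregate is computed from clients 0, 1, and 4 only. How l and h should be chosen is not covered by this sketch.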
LG Mar 11, 2024
Advancing Graph Neural Networks with HL-HGAT: A Hodge-Laplacian and Attention Mechanism Approach for Heterogeneous Graph-Structured Data
Jinghan Huang, Qiufeng Chen, Yijun Bian et al.
Graph neural networks (GNNs) have proven effective in capturing relationships among nodes in a graph. This study introduces a novel perspective by considering a graph as a simplicial complex, encompassing nodes, edges, triangles, and $k$-simplices, enabling the definition of graph-structured data on any $k$-simplices. Our contribution is the Hodge-Laplacian heterogeneous graph attention network (HL-HGAT), designed to learn heterogeneous signal representations across $k$-simplices. The HL-HGAT incorporates three key components: HL convolutional filters (HL-filters), simplicial projection (SP), and simplicial attention pooling (SAP) operators, applied to $k$-simplices. HL-filters leverage the unique topology of $k$-simplices encoded by the Hodge-Laplacian (HL) operator, operating within the spectral domain of the $k$-th HL operator. To address computation challenges, we introduce a polynomial approximation for HL-filters, exhibiting spatial localization properties. Additionally, we propose a pooling operator to coarsen $k$-simplices, combining features through simplicial attention mechanisms of self-attention and cross-attention via transformers and SP operators, capturing topological interconnections across multiple dimensions of simplices. The HL-HGAT is comprehensively evaluated across diverse graph applications, including NP-hard problems, graph multi-label and classification challenges, and graph regression tasks in logistics, computer vision, biology, chemistry, and neuroscience. The results demonstrate the model's efficacy and versatility in handling a wide range of graph-based scenarios.
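To make the idea of a polynomial filter on the Hodge-Laplacian concrete, the following sketch builds the standard 1-st Hodge-Laplacian, L_1 = B_1^T B_1 + B_2 B_2^T, for a single filled triangle and applies a filter of the form y = sum_k theta_k L_1^k x to an edge signal. The monomial basis, the coefficients `theta`, and all function names are assumptions chosen for illustration; the paper's actual polynomial family and HL-filter design may differ.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def hodge_laplacian_1(b1, b2):
    """L_1 = B_1^T B_1 + B_2 B_2^T, acting on signals defined on edges."""
    lower = matmul(transpose(b1), b1)
    upper = matmul(b2, transpose(b2))
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(lower, upper)]


def polynomial_filter(lap, signal, theta):
    """Compute sum_k theta[k] * lap^k @ signal with repeated mat-vec products.

    Monomial basis chosen for simplicity (an assumption of this sketch).
    """
    out = [0.0] * len(signal)
    power = list(signal)  # lap^0 @ signal
    for coeff in theta:
        out = [o + coeff * p for o, p in zip(out, power)]
        power = [sum(row[j] * power[j] for j in range(len(power))) for row in lap]
    return out


# One filled triangle: nodes 0, 1, 2; oriented edges (0,1), (0,2), (1,2).
# B1 maps edges to nodes (rows: nodes, columns: edges).
B1 = [[-1, -1, 0],
      [1, 0, -1],
      [0, 1, 1]]
# B2 maps the triangle (0,1,2) to its boundary edges: (1,2) - (0,2) + (0,1).
B2 = [[1], [-1], [1]]

L1 = hodge_laplacian_1(B1, B2)
edge_signal = [1.0, 2.0, -1.0]
filtered = polynomial_filter(L1, edge_signal, theta=[0.5, 0.1])
print(L1)
print(filtered)
```

Because the filter needs only repeated multiplications by L_1, it avoids an explicit eigendecomposition, and a degree-K polynomial mixes information only between simplices within K hops, which is the usual reason such filters are described as spatially localized.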
LG Oct 30, 2019
When does Diversity Help Generalization in Classification Ensembles?
Yijun Bian, Huanhuan Chen
Ensembles, a widely used and effective technique in the machine learning community, owe their success to a key element: "diversity." The relationship between diversity and generalization, unfortunately, is not entirely understood and remains an open research issue. To reveal the effect of diversity on the generalization of classification ensembles, we investigate three issues on diversity, i.e., the measurement of diversity, the relationship between the proposed diversity and the generalization error, and the utilization of this relationship for ensemble pruning. For the diversity measurement, we measure diversity by an error decomposition inspired by regression ensembles, which decomposes the error of classification ensembles into accuracy and diversity. Then we formulate the relationship between the measured diversity and ensemble performance through the theorem of margin and generalization and observe that the generalization error is reduced effectively only when the measured diversity is increased in a few specific ranges, while in other ranges larger diversity is less beneficial to improving the generalization of an ensemble. In addition, we propose two pruning methods based on diversity management to utilize this relationship, which can increase diversity appropriately and shrink the size of the ensemble without much loss in performance. Empirical results validate the reasonableness of the proposed relationship between diversity and ensemble generalization error and the effectiveness of the proposed pruning methods.
LG Oct 1, 2019
Sub-Architecture Ensemble Pruning in Neural Architecture Search
Yijun Bian, Qingquan Song, Mengnan Du et al.
Neural architecture search (NAS) has gained increasing attention in recent years due to its flexibility and remarkable capability to reduce the burden of neural network design. To achieve better performance, however, the search process usually requires massive computation that might not be affordable for researchers and practitioners. Although recent attempts have employed ensemble learning methods to mitigate the enormous computational cost, they neglect a key property of ensemble methods, namely diversity, which leads to collecting more similar sub-architectures with potential redundancy in the final design. To tackle this problem, we propose a pruning method for NAS ensembles called "Sub-Architecture Ensemble Pruning in Neural Architecture Search (SAEP)." It aims to leverage diversity and to obtain smaller sub-ensemble architectures with performance comparable to that of unpruned ensemble architectures. Three possible solutions are proposed to decide which sub-architectures to prune during the search process. Experimental results demonstrate the effectiveness of the proposed method, which largely reduces the number of sub-architectures without degrading performance.
LG Jun 13, 2018
Ensemble Pruning based on Objection Maximization with a General Distributed Framework
Yijun Bian, Yijun Wang, Yaqiang Yao et al.
Ensemble pruning, which selects a subset of individual learners from an original ensemble, alleviates the time and space costs of ensemble learning. Accuracy and diversity are two crucial factors, but they usually conflict with each other. To balance them, we formalize the ensemble pruning problem as an objection maximization problem based on information entropy. Then we propose an ensemble pruning method with a centralized version and a distributed version, in which the latter speeds up the former. Finally, we extract a general distributed framework for ensemble pruning, which is suitable for most existing ensemble pruning methods and reduces time consumption without much accuracy degradation. Experimental results validate the efficiency of our framework and methods, in particular a remarkable improvement in execution speed accompanied by satisfactory accuracy.