Bow-Yaw Wang

CY
h-index1
6papers
97citations
Novelty51%
AI Score33

6 Papers

CYAug 24, 2024Code
Ensuring Fairness with Transparent Auditing of Quantitative Bias in AI Systems

Chih-Cheng Rex Yuan, Bow-Yaw Wang

With the rapid advancement of AI, there is a growing trend to integrate AI into decision-making processes. However, AI systems may exhibit biases that lead decision-makers to draw unfair conclusions. Notably, the COMPAS system used in the American justice system to evaluate recidivism was found to favor racial majority groups; specifically, it violates a fairness standard called equalized odds. Various measures have been proposed to assess AI fairness. We present a framework for auditing AI fairness, involving third-party auditors and AI system providers, and we have created a tool to facilitate systematic examination of AI systems. The tool is open-sourced and publicly available. Unlike traditional AI systems, we advocate a transparent white-box and statistics-based approach. It can be utilized by third-party auditors, AI developers, or the general public for reference when judging the fairness criterion of AI systems.

CYApr 30, 2025
Quantitative Auditing of AI Fairness with Differentially Private Synthetic Data

Chih-Cheng Rex Yuan, Bow-Yaw Wang

Fairness auditing of AI systems can identify and quantify biases. However, traditional auditing using real-world data raises security and privacy concerns. It exposes auditors to security risks as they become custodians of sensitive information and targets for cyberattacks. Privacy risks arise even without direct breaches, as data analyses can inadvertently expose confidential information. To address these, we propose a framework that leverages differentially private synthetic data to audit the fairness of AI systems. By applying privacy-preserving mechanisms, it generates synthetic data that mirrors the statistical properties of the original dataset while ensuring privacy. This method balances the goal of rigorous fairness auditing and the need for strong privacy protections. Through experiments on real datasets like Adult, COMPAS, and Diabetes, we compare fairness metrics of synthetic and real data. By analyzing the alignment and discrepancies between these metrics, we assess the capacity of synthetic data to preserve the fairness properties of real data. Our results demonstrate the framework's ability to enable meaningful fairness evaluations while safeguarding sensitive information, proving its applicability across critical and sensitive domains.

CRAug 4, 2020
Verifying Pufferfish Privacy in Hidden Markov Models

Depeng Liu, Bow-yaw Wang, Lijun Zhang

Pufferfish is a Bayesian privacy framework for designing and analyzing privacy mechanisms. It refines differential privacy, the current gold standard in data privacy, by allowing explicit prior knowledge in privacy analysis. Through these privacy frameworks, a number of privacy mechanisms have been developed in literature. In practice, privacy mechanisms often need be modified or adjusted to specific applications. Their privacy risks have to be re-evaluated for different circumstances. Moreover, computing devices only approximate continuous noises through floating-point computation, which is discrete in nature. Privacy proofs can thus be complicated and prone to errors. Such tedious tasks can be burdensome to average data curators. In this paper, we propose an automatic verification technique for Pufferfish privacy. We use hidden Markov models to specify and analyze discretized Pufferfish privacy mechanisms. We show that the Pufferfish verification problem in hidden Markov models is NP-hard. Using Satisfiability Modulo Theories solvers, we propose an algorithm to analyze privacy requirements. We implement our algorithm in a prototypical tool called FAIER, and present several case studies. Surprisingly, our case studies show that naïve discretization of well-established privacy mechanisms often fail, witnessed by counterexamples generated by FAIER. In discretized \emph{Above Threshold}, we show that it results in absolutely no privacy. Finally, we compare our approach with testing based approach on several case studies, and show that our verification technique can be combined with testing based approach for the purpose of (i) efficiently certifying counterexamples and (ii) obtaining a better lower bound for the privacy budget $ε$.

SENov 3, 2015
PAC Learning-Based Verification and Model Synthesis

Yu-Fang Chen, Chiao Hsieh, Ondřej Lengál et al.

We introduce a novel technique for verification and model synthesis of sequential programs. Our technique is based on learning a regular model of the set of feasible paths in a program, and testing whether this model contains an incorrect behavior. Exact learning algorithms require checking equivalence between the model and the program, which is a difficult problem, in general undecidable. Our learning procedure is therefore based on the framework of probably approximately correct (PAC) learning, which uses sampling instead and provides correctness guarantees expressed using the terms error probability and confidence. Besides the verification result, our procedure also outputs the model with the said correctness guarantees. Obtained preliminary experiments show encouraging results, in some cases even outperforming mature software verifiers.

SEFeb 15, 2015
Counterexample-Guided Polynomial Loop Invariant Generation by Lagrange Interpolation

Yu-Fang Chen, Chih-Duo Hong, Bow-Yaw Wang et al.

We apply multivariate Lagrange interpolation to synthesize polynomial quantitative loop invariants for probabilistic programs. We reduce the computation of an quantitative loop invariant to solving constraints over program variables and unknown coefficients. Lagrange interpolation allows us to find constraints with less unknown coefficients. Counterexample-guided refinement furthermore generates linear constraints that pinpoint the desired quantitative invariants. We evaluate our technique by several case studies with polynomial quantitative loop invariants in the experiments.

LOJul 31, 2012
Predicate Generation for Learning-Based Quantifier-Free Loop Invariant Inference

Wonchan Lee, Yungbum Jung, Bow-yaw Wang et al.

We address the predicate generation problem in the context of loop invariant inference. Motivated by the interpolation-based abstraction refinement technique, we apply the interpolation theorem to synthesize predicates implicitly implied by program texts. Our technique is able to improve the effectiveness and efficiency of the learning-based loop invariant inference algorithm in [14]. We report experiment results of examples from Linux, SPEC2000, and Tar utility.