LGApr 6, 2021Code
End-To-End Bias Mitigation: Removing Gender Bias in Deep LearningTal Feldman, Ashley Peake
Machine Learning models have been deployed across many different aspects of society, often in situations that affect social welfare. Although these models offer streamlined solutions to large problems, they may contain biases and treat groups or individuals unfairly based on protected attributes such as gender. In this paper, we introduce several examples of machine learning gender bias in practice followed by formalizations of fairness. We provide a survey of fairness research by detailing influential pre-processing, in-processing, and post-processing bias mitigation algorithms. We then propose an end-to-end bias mitigation framework, which employs a fusion of pre-, in-, and post-processing methods to leverage the strengths of each individual technique. We test this method, along with the standard techniques we review, on a deep neural network to analyze bias mitigation in a deep learning setting. We find that our end-to-end bias mitigation framework outperforms the baselines with respect to several fairness metrics, suggesting its promise as a method for improving fairness. As society increasingly relies on artificial intelligence to help in decision-making, addressing gender biases present in deep learning models is imperative. To provide readers with the tools to assess the fairness of machine learning models and mitigate the biases present in them, we discuss multiple open source packages for fairness in AI.
CRFeb 5
Know Your Scientist: KYC as Biosecurity InfrastructureJonathan Feldman, Tal Feldman, Annie I Anton
Biological AI tools for protein design and structure prediction are advancing rapidly, creating dual-use risks that existing safeguards cannot adequately address. Current model-level restrictions, including keyword filtering, output screening, and content-based access denials, are fundamentally ill-suited to biology, where reliable function prediction remains beyond reach and novel threats evade detection by design. We propose a three-tier Know Your Customer (KYC) framework, inspired by anti-money laundering (AML) practices in the financial sector, that shifts governance from content inspection to user verification and monitoring. Tier I leverages research institutions as trust anchors to vouch for affiliated researchers and assume responsibility for vetting. Tier II applies output screening through sequence homology searches and functional annotation. Tier III monitors behavioral patterns to detect anomalies inconsistent with declared research purposes. This layered approach preserves access for legitimate researchers while raising the cost of misuse through institutional accountability and traceability. The framework can be implemented immediately using existing institutional infrastructure, requiring no new legislation or regulatory mandates.
QMAug 30, 2025
Resilient Biosecurity in the Era of AI-Enabled BioweaponsJonathan Feldman, Tal Feldman
Recent advances in generative biology have enabled the design of novel proteins, creating significant opportunities for drug discovery while also introducing new risks, including the potential development of synthetic bioweapons. Existing biosafety measures primarily rely on inference-time filters such as sequence alignment and protein-protein interaction (PPI) prediction to detect dangerous outputs. In this study, we evaluate the performance of three leading PPI prediction tools: AlphaFold 3, AF3Complex, and SpatialPPIv2. These models were tested on well-characterized viral-host interactions, such as those involving Hepatitis B and SARS-CoV-2. Despite being trained on many of the same viruses, the models fail to detect a substantial number of known interactions. Strikingly, none of the tools successfully identify any of the four experimentally validated SARS-CoV-2 mutants with confirmed binding. These findings suggest that current predictive filters are inadequate for reliably flagging even known biological threats and are even more unlikely to detect novel ones. We argue for a shift toward response-oriented infrastructure, including rapid experimental validation, adaptable biomanufacturing, and regulatory frameworks capable of operating at the speed of AI-driven developments.