CL AI LG · Feb 10, 2023

FairPy: A Toolkit for Evaluation of Prediction Biases and their Mitigation in Large Language Models

arXiv:2302.05508v21.33 citations · h-index: 6 · Has Code

Originality: Synthesis-oriented

AI Analysis

This toolkit addresses bias evaluation and mitigation for users of large language models, but it is incremental as it primarily integrates existing methods.

The paper presents FairPy, an open-source toolkit for evaluating and mitigating prediction biases in large language models such as BERT and GPT-2. It provides a modular interface for integrating existing debiasing algorithms.

Recent studies have demonstrated that large pretrained language models (LLMs) such as BERT and GPT-2 exhibit biases in token prediction, often inherited from the data distributions present in their training corpora. In response, a number of mathematical frameworks have been proposed to quantify, identify, and mitigate the likelihood of biased token predictions. In this paper, we present a comprehensive survey of such techniques tailored towards widely used LLMs such as BERT and GPT-2. We additionally introduce Fairpy, a modular and extensible toolkit that provides plug-and-play interfaces for integrating these mathematical tools, enabling users to evaluate both pretrained and custom language models. Fairpy supports the implementation of existing debiasing algorithms. The toolkit is open-source and publicly available at: https://github.com/HrishikeshVish/Fairpy
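To make "bias in token prediction" concrete, the following is a minimal sketch of one simple way such a bias could be quantified: the log ratio of the probability mass a model assigns to two groups of target tokens for the same prompt. This is an illustration under stated assumptions, not Fairpy's API and not a metric taken from the paper. The function name `log_prob_gap`, the token groups, and the toy probabilities are all hypothetical; in practice the probabilities would come from a real model's output distribution.

```python
import math


def log_prob_gap(token_probs, group_a, group_b):
    """Return log(P(group_a) / P(group_b)) for one prediction step.

    token_probs maps candidate tokens to predicted probabilities.
    A value near 0 means both groups receive similar probability mass;
    a positive value means group_a is favored.
    """
    mass_a = sum(token_probs.get(tok, 0.0) for tok in group_a)
    mass_b = sum(token_probs.get(tok, 0.0) for tok in group_b)
    if mass_a <= 0.0 or mass_b <= 0.0:
        raise ValueError("each token group needs nonzero probability mass")
    return math.log(mass_a / mass_b)


# Hypothetical next-token probabilities for a prompt such as
# "The doctor said that ..." (made-up numbers, not model output).
toy_probs = {"he": 0.6, "she": 0.2, "they": 0.1, "it": 0.1}

gap = log_prob_gap(toy_probs, ["he"], ["she"])
print(f"log-probability gap (he vs. she): {gap:.3f}")
```

Averaging a score like this over many prompt templates gives a rough aggregate measure; the frameworks surveyed in the paper define their own, more careful metrics.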

