Omkar Dige

CL
h-index4
4papers
29citations
Novelty53%
AI Score35

4 Papers

CLJul 24, 2023Code
On The Role of Reasoning in the Identification of Subtle Stereotypes in Natural Language

Jacob-Junqi Tian, Omkar Dige, D. B. Emerson et al.

Large language models (LLMs) are trained on vast, uncurated datasets that contain various forms of biases and language reinforcing harmful stereotypes that may be subsequently inherited by the models themselves. Therefore, it is essential to examine and address biases in language models, integrating fairness into their development to ensure that these models do not perpetuate social biases. In this work, we demonstrate the importance of reasoning in zero-shot stereotype identification across several open-source LLMs. Accurate identification of stereotypical language is a complex task requiring a nuanced understanding of social structures, biases, and existing unfair generalizations about particular groups. While improved accuracy is observed through model scaling, the use of reasoning, especially multi-step reasoning, is crucial to consistent performance. Additionally, through a qualitative analysis of select reasoning traces, we highlight how reasoning improves not just accuracy, but also the interpretability of model decisions. This work firmly establishes reasoning as a critical component in automatic stereotype detection and is a first step towards stronger stereotype mitigation pipelines for LLMs.

CLJul 19, 2023
Can Instruction Fine-Tuned Language Models Identify Social Bias through Prompting?

Omkar Dige, Jacob-Junqi Tian, David Emerson et al.

As the breadth and depth of language model applications continue to expand rapidly, it is increasingly important to build efficient frameworks for measuring and mitigating the learned or inherited social biases of these models. In this paper, we present our work on evaluating instruction fine-tuned language models' ability to identify bias through zero-shot prompting, including Chain-of-Thought (CoT) prompts. Across LLaMA and its two instruction fine-tuned versions, Alpaca 7B performs best on the bias identification task with an accuracy of 56.7%. We also demonstrate that scaling up LLM size and data diversity could lead to further performance gain. This is a work-in-progress presenting the first component of our bias mitigation framework. We will keep updating this work as we get more results.

CLJun 19, 2024Code
Mitigating Social Biases in Language Models through Unlearning

Omkar Dige, Diljot Singh, Tsz Fung Yau et al.

Mitigating bias in language models (LMs) has become a critical problem due to the widespread deployment of LMs. Numerous approaches revolve around data pre-processing and fine-tuning of language models, tasks that can be both time-consuming and computationally demanding. Consequently, there is a growing interest in machine unlearning techniques given their capacity to induce the forgetting of undesired behaviors of the existing pre-trained or fine-tuned models with lower computational cost. In this work, we explore two unlearning methods, (1) Partitioned Contrastive Gradient Unlearning (PCGU) applied on decoder models and (2) Negation via Task Vector, to reduce social biases in state-of-the-art and open-source LMs such as LLaMA-2 and OPT. We also implement distributed PCGU for large models. It is empirically shown, through quantitative and qualitative analyses, that negation via Task Vector method outperforms PCGU in debiasing with minimum deterioration in performance and perplexity of the models. On LLaMA-27B, negation via Task Vector reduces the bias score by 11.8%

LGMay 22, 2025
Automated Capability Evaluation of Foundation Models

Arash Afkanpour, Omkar Dige, Fatemeh Tavakoli et al.

Current evaluation frameworks for foundation models rely heavily on static, manually curated benchmarks, limiting their ability to capture the full breadth of model capabilities. This paper introduces Active learning for Capability Evaluation (ACE), a novel framework for scalable, automated, and fine-grained evaluation of foundation models. ACE leverages the knowledge embedded in powerful frontier models to decompose a domain into semantically meaningful capabilities and generates diverse evaluation tasks, significantly reducing human effort. In Mathematics, ACE generated 433 capabilities and 11,800 tasks, covering 94% of Wikipedia-defined skills in the domain while introducing novel, coherent ones. To maximize efficiency, ACE fits a capability model in latent semantic space, allowing reliable approximation of a subject model's performance by evaluating only a subset of capabilities via active learning. It reaches within 0.01 RMSE of exhaustive evaluation by evaluating less than half of capabilities. Compared to static datasets, ACE provides more balanced coverage and uncovers fine-grained differences that aggregate metrics fail to capture. Our results demonstrate that ACE provides a more complete and informative picture of model capabilities, which is essential for safe and well-informed deployment of foundation models.