Novarun Deb

SE
h-index: 28
9 papers
7 citations
Novelty: 37%
AI Score: 48

9 Papers

53.9 | SE | Apr 3 | Code
Sustainability Analysis of Prompt Strategies for SLM-based Automated Test Generation

Pragati Kumari, Novarun Deb

The growing adoption of prompt-based automation in software testing raises important issues regarding its computational and environmental sustainability. Existing sustainability studies in AI-driven testing primarily focus on large language models, leaving the impact of prompt engineering strategies largely unexplored, particularly in the context of Small Language Models (SLMs). This gap is critical, as prompt design directly influences inference behavior, execution cost, and resource utilization, even when model size is fixed. To the best of our knowledge, this paper presents the first systematic sustainability evaluation of prompt engineering strategies for automated test generation using SLMs. We analyze seven prompt strategies across three open-source SLMs under a controlled experimental setup. Our evaluation jointly considers execution time, token usage, energy consumption, carbon emissions, and test quality, the latter assessed through coverage analysis of the generated test scripts. The results show that prompt strategies have a substantial and independent impact on sustainability outcomes, often outweighing the effect of model choice. Reasoning-intensive strategies such as Chain of Thought and Self-Consistency achieve higher coverage but incur significantly higher execution time, energy consumption, and carbon emissions. In contrast, simpler strategies such as Zero-Shot and ReAct deliver competitive test quality with markedly lower environmental cost, while Least-to-Most and Program of Thought offer balanced trade-offs.

83.8 | SE | Apr 3 | Code
Evaluating the Environmental Impact of using SLMs and Prompt Engineering for Code Generation

Md Afif Al Mamun, Sayan Nath, Gias Uddin et al.

The shift from cloud-hosted Large Language Models (LLMs) to locally deployed open-source Small Language Models (SLMs) has democratized AI-assisted coding; however, it has also decentralized the environmental footprint of AI. While prompting strategies - such as Chain-of-Thought and ReAct - serve as external mechanisms for optimizing code generation without modifying model parameters, their impact on energy consumption and carbon emissions remains largely invisible to developers. This paper presents the first systematic empirical study investigating how different prompt engineering strategies in SLM-based code generation impact code generation accuracy alongside sustainability factors. We evaluate six prominent prompting strategies across 11 open-source models (ranging from 1B to 34B parameters) using the HumanEval+ and MBPP+ benchmarks. By measuring Pass@1 accuracy alongside energy (kWh), carbon emissions (kgCO2eq), and inference latency, we reveal that sustainability often decouples from accuracy, allowing significant environmental optimizations without sacrificing performance. Our findings indicate that Chain-of-Thought, being a simpler prompting technique, can provide a near-optimal balance between reasoning capability and energy efficiency. Conversely, multi-sampling strategies often incur disproportionate costs for marginal gains. Finally, we identify grid carbon intensity as the dominant factor in deployment-time emissions, highlighting the need for practitioners to consider regional energy profiles. This work provides a quantitative foundation for "green" prompt engineering, enabling developers to align high-performance code generation with ecological responsibility.
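The abstract's point about grid carbon intensity follows from the standard accounting identity: operational emissions equal energy drawn multiplied by the carbon intensity of the supplying grid. A minimal sketch of that arithmetic is shown below. The function name, the energy figure, and both intensity values are hypothetical placeholders chosen for illustration; they are not results from the paper.

```python
def emissions_kg(energy_kwh: float, intensity_kg_per_kwh: float) -> float:
    """Operational emissions (kgCO2eq) = energy (kWh) x grid intensity (kgCO2eq/kWh)."""
    if energy_kwh < 0 or intensity_kg_per_kwh < 0:
        raise ValueError("inputs must be non-negative")
    return energy_kwh * intensity_kg_per_kwh


# Hypothetical inputs for illustration only (not figures from the paper).
ENERGY_KWH = 0.5
GRIDS = {"low_carbon_grid": 0.05, "high_carbon_grid": 0.70}

# Same workload, same energy, different regional grids.
report = {name: emissions_kg(ENERGY_KWH, ci) for name, ci in GRIDS.items()}
for name, kg in report.items():
    print(f"{name}: {kg:.3f} kgCO2eq")
```

Because energy enters only as a multiplier, the same inference run can differ in emissions by the full ratio of the two grid intensities, which is one way to read the claim that regional energy profiles matter at deployment time.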

SE | Mar 8, 2023
An Annexure to the Paper "Driving the Technology Value Stream by Analyzing App Reviews"

Souvick Das, Novarun Deb, Agostino Cortesi et al.

This paper presents a novel framework that utilizes Natural Language Processing (NLP) techniques to understand user feedback on mobile applications. The framework allows software companies to drive their technology value stream based on user reviews, which can highlight areas for improvement. The framework is analyzed in depth, and its modules are evaluated for their effectiveness. The proposed approach is demonstrated to be effective through an analysis of reviews for sixteen popular Android Play Store applications over a long period of time.

56.6 | SE | Apr 3
An Empirical Study of Sustainability in Prompt-driven Test Script Generation Using Small Language Models

Pragati Kumari, Novarun Deb

The increasing use of language models in automated test script generation raises concerns about their environmental impact, yet existing sustainability analyses focus predominantly on large language models. As a result, the energy and carbon characteristics of small language models (SLMs) during prompt-driven unit-test script generation remain largely unexplored. To address this gap, this study empirically examines the environmental and performance trade-offs of SLMs (in the 2B-8B parameter range) using the HumanEval benchmark and adaptive prompt variants (based on the Anthropic template). The analysis uses CodeCarbon to characterize energy consumption, carbon emissions, and duration under controlled conditions, with unit-test script coverage serving as an initial proxy for generated test quality. Our results show that different SLMs exhibit distinct sustainability profiles: some favor lower energy use and faster execution, while others maintain higher stability or coverage under comparable conditions. Overall, this work provides focused empirical evidence on sustainable SLM-based test script generation, clarifying how prompt structure and model selection jointly shape environmental and performance outcomes.

33.5 | SE | Mar 31
Sustainable AI Assistance Through Digital Sobriety

Madeline Jennings, Novarun Deb, Ronnie de Souza Santos

As AI assistants become commonplace in daily life, the demand for solutions that reduce the cost of inference without sacrificing utility is increasing. Existing work on AI sustainability frequently emphasizes hardware and software optimizations; however, there may be comparable value in social approaches that shape user behavior and discourage unnecessary use. In this study, we operationalize sustainability in terms of energy-efficiency and analyze a publicly sourced sample of prompts where AI is used for assistance in software development. Using this categorization, we find that nearly half of the observed queries can be considered unnecessary relative to their expected benefit. We further observe that factoid-style information retrieval constitutes the largest share of unnecessary requests, suggesting that a meaningful portion of everyday AI usage may be replaceable with lower-cost alternatives (e.g., conventional search or local documentation). These findings motivate a closer examination of how, why, and when AI systems are invoked, and what norms or interface-level nudges might reduce avoidable demand. We conclude with a call to replicate and extend this preliminary analysis and to pay greater attention to the social dimension of AI sustainability.

31.2 | SE | Mar 25
Towards Energy-aware Requirements Dependency Classification: Knowledge-Graph vs. Vector-Retrieval Augmented Inference with SLMs

Shreyas Patil, Pragati Kumari, Novarun Deb et al.

The continuous evolution of system specifications necessitates frequent evaluation of conflicting requirements, a process that is traditionally labour-intensive. Although large language models (LLMs) have demonstrated significant potential for automating this detection, their massive computational requirements often result in excessive energy waste. Consequently, there is a growing need to transition toward Small Language Models (SLMs) and energy-aware architectures for sustainable Requirements Engineering. This study proposes and empirically evaluates an energy-aware framework that compares Knowledge Graph-based Retrieval (KGR) with Vector-based Semantic Retrieval (VSR) to enhance SLM-based inference at the 7B to 8B parameter scale. By leveraging structured graph traversal and high-dimensional semantic mapping, we extract candidate requirements, which are then classified as conflicting or neutral by an inference engine. We evaluate these retrieval-enhanced strategies across Zero-Shot, Few-Shot, and Chain-of-Thought prompting methods. Using a three-pillar sustainability framework measuring energy consumption (Wh), latency (s), and carbon emissions (gCO2eq) alongside standard accuracy metrics (F1 Score), this research provides a first systematic empirical evaluation and trade-off analysis between predictive performance and environmental impact. Our findings highlight the effectiveness of structured versus semantic retrieval in detecting requirement conflicts, offering a reproducible, sustainability-aware architecture for energy-efficient requirements engineering.

SE | Oct 10, 2025
SEER: Sustainability Enhanced Engineering of Software Requirements

Mandira Roy, Novarun Deb, Nabendu Chaki et al.

The rapid expansion of software development has significant environmental, technical, social, and economic impacts. Achieving the United Nations Sustainable Development Goals by 2030 compels developers to adopt sustainable practices. Existing methods mostly offer high-level guidelines, which are time-consuming to implement and rely on team adaptability. Moreover, they focus on design or implementation, while sustainability assessment should start at the requirements engineering phase. In this paper, we introduce SEER, a framework which addresses sustainability concerns in the early software development phase. The framework operates in three stages: (i) it identifies sustainability requirements (SRs) relevant to a specific software product from a general taxonomy; (ii) it evaluates how sustainable system requirements are based on the identified SRs; and (iii) it optimizes system requirements that fail to satisfy any SR. The framework is implemented using the reasoning capabilities of large language models and the agentic RAG (Retrieval Augmented Generation) approach. SEER has been evaluated on four software projects from different domains. Results generated using the Gemini 2.5 reasoning model demonstrate the effectiveness of the proposed approach in accurately identifying a broad range of sustainability concerns across diverse domains.

SE | May 12, 2019
AFSCR: Annotation of Functional Satisfaction Conditions and their Reconciliation within i* models

Novarun Deb, Nabendu Chaki

Context: Researchers, both in industry and academia, are facing the challenge of leveraging the benefits of goal-oriented requirements engineering (GORE) techniques for business compliance management. This requires analyzing goal models along with their semantics. However, most prominent goal modeling frameworks have no means of capturing the semantics of goals (except what is trivially conveyed by their nomenclature). Objective: In this paper, we propose the Annotation of Functional Satisfaction Conditions and their Reconciliation (AFSCR) framework for capturing these semantics. The entire framework is presented with respect to i* modeling constructs. Method: This is a semi-automated framework that requires analysts to annotate individual goals with their immediate goal satisfaction conditions. The AFSCR framework can then reconcile these satisfaction conditions for every goal and verify whether the derived set of cumulative satisfaction conditions is in harmony with the intended set of goal satisfaction conditions. Result: If the derived and intended sets of satisfaction conditions are in conflict, the framework raises entailment and/or consistency flags. Whenever a conflict is flagged, the framework also provides alternate solutions and possible workaround strategies to the analysts by refactoring the given i* model. Conclusion: In this paper we present a new framework that uses satisfaction conditions for going beyond the nomenclature and capturing the functional semantics of the goals within i* models. The analysis performed during the reconciliation process is generic enough and can be adapted to any goal modeling framework if required.

SE | Jul 24, 2015
Extracting State Transition Models from i* Models

Novarun Deb, Nabendu Chaki, Aditya Ghose

i* models are inherently sequence-agnostic. There is an immediate need to bridge the gap between such a sequence-agnostic model and an industry-implemented process modelling standard like Business Process Modelling Notation (BPMN). This work is an attempt to build State Transition Models from i* models. In this paper, we first spell out the Naive Algorithm formally, along the lines of Formal Tropos. We demonstrate how the growth of the State Transition Model space can be mapped to the problem of finding the number of possible paths between the Least Upper Bound (LUB) and the Greatest Lower Bound (GLB) of a k-dimensional hypercube lattice structure. We formally present the mathematics for a quantitative analysis of the space growth. The main drawback of the Naive Algorithm is the hyperexponential explosion of the State Transition Model space. To address this, the Semantic Implosion Algorithm is proposed, which exploits the temporal information embedded within the i* model of an enterprise to reduce the rate of growth of the State Transition Model space. A comparative quantitative analysis of the two approaches demonstrates the superiority of the Semantic Implosion Algorithm.
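To give a feel for the path-counting problem mentioned in this abstract, the sketch below counts monotone paths between the bottom and top elements of the Boolean lattice {0,1}^k, where each step flips exactly one coordinate from 0 to 1. This is only one possible reading of the hypercube lattice, assumed here for illustration: the paper's own lattice, step rule, and resulting count may differ, and the function name is hypothetical.

```python
from functools import lru_cache
from math import factorial


def count_monotone_paths(k: int) -> int:
    """Count paths from 00...0 to 11...1 in {0,1}^k, setting one bit per step."""
    full = (1 << k) - 1  # bitmask with all k coordinates set

    @lru_cache(maxsize=None)
    def paths_from(state: int) -> int:
        if state == full:
            return 1
        # Try setting each coordinate that is still 0.
        return sum(
            paths_from(state | (1 << i))
            for i in range(k)
            if not state & (1 << i)
        )

    return paths_from(0)


# Under this assumption the count matches k!, so it grows factorially with k.
for k in range(1, 7):
    print(k, count_monotone_paths(k), factorial(k))
```

Even under this simplest assumption the number of paths grows factorially in the number of dimensions, which illustrates why a naive enumeration of the state space becomes impractical quickly.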