Sanjay Podder

h-index: 10

9 papers

234 citations

Novelty: 34%

AI Score: 31

Ranked #130,132 of 194,257 authors (top 67%) · #1,447 in SE (top 48%)

9 Papers

3.4 · SE · Jun 10, 2025

Do Generative AI Tools Ensure Green Code? An Investigative Study

Samarth Sikand, Rohit Mehra, Vibhu Saujanya Sharma et al.

Software sustainability is emerging as a primary concern, aiming to optimize resource utilization, minimize environmental impact, and promote a greener, more resilient digital ecosystem. The sustainability or "greenness" of software is typically determined by the adoption of sustainable coding practices. With a maturing ecosystem around generative AI, many software developers now rely on these tools to generate code using natural language prompts. Despite their potential advantages, there is a significant lack of studies on the sustainability aspects of AI-generated code. Specifically, how environmentally friendly is the AI-generated code based upon its adoption of sustainable coding practices? In this paper, we present the results of an early investigation into the sustainability aspects of AI-generated code across three popular generative AI tools: ChatGPT, BARD, and Copilot. The results highlight the default non-green behavior of tools for generating code, across multiple rules and scenarios. This underscores the need for further in-depth investigations and effective remediation strategies.

15.5 · CL · Jun 10, 2025 · Code

Brevity is the soul of sustainability: Characterizing LLM response lengths

Soham Poddar, Paramita Koley, Janardan Misra et al.

A significant portion of the energy consumed by Large Language Models (LLMs) arises from their inference processes; hence developing energy-efficient methods for inference is crucial. While several techniques exist for inference optimization, output compression remains relatively unexplored, with only a few preliminary efforts addressing this aspect. In this work, we first benchmark 12 decoder-only LLMs across 5 datasets, revealing that these models often produce responses that are substantially longer than necessary. We then conduct a comprehensive quality assessment of LLM responses, formally defining six information categories present in LLM responses. We show that LLMs often tend to include redundant or additional information besides the minimal answer. To address this issue of long responses by LLMs, we explore several simple and intuitive prompt-engineering strategies. Empirical evaluation shows that appropriate prompts targeting length reduction and controlling information content can achieve significant energy optimization of 25-60% by reducing the response length while preserving the quality of LLM responses.

7.1 · LG · Jun 10, 2025

Breaking the ICE: Exploring promises and challenges of benchmarks for Inference Carbon & Energy estimation for LLMs

Samarth Sikand, Rohit Mehra, Priyavanshi Pathania et al.

While Generative AI stands to be one of the fastest adopted technologies ever, studies have made evident that the usage of Large Language Models (LLMs) puts a significant burden on energy grids and our environment. It may prove a hindrance to the Sustainability goals of any organization. A crucial step in any Sustainability strategy is monitoring or estimating the energy consumption of various components. While there exist multiple tools for monitoring energy consumption, there is a dearth of tools/frameworks for estimating the consumption or carbon emissions. Current drawbacks of both monitoring and estimation tools include a high number of required input data points, intrusive nature, high error margin, etc. We posit that leveraging emerging LLM benchmarks and related data points can help overcome the aforementioned challenges while balancing the accuracy of the emission estimations. To that end, we discuss the challenges of current approaches and present our evolving framework, R-ICE, which estimates prompt-level inference carbon emissions by leveraging existing state-of-the-art (SOTA) benchmarks. This direction provides a more practical and non-intrusive way to enable emerging use-cases like dynamic LLM routing, carbon accounting, etc. Our promising validation results suggest that benchmark-based modelling holds great potential for inference emission estimation and warrants further exploration from the scientific community.
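
As a rough illustration of the benchmark-based estimation idea in the abstract above, the sketch below derives a per-prompt emission figure from an energy-per-token value and a grid carbon intensity. It is not the R-ICE framework: the abstract does not describe its internals, so the linear per-token energy model, the names `BenchmarkEntry` and `estimate_prompt_emissions_g`, and every numeric value are assumptions made only for illustration.

```python
from dataclasses import dataclass

JOULES_PER_KWH = 3.6e6  # exact unit conversion: 1 kWh = 3.6e6 J


@dataclass(frozen=True)
class BenchmarkEntry:
    """Hypothetical benchmark record for one model/hardware pair."""
    joules_per_output_token: float  # assumed to be read from a public benchmark


def estimate_prompt_emissions_g(entry: BenchmarkEntry,
                                output_tokens: int,
                                grid_gco2_per_kwh: float) -> float:
    """Estimate grams of CO2e for one prompt.

    Assumes energy grows linearly with the number of generated tokens,
    which is a simplification and not a claim about the paper's model.
    """
    if output_tokens < 0 or grid_gco2_per_kwh < 0:
        raise ValueError("inputs must be non-negative")
    energy_joules = entry.joules_per_output_token * output_tokens
    energy_kwh = energy_joules / JOULES_PER_KWH
    return energy_kwh * grid_gco2_per_kwh


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements.
    entry = BenchmarkEntry(joules_per_output_token=3.6)
    grams = estimate_prompt_emissions_g(entry, output_tokens=1000,
                                        grid_gco2_per_kwh=400.0)
    print(f"estimated emissions: {grams:.3f} gCO2e")
```

An estimator of this shape needs only a token count and a lookup value, which is the kind of non-intrusive input the abstract argues for; a routing layer could compare the estimate across candidate models before dispatching a prompt.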

3.6 · SE · Apr 19, 2021

When to Build Quantum Software?

Janardan Misra, Vikrant Kaulgud, Rupesh Kaslay et al.

Despite ongoing advancements in quantum computing, businesses still face the problem of deciding whether they would benefit from investing in this novel technology for building a business-critical application. This uncertainty is owing not only to the limitations in the current state of the technology but also to the gap between the level at which business applications are analyzed (e.g., using high-level semi-formal languages) and the level at which quantum computing related information is currently available (e.g., formally specified computational problems, their algorithmic solutions with computational complexity theoretic analysis) to make informed decisions. To fill this discourse gap, in this paper, we present the design of an interactive advisor, which supports users in deciding whether to invest in quantum software development as a plausible future option in their application context. To that end, we apply business process modeling and natural language similarity analysis using text-embeddings to associate business context with computational problems, and formulate constraints in terms of quantum speedup and resource requirements to select software development platforms.

4.8 · LG · Jul 13, 2019 · Code

Metamorphic Testing of a Deep Learning based Forecaster

Anurag Dwarakanath, Manish Ahuja, Sanjay Podder et al.

In this paper, we present the Metamorphic Testing of an in-use deep learning based forecasting application. The application looks at the past data of system characteristics (e.g. 'memory allocation') to predict outages in the future. We focus on two statistical / machine learning based components: a) detection of correlation between system characteristics and b) estimating the future value of a system characteristic using an LSTM (a deep learning architecture). In total, 19 Metamorphic Relations have been developed and we provide proofs & algorithms where applicable. We evaluated our method in two settings. In the first, we executed the relations on the actual application and uncovered 8 issues not known before. In the second, we generated hypothetical bugs, through Mutation Testing, on a reference implementation of the LSTM based forecaster and found that 65.9% of the bugs were caught through the relations.
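
To make the idea of a metamorphic relation concrete, the sketch below checks one relation against a toy forecaster and against a copy with a seeded bug. Everything here is an assumption for illustration: the moving-average forecaster stands in for the LSTM, and the shift relation is a generic example, not one of the 19 relations from the paper.

```python
import math
from typing import Callable, Sequence

Forecaster = Callable[[Sequence[float]], float]


def moving_average_forecast(history: Sequence[float]) -> float:
    """Stand-in forecaster (not an LSTM): mean of the last three values."""
    if len(history) < 3:
        raise ValueError("need at least three observations")
    return sum(history[-3:]) / 3


def buggy_forecast(history: Sequence[float]) -> float:
    """Seeded bug for demonstration: divides by 4 instead of 3."""
    return sum(history[-3:]) / 4


def holds_shift_relation(forecast: Forecaster,
                         history: Sequence[float],
                         offset: float,
                         tol: float = 1e-9) -> bool:
    """Relation: adding a constant to every input adds it to the forecast.

    No expected forecast value is needed; only the relationship between
    two runs is checked, which is the core of metamorphic testing.
    """
    base = forecast(history)
    shifted = forecast([x + offset for x in history])
    return math.isclose(shifted, base + offset, rel_tol=tol, abs_tol=tol)


if __name__ == "__main__":
    data = [10.0, 12.0, 14.0, 16.0]
    print("correct forecaster:", holds_shift_relation(moving_average_forecast, data, 5.0))
    print("buggy forecaster:  ", holds_shift_relation(buggy_forecast, data, 5.0))
```

The seeded bug plays the role that mutation testing plays in the abstract: a relation is useful to the extent that faulty variants of the implementation violate it.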

2.7 · SE · Sep 25, 2018

Machines that test Software like Humans

Anurag Dwarakanath, Neville Dubash, Sanjay Podder

Automated software testing involves the execution of test scripts by a machine rather than by hand. This significantly reduces the amount of manual time & effort needed and thus is of great interest to the software testing industry. Various tools have been developed to automate the testing of web applications (e.g. Selenium WebDriver); however, the practical adoption of test automation is still minuscule. This is due to the complexity of creating and maintaining automation scripts. The key problem with the existing methods is that the automation test scripts require certain implementation specifics of the Application Under Test (AUT) (e.g. the HTML code of a web element, or an image of a web element). On the other hand, if we look at the way manual testing is done, the tester interprets the textual test scripts and interacts with the AUT purely based on what they perceive visually through the GUI. In this paper, we present an approach to build a machine that can mimic human behavior for software testing using recent advances in Computer Vision. We also present four use-cases of how this approach can significantly advance the test automation space, making test automation simple enough to be adopted practically.

2.7 · SE · Sep 21, 2018

Accelerating Test Automation through a Domain Specific Language

Anurag Dwarakanath, Dipin Era, Aditya Priyadarshi et al.

Test automation involves the automatic execution of test scripts rather than running them by hand. This significantly reduces the amount of manual effort needed and thus is of great interest to the software testing industry. There are two key problems in the existing tools and methods for test automation: a) creating an automation test script is essentially a code development task, which most testers are not trained in; and b) the automation test script is seldom readable, making maintenance an effort-intensive process. We present the Accelerating Test Automation Platform (ATAP), which is aimed at making test automation accessible to non-programmers. ATAP allows the creation of an automation test script through a domain-specific language based on English. The English-like test scripts are automatically converted to machine-executable code using Selenium WebDriver. ATAP's English-like test scripts are easy for non-programmers to author. The functional flow of an ATAP script is also easy to understand, which makes maintenance simpler (you can understand the flow of the test script when you revisit it many months later). ATAP has been built around the Eclipse ecosystem and has been used in a real-life testing project. We present the details of the implementation of ATAP and the results from its usage in practice.
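
The sketch below shows, in miniature, how English-like steps can be turned into structured commands that a driver layer could later execute. It is not ATAP: the abstract does not give ATAP's grammar, so the three step forms (`open`, `type ... into ...`, `click`), the `Step` type, and the sample URL are all hypothetical, and no Selenium calls are made.

```python
import re
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    action: str
    args: Tuple[str, ...]


# Hypothetical grammar for illustration; not ATAP's actual syntax.
_PATTERNS = [
    ("open", re.compile(r'^open "([^"]+)"$', re.IGNORECASE)),
    ("type", re.compile(r'^type "([^"]*)" into "([^"]+)"$', re.IGNORECASE)),
    ("click", re.compile(r'^click "([^"]+)"$', re.IGNORECASE)),
]


def parse_script(text: str) -> List[Step]:
    """Translate each non-empty line into a Step, or fail with its line number."""
    steps: List[Step] = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are allowed for readability
        for action, pattern in _PATTERNS:
            match = pattern.match(line)
            if match:
                steps.append(Step(action, match.groups()))
                break
        else:
            # No pattern matched: report the offending line to the author.
            raise ValueError(f"line {number}: unrecognized step: {line!r}")
    return steps


if __name__ == "__main__":
    script = '''
    Open "https://example.com/login"
    Type "alice" into "Username"
    Click "Sign in"
    '''
    for step in parse_script(script):
        print(step.action, step.args)
```

Keeping the parser separate from execution means the readable script never mentions locators or HTML details; a backend would map each `Step` to the corresponding browser action.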

32.5 · SE · Aug 16, 2018

Identifying Implementation Bugs in Machine Learning based Image Classifiers using Metamorphic Testing

Anurag Dwarakanath, Manish Ahuja, Samarth Sikand et al.

We have recently witnessed tremendous success of Machine Learning (ML) in practical applications. Computer vision, speech recognition and language translation have all seen near-human-level performance. We expect that, in the near future, most business applications will have some form of ML. However, testing such applications is extremely challenging and would be very expensive if we follow today's methodologies. In this work, we present an articulation of the challenges in testing ML-based applications. We then present our solution approach, based on the concept of Metamorphic Testing, which aims to identify implementation bugs in ML-based image classifiers. We have developed metamorphic relations for an application based on a Support Vector Machine and for a Deep Learning based application. Empirical validation showed that our approach was able to catch 71% of the implementation bugs in the ML applications.

2.3 · CL · Nov 15, 2016

A Neural Architecture Mimicking Humans End-to-End for Natural Language Inference

Biswajit Paria, K. M. Annervaz, Ambedkar Dukkipati et al.

In this work we use recent advances in representation learning to propose a neural architecture for the problem of natural language inference. Our approach is designed to mimic how a human performs natural language inference given two statements. The model uses variants of Long Short Term Memory (LSTM), attention mechanisms and composable neural networks to carry out the task. Each part of our model can be mapped to a clear function that humans perform when carrying out the overall task of natural language inference. The model is end-to-end differentiable, enabling training by stochastic gradient descent. On the Stanford Natural Language Inference (SNLI) dataset, the proposed model achieves higher accuracy than all published models in the literature.