CL Jun 10, 2025
Brevity is the soul of sustainability: Characterizing LLM response lengths
Soham Poddar, Paramita Koley, Janardan Misra et al.
A significant portion of the energy consumed by Large Language Models (LLMs) arises from their inference processes; hence, developing energy-efficient methods for inference is crucial. While several techniques exist for inference optimization, output compression remains relatively unexplored, with only a few preliminary efforts addressing this aspect. In this work, we first benchmark 12 decoder-only LLMs across 5 datasets, revealing that these models often produce responses that are substantially longer than necessary. We then conduct a comprehensive quality assessment of LLM responses, formally defining six information categories present in them. We show that LLMs often include redundant or additional information besides the minimal answer. To address this issue of long responses, we explore several simple and intuitive prompt-engineering strategies. Empirical evaluation shows that appropriate prompts targeting length reduction and controlling information content can achieve energy savings of 25-60% by reducing the response length while preserving the quality of LLM responses.
CL Feb 8, 2025
Towards Sustainable NLP: Insights from Benchmarking Inference Energy in Large Language Models
Soham Poddar, Paramita Koley, Janardan Misra et al.
Large language models (LLMs) are increasingly recognized for their exceptional generative capabilities and versatility across various tasks. However, the high inference costs associated with these models have not received adequate attention, particularly when compared to the focus on training costs in existing research. In response to this gap, our study conducts a comprehensive benchmarking of LLM inference energy across a wide range of NLP tasks, where we analyze the impact of different models, tasks, prompts, and system-related factors on inference energy. Specifically, our experiments reveal several interesting insights, including a strong correlation of inference energy with output token length and response time. We also find that quantization and optimal batch sizes, along with targeted prompt phrases, can significantly reduce energy usage. This study is the first to thoroughly benchmark LLM inference across such a diverse range of aspects, providing insights and offering several recommendations for improving energy efficiency in model deployment.
LG Feb 9
Benchmarking the Energy Savings with Speculative Decoding Strategies
Rohit Dutta, Paramita Koley, Soham Poddar et al.
Speculative decoding has emerged as an effective method to reduce the latency and cost of LLM inference. However, there has been inadequate attention towards the energy requirements of these models. To address this gap, this paper presents a comprehensive survey of the energy requirements of speculative decoding strategies, with detailed analysis of how various factors (model size and family, speculative decoding strategies, and dataset characteristics) influence the energy optimizations.
SE Apr 19, 2021
When to Build Quantum Software?
Janardan Misra, Vikrant Kaulgud, Rupesh Kaslay et al.
Despite ongoing advancements in quantum computing, businesses still face the problem of deciding whether they would benefit from investing in this novel technology for building a business-critical application. This uncertainty is owing not only to the limitations in the current state of the technology but also to the gap between the level at which business applications are analyzed (e.g., using high-level semi-formal languages) and the level at which quantum computing related information is currently available (e.g., formally specified computational problems, their algorithmic solutions with computational complexity theoretic analysis) to make informed decisions. To fill this discourse gap, in this paper, we present the design of an interactive advisor, which assists users in deciding whether to invest in quantum software development as a plausible future option in their application context. To that end, we apply business process modeling and natural language similarity analysis using text embeddings to associate business context with computational problems, and formulate constraints in terms of quantum speedup and resource requirements to select software development platforms.
CL Feb 8, 2020
autoNLP: NLP Feature Recommendations for Text Analytics Applications
Janardan Misra
When designing machine learning based text analytics applications, NLP data scientists often manually determine which NLP features to use based upon their knowledge and experience with related problems. This results in increased effort during the feature engineering process and renders automated reuse of features across semantically related applications inherently difficult. In this paper, we argue for standardization in feature specification by outlining the structure of a language for specifying NLP features, and present an approach for their reuse across applications to increase the likelihood of identifying optimal features.
NE Jun 24, 2018
Computational Complexity of Observing Evolution in Artificial-Life Forms
Janardan Misra
Observations are an essential component of the simulation based studies on artificial-evolutionary systems (AES) by which entities are identified and their behavior is observed to uncover higher-level "emergent" phenomena. Because of the heterogeneity of AES models and implicit nature of observations, precise characterization of the observation process, independent of the underlying micro-level reaction semantics of the model, is a difficult problem. Building upon the multiset based algebraic framework to characterize state-space trajectory of AES model simulations, we estimate bounds on computational resource requirements of the process of automatically discovering life-like evolutionary behavior in AES models during simulations. For illustration, we consider the case of Langton's Cellular Automata model and characterize the worst case computational complexity bounds for identifying entity and population level reproduction.
CR Jun 23, 2018
A Recursive PLS (Partial Least Squares) based Approach for Enterprise Threat Management
Janardan Misra
Most of the existing solutions to enterprise threat management are preventive approaches prescribing means to prevent policy violations with varying degrees of success. In this paper we consider the complementary scenario where a number of security violations have already occurred, or security threats or vulnerabilities have been reported, and a security administrator needs to generate an optimal response to these security events. We present a principled approach to study and model the human expertise in responding to the emergent threats owing to these security events. A recursive Partial Least Squares based adaptive learning model is defined using a factorial analysis of the security events together with a method for estimating the effect of global context dependent semantic information used by the security administrators. The presented model is theoretically optimal and operationally recursive in nature to deal with the set of security events being generated continuously. We discuss the underlying challenges and ways in which the model could be operationalized in centralized versus decentralized, and real-time versus batch processing modes.
AI Jun 23, 2018
An Inductive Formalization of Self Reproduction in Dynamical Hierarchies
Janardan Misra
Formalizing self reproduction in dynamical hierarchies is one of the important problems in Artificial Life (AL) studies. We study, in this paper, an inductively defined algebraic framework for self reproduction on macroscopic organizational levels under dynamical system setting for simulated AL models and explore some existential results. Starting with defining self reproduction for atomic entities we define self reproduction with possible mutations on higher organizational levels in terms of hierarchical sets and the corresponding inductively defined 'meta'-reactions. We introduce constraints to distinguish a collection of entities from genuine cases of emergent organizational structures.
SE Jun 21, 2018
Data-Driven Application Maintenance: Views from the Trenches
Janardan Misra, Shubhashis Sengupta, Divya Rawat et al.
In this paper we present our experience during the design, development, and pilot deployments of a data-driven machine learning based application maintenance solution. We implemented a proof of concept to address a spectrum of interrelated problems encountered in application maintenance projects, including duplicate incident ticket identification, assignee recommendation, theme mining, and mapping of incidents to business processes. In the context of IT services, these problems are frequently encountered, yet there is a gap in bringing automation and optimization. Despite long-standing research around mining and analysis of software repositories, such research outputs are not adopted well in practice due to the constraints these solutions impose on the users. We discuss the need for designing pragmatic solutions with low barriers to adoption and addressing the right level of complexity of problems with respect to the underlying business constraints and the nature of the data.
ML Jun 16, 2016
Designing Intelligent Automation based Solutions for Complex Social Problems
Sanjay Podder, Janardan Misra, Senthil Kumaresan et al.
Deciding on effective and timely preventive measures against complex social problems affecting relatively low income geographies is a difficult challenge. There is a strong need to adopt intelligent automation based solutions with low cost imprints to tackle these problems at larger scales. Starting with the hypothesis that analytical modelling and analysis of social phenomena with high accuracy is in general inherently hard, in this paper we propose a design framework for a data-driven, machine learning based adaptive solution approach that enables more effective preventive measures. We use survey data collected from a socio-economically backward region of India about adolescent girls to illustrate the design approach.
SE Aug 31, 2012
Java Source-code Clustering: Unifying Syntactic and Semantic Features
Janardan Misra, Vikrant Kaulgud, Gary Titus et al.
This is a companion draft to the paper 'Software Clustering: Unifying Syntactic and Semantic Features', in the proceedings of the 19th Working Conference on Reverse Engineering (WCRE 2012). It discusses the clustering process in detail, which appeared in the paper in an abridged form. It also contains certain additional process steps which were not covered in the WCRE paper. The clustering process is described for applications with Java source-code. However, as argued in the WCRE paper, it can be seamlessly adapted to many other programming paradigms.