Akhil Kumar

AI
h-index: 1
7 papers
66 citations
Novelty: 46%
AI Score: 46

7 Papers

AI · May 27
Mind Your Tone: Does Tone Alter LLM Performance?

Om Dobariya, Akhil Kumar

The use of Large Language Models (LLMs) is proliferating, yet their performance is observed to vary based on prompting styles and tones. In this study, we investigate both whether and how tonal variations in prompts lead to disparate LLM accuracy for objective multiple-choice questions. We use two datasets: one of 50 base questions with five tone variants, and an MMLU subset of 570 base questions spanning 57 subjects with seven tone variants. Experiments were conducted to evaluate the performance of four cost-efficient, popular LLMs: ChatGPT-4o, ChatGPT-5-nano, Gemini 2.5 Flash, and Gemini 2.5 Flash Lite. Across models, tonal effects are systematic but highly model-dependent. Some models show small, yet statistically significant, shifts, while others exhibit large accuracy swings across tones. Further, we identify subject-level differences in tone sensitivity and present a routing framework to explain how tones may attune internal reasoning modes. Our findings caution users against assuming tone-robust reliability in LLM deployments.

AI · Oct 8, 2025
Evaluation of LLMs for Process Model Analysis and Optimization

Akhil Kumar, Jianliang Leon Zhao, Om Dobariya

In this paper, we report our experience with several LLMs for their ability to understand a process model in an interactive, conversational style, find syntactical and logical errors in it, and reason with it in depth through a natural language (NL) interface. Our findings show that a vanilla, untrained LLM like ChatGPT (model o3) in a zero-shot setting is effective in understanding BPMN process models from images and answering queries about them intelligently at syntactic, logical, and semantic levels of depth. Further, different LLMs vary in their accuracy and effectiveness. Nevertheless, our empirical analysis shows that LLMs can play a valuable role as assistants for business process designers and users. We also study the LLM's "thought process" and ability to perform deeper reasoning in the context of process analysis and optimization. We find that the LLMs seem to exhibit anthropomorphic properties.

CL · Oct 6, 2025
Mind Your Tone: Investigating How Prompt Politeness Affects LLM Accuracy (short paper)

Om Dobariya, Akhil Kumar

The wording of natural language prompts has been shown to influence the performance of large language models (LLMs), yet the role of politeness and tone remains underexplored. In this study, we investigate how varying levels of prompt politeness affect model accuracy on multiple-choice questions. We created a dataset of 50 base questions spanning mathematics, science, and history, each rewritten into five tone variants: Very Polite, Polite, Neutral, Rude, and Very Rude, yielding 250 unique prompts. Using ChatGPT 4o, we evaluated responses across these conditions and applied paired sample t-tests to assess statistical significance. Contrary to expectations, impolite prompts consistently outperformed polite ones, with accuracy ranging from 80.8% for Very Polite prompts to 84.8% for Very Rude prompts. These findings differ from earlier studies that associated rudeness with poorer outcomes, suggesting that newer LLMs may respond differently to tonal variation. Our results highlight the importance of studying pragmatic aspects of prompting and raise broader questions about the social dimensions of human-AI interaction.
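The design described in this abstract (50 base questions crossed with five tone variants, then a paired-sample t-test between tones) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `build_prompts` and `paired_t_statistic` are hypothetical, the per-run correct counts are invented for the example, and pairing scores by repeated run is an assumption that the abstract does not spell out.

```python
import math
import statistics

# Tone labels as listed in the abstract.
TONES = ["Very Polite", "Polite", "Neutral", "Rude", "Very Rude"]


def build_prompts(base_questions, tones=TONES):
    """Cross every base question with every tone variant."""
    return [(question, tone) for question in base_questions for tone in tones]


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for two matched lists of scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# 50 base questions x 5 tones = 250 unique prompts.
prompts = build_prompts([f"Q{i}" for i in range(1, 51)])
print(len(prompts))  # 250

# Hypothetical correct-answer counts (out of 50) over five matched runs.
# These numbers are made up for illustration; they are NOT the paper's data.
very_rude = [43, 42, 44, 41, 42]
very_polite = [40, 41, 41, 40, 40]

t_stat, dof = paired_t_statistic(very_rude, very_polite)
print(round(t_stat, 3), dof)  # 4.472 4
```

The t statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value; a library routine such as SciPy's `scipy.stats.ttest_rel` does both steps in one call.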

AI · Dec 24, 2020
Towards a Formal Framework for Partial Compliance of Business Processes

Ho-Pun Lam, Mustafa Hashmi, Akhil Kumar

Binary "YES-NO" notions of process compliance are not very helpful to managers for assessing the operational performance of their company because a large number of cases fall in the grey area of partial compliance. Hence, it is necessary to have ways to quantify partial compliance in terms of metrics and be able to classify actual cases by assigning a numeric value of compliance to them. In this paper, we formulate an evaluation framework to quantify the level of compliance of business processes across different levels of abstraction (such as task, trace, and process level) and across multiple dimensions of each task (such as temporal, monetary, role-, data-, and quality-related) to provide managers with more useful information about their operations and to help them improve their decision-making processes. Our approach can also add social value by making social services provided by local, state, and federal governments more flexible and improving the lives of citizens.

CL · May 16, 2019
TraceWalk: Semantic-based Process Graph Embedding for Consistency Checking

Chen Qian, Lijie Wen, Akhil Kumar

Process consistency checking (PCC), a task at the intersection of natural language processing (NLP) and business process management (BPM), aims to quantify the degree of (in)consistency between graphical and textual descriptions of a process. However, previous studies depend heavily on a great deal of complex expert-defined knowledge, such as alignment rules and assessment metrics, and thus suffer from low accuracy and poor adaptability when applied in open-domain scenarios. To address these issues, this paper makes the first attempt to use deep learning for PCC. Specifically, we propose TraceWalk, which uses semantic information of process graphs to learn latent node representations, and integrate it into a convolutional neural network (CNN) based model called TraceNet to predict consistencies. The theoretical proof formally provides the PCC's lower limit, and experimental results demonstrate that our approach performs more accurately than state-of-the-art baselines.

CL · May 16, 2019
An Approach for Process Model Extraction By Multi-Grained Text Classification

Chen Qian, Lijie Wen, Akhil Kumar et al.

Process model extraction (PME) is a recently emerged task at the intersection of natural language processing (NLP) and business process management (BPM) that aims to extract process models from textual descriptions. Previous process extractors depend heavily on manual features and ignore the potential relations between clues of different text granularities. In this paper, we formalize the PME task as a multi-grained text classification problem and propose a hierarchical neural network to effectively model and extract multi-grained information without manually defined procedural features. Under this structure, we propose a coarse-to-fine (grained) learning mechanism, which trains multi-grained tasks in coarse-to-fine order so that high-level knowledge is shared with the low-level tasks. To evaluate our approach, we construct two multi-grained datasets from two different domains and conduct extensive experiments from different dimensions. The experimental results demonstrate that our approach outperforms the state-of-the-art methods with statistical significance, and further investigations demonstrate its effectiveness.

SE · Apr 22, 2019
BePT: A Behavior-based Process Translator for Interpreting and Understanding Process Models

Chen Qian, Lijie Wen, Akhil Kumar

Sharing process models on the web has emerged as a common practice. Users can collect and share their experimental process models with others. However, users are often confused by shared process models that lack the necessary guidelines or instructions. Therefore, several process translators have been proposed to explain the semantics of process models in natural language (NL). We find that previous studies suffer from information loss and generate semantically erroneous descriptions that diverge from the original model behaviors. In this paper, we propose a novel process translator named BePT (Behavior-based Process Translator) based on the encoder-decoder paradigm, which encodes a process model into an intermediate representation and decodes that representation into NL descriptions. Our theoretical analysis demonstrates that BePT satisfies behavior correctness, behavior completeness, and description minimality. The qualitative and quantitative experiments show that BePT outperforms the state-of-the-art baselines.