Aniket Abhishek Soni

CL
h-index: 2
4 papers
4 citations
Novelty: 56%
AI Score: 41

4 Papers

SD · Mar 26, 2025 · Code
Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit

Aniket Abhishek Soni

Although speech recognition algorithms have developed quickly in recent years, achieving high transcription accuracy across diverse audio formats and acoustic environments remains a major challenge. This work explores how incorporating custom language models with the open-source Vosk Toolkit can improve speech-to-text accuracy in varied settings. Unlike many conventional systems limited to specific audio types, this approach supports multiple audio formats such as WAV, MP3, FLAC, and OGG by using Python modules for preprocessing and format conversion. A Python-based transcription pipeline was developed to process input audio, perform speech recognition using Vosk's KaldiRecognizer, and export the output to a DOCX file. Results showed that custom models reduced word error rates, especially in domain-specific scenarios involving technical terminology, varied accents, or background noise. This work presents a cost-effective, offline solution for high-accuracy transcription and opens up future opportunities for automation and real-time applications.
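The abstract reports results as word error rate (WER). As a point of reference, here is a minimal sketch of the standard WER calculation: word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions for illustration, not the paper's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Comparing this score for a baseline transcript and a custom-model transcript of the same audio is one way to quantify the kind of improvement the abstract describes.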

SE · Jan 15
Reinforcement Learning for Dynamic Workflow Optimization in CI/CD Pipelines

Aniket Abhishek Soni, Milan Parikh, Rashi Nimesh Kumar Dhenia et al.

Continuous Integration and Continuous Deployment (CI/CD) pipelines are central to modern software delivery, yet their static workflows often introduce inefficiencies as systems scale. This paper proposes a reinforcement learning (RL)-based approach to dynamically optimize CI/CD pipeline workflows. The pipeline is modeled as a Markov Decision Process, and an RL agent is trained to make runtime decisions such as selecting full, partial, or no test execution in order to maximize throughput while minimizing testing overhead. A configurable CI/CD simulation environment is developed to evaluate the approach across build, test, and deploy stages. Experimental results show that the RL-optimized pipeline achieves up to a 30 percent improvement in throughput and approximately a 25 percent reduction in test execution time compared to static baselines, while maintaining a defect miss rate below 5 percent. The agent learns to selectively skip or abbreviate tests for low-risk commits, accelerating feedback cycles without significantly increasing failure risk. These results demonstrate the potential of reinforcement learning to enable adaptive and intelligent DevOps workflows, providing a practical pathway toward more efficient, resilient, and sustainable CI/CD automation.
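To make the decision problem concrete, the following is a toy sketch of tabular Q-learning over the three test actions named in the abstract. It is a one-step, bandit-style simplification rather than the paper's full Markov Decision Process or simulation environment. The risk levels, costs, miss rates, defect probabilities, and penalty are all hypothetical numbers chosen for illustration.

```python
import random

ACTIONS = ("none", "partial", "full")

# Hypothetical parameters (not taken from the paper).
TEST_COST = {"none": 0.0, "partial": 4.0, "full": 10.0}      # minutes of testing
MISS_RATE = {"none": 1.0, "partial": 0.3, "full": 0.0}       # share of defects missed
DEFECT_PROB = {"low": 0.02, "medium": 0.15, "high": 0.60}    # per commit risk level
DEFECT_PENALTY = 60.0                                        # cost of an escaped defect


def reward(risk, action):
    """Negative expected cost: testing time plus expected escaped-defect penalty."""
    escaped = DEFECT_PROB[risk] * MISS_RATE[action]
    return -(TEST_COST[action] + DEFECT_PENALTY * escaped)


def train(episodes=3000, alpha=0.5, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning; each episode is a single terminal decision."""
    rng = random.Random(seed)
    q = {(risk, a): 0.0 for risk in DEFECT_PROB for a in ACTIONS}
    for _ in range(episodes):
        risk = rng.choice(list(DEFECT_PROB))
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                         # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(risk, a)])    # exploit
        # The next state is terminal, so the update target is just the reward.
        q[(risk, action)] += alpha * (reward(risk, action) - q[(risk, action)])
    return q


def greedy_policy(q):
    """Best-valued action for each risk level."""
    return {risk: max(ACTIONS, key=lambda a: q[(risk, a)]) for risk in DEFECT_PROB}


policy = greedy_policy(train())
```

With these made-up numbers, the reward table favors skipping tests for low-risk commits, partial testing for medium-risk commits, and full testing for high-risk commits, which mirrors the selective behavior the abstract describes. A multi-stage version would add state transitions across build, test, and deploy and a discounted next-state value in the update.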

CL · Jun 5, 2025
Dynamic Context Tuning for Retrieval-Augmented Generation: Enhancing Multi-Turn Planning and Tool Adaptation

Jubin Abhishek Soni, Amit Anand, Rajesh Kumar Pandey et al.

Retrieval-Augmented Generation (RAG) has significantly advanced large language models (LLMs) by grounding their outputs in external tools and knowledge sources. However, existing RAG systems are typically constrained to static, single-turn interactions with fixed toolsets, making them ill-suited for dynamic domains such as healthcare and smart homes, where user intent, available tools, and contextual factors evolve over time. We present Dynamic Context Tuning (DCT), a lightweight framework that extends RAG to support multi-turn dialogue and evolving tool environments without requiring retraining. DCT integrates an attention-based context cache to track relevant past information, LoRA-based retrieval to dynamically select domain-specific tools, and efficient context compression to maintain inputs within LLM context limits. Experiments on both synthetic and real-world benchmarks show that DCT improves plan accuracy by 14% and reduces hallucinations by 37%, while matching GPT-4 performance at significantly lower cost. Furthermore, DCT generalizes to previously unseen tools, enabling scalable and adaptable AI assistants across a wide range of dynamic environments.
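The abstract mentions keeping inputs within LLM context limits while tracking relevant past turns. The sketch below illustrates only that general idea. It swaps in plain word overlap for DCT's attention-based cache and a simple word count for a real tokenizer. The function `select_context` and its parameters are hypothetical and do not come from the paper.

```python
def tokens(text):
    """Crude tokenizer (assumption): lowercase whitespace split."""
    return text.lower().split()


def select_context(history, query, budget):
    """Keep the past turns most related to the query within a word budget.

    Turns are ranked by word overlap with the query (more recent turns win
    ties), added greedily while they fit, and returned in original order.
    """
    query_words = set(tokens(query))
    scored = []
    for index, turn in enumerate(history):
        overlap = len(query_words & set(tokens(turn)))
        scored.append((overlap, index))
    scored.sort(key=lambda pair: (-pair[0], -pair[1]))

    kept, used = [], 0
    for overlap, index in scored:
        cost = len(tokens(history[index]))
        # Skip turns that share no words with the query or exceed the budget.
        if overlap > 0 and used + cost <= budget:
            kept.append(index)
            used += cost
    return [history[i] for i in sorted(kept)]


history = [
    "turn on the living room lights",
    "what is the weather today",
    "set the thermostat to 70 degrees",
    "dim the lights to 30 percent",
]
relevant = select_context(history, "make lights brighter", budget=12)
```

Here `relevant` keeps the two lighting-related turns and drops the weather and thermostat turns. A learned relevance score would replace the overlap count in a real system.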

CR · Nov 26, 2024
Combining Threat Intelligence with IoT Scanning to Predict Cyber Attack

Jubin Abhishek Soni, Amit Anand, Rajesh Kumar Pandey et al.

While the Web has become a global platform for communication, malicious actors, including hackers and hacktivist groups, often disseminate ideological content and coordinate activities through the "Dark Web", an obscure counterpart of the conventional web. Presently, challenges such as information overload and the fragmented nature of cyber threat data impede comprehensive profiling of these actors, thereby limiting the efficacy of predictive analyses of their online activities. Concurrently, the number of internet-connected devices has surpassed the global human population, with this gap projected to widen as the Internet of Things (IoT) expands. Technical communities are actively advancing IoT-related research to address its growing societal integration. This paper proposes a novel predictive threat intelligence framework designed to systematically collect, analyze, and visualize Dark Web data to identify malicious websites and correlate this information with potential IoT vulnerabilities. The methodology integrates automated data harvesting, analytical techniques, and visual mapping tools, while also examining vulnerabilities in IoT devices to assess exploitability. By bridging gaps in cybersecurity research, this study aims to enhance predictive threat modeling and inform policy development, thereby contributing to intelligence research initiatives focused on mitigating cyber risks in an increasingly interconnected digital ecosystem.