Devesh Kumar

CL
4papers
82citations
Novelty31%
AI Score35

4 Papers

DBSep 18, 2020Code
TODS: An Automated Time Series Outlier Detection System

Kwei-Herng Lai, Daochen Zha, Guanchu Wang et al.

We present TODS, an automated Time Series Outlier Detection System for research and industrial applications. TODS is a highly modular system that supports easy pipeline construction. The basic building block of TODS is primitive, which is an implementation of a function with hyperparameters. TODS currently supports 70 primitives, including data processing, time series processing, feature analysis, detection algorithms, and a reinforcement module. Users can freely construct a pipeline using these primitives and perform end- to-end outlier detection with the constructed pipeline. TODS provides a Graphical User Interface (GUI), where users can flexibly design a pipeline with drag-and-drop. Moreover, a data-driven searcher is provided to automatically discover the most suitable pipelines given a dataset. TODS is released under Apache 2.0 license at https://github.com/datamllab/tods.

CLJun 19, 2025
Cyberbullying Detection in Hinglish Text Using MURIL and Explainable AI

Devesh Kumar

The growth of digital communication platforms has led to increased cyberbullying incidents worldwide, creating a need for automated detection systems to protect users. The rise of code-mixed Hindi-English (Hinglish) communication on digital platforms poses challenges for existing cyberbullying detection systems, which were designed primarily for monolingual text. This paper presents a framework for cyberbullying detection in Hinglish text using the Multilingual Representations for Indian Languages (MURIL) architecture to address limitations in current approaches. Evaluation across six benchmark datasets -- Bohra \textit{et al.}, BullyExplain, BullySentemo, Kumar \textit{et al.}, HASOC 2021, and Mendeley Indo-HateSpeech -- shows that the MURIL-based approach outperforms existing multilingual models including RoBERTa and IndicBERT, with improvements of 1.36 to 13.07 percentage points and accuracies of 86.97\% on Bohra, 84.62\% on BullyExplain, 86.03\% on BullySentemo, 75.41\% on Kumar datasets, 83.92\% on HASOC 2021, and 94.63\% on Mendeley dataset. The framework includes explainability features through attribution analysis and cross-linguistic pattern recognition. Ablation studies show that selective layer freezing, appropriate classification head design, and specialized preprocessing for code-mixed content improve detection performance, while failure analysis identifies challenges including context-dependent interpretation, cultural understanding, and cross-linguistic sarcasm detection, providing directions for future research in multilingual cyberbullying detection.

CLJun 19, 2025
A Hybrid DeBERTa and Gated Broad Learning System for Cyberbullying Detection in English Text

Devesh Kumar

The proliferation of online communication platforms has created unprecedented opportunities for global connectivity while simultaneously enabling harmful behaviors such as cyberbullying, which affects approximately 54.4\% of teenagers according to recent research. This paper presents a hybrid architecture that combines the contextual understanding capabilities of transformer-based models with the pattern recognition strengths of broad learning systems for effective cyberbullying detection. This approach integrates a modified DeBERTa model augmented with Squeeze-and-Excitation blocks and sentiment analysis capabilities with a Gated Broad Learning System (GBLS) classifier, creating a synergistic framework that outperforms existing approaches across multiple benchmark datasets. The proposed ModifiedDeBERTa + GBLS model achieved good performance on four English datasets: 79.3\% accuracy on HateXplain, 95.41\% accuracy on SOSNet, 91.37\% accuracy on Mendeley-I, and 94.67\% accuracy on Mendeley-II. Beyond performance gains, the framework incorporates comprehensive explainability mechanisms including token-level attribution analysis, LIME-based local interpretations, and confidence calibration, addressing critical transparency requirements in automated content moderation. Ablation studies confirm the meaningful contribution of each architectural component, while failure case analysis reveals specific challenges in detecting implicit bias and sarcastic content, providing valuable insights for future improvements in cyberbullying detection systems.

CRJan 21, 2020
Investigation of Data Deletion Vulnerabilities in NAND Flash Memory Based Storage

Abhilash Garg, Supriya Chakraborty, Manoj Malik et al.

Semiconductor NAND Flash based memory technology dominates the electronic Non-Volatile storage media market. Though NAND Flash offers superior performance and reliability over conventional magnetic HDDs, yet it suffers from certain data-security vulnerabilities. Such vulnerabilities can expose sensitive information stored on the media to security risks. It is thus necessary to study in detail the fundamental reasons behind data-security vulnerabilities of NAND Flash for use in critical applications. In this paper, the problem of unreliable data-deletion/sanitization in commercial NAND Flash media is investigated along with the fundamental reasons leading to such vulnerabilities. Exhaustive software based data recovery experiments (multiple iterations) has been carried out on commercial NAND Flash storage media (8 GB and 16 GB) for different types of filesystems (NTFS and FAT) and OS specific delete/Erase instructions. 100 % data recovery is obtained for windows and linux based delete/Erase commands. Inverse effect of performance enhancement techniques like wear levelling, bad block management etc. is also observed with the help of software based recovery experiments.