Rohit Dube

CR
5papers
2citations
Novelty37%
AI Score39

5 Papers

16.9CRApr 9
Why Network Segmentation Projects Fail

Rohit Dube

Network segmentation is a foundational enterprise security control. Despite its recognized benefits, segmentation initiatives frequently fail in practice, and the field lacks a systematic empirical explanation for why these projects do not achieve their intended outcomes. This paper presents an empirical study of failed segmentation projects based on a survey of 400 U.S.-based\ network security practitioners. The survey was grounded in a two-part failure framework that separately measures general IT project failure factors and segmentation-specific technical and operational barriers. Clustering analysis of the responses reveals four distinct failure archetypes. Surprisingly, practitioners across all four archetypes propose general IT project management fixes over segmentation-specific fixes in the same ratio.

60.3SIApr 8
How segmented is my network?

Rohit Dube

Network segmentation is a popular security practice for limiting lateral movement, yet practitioners lack a metric to measure how segmented a network actually is. We define segmentedness as the fraction of potential node-pair communications disallowed by policy -- equivalently, the complement of graph edge density -- and show it to be the first statistically principled scalar metric for this purpose. Then, we derive a normalized estimator for segmentedness and evaluate its uncertainty using confidence intervals. For a 95\% confidence interval with a margin-of-error of $\pm 0.1$, we show that a minimum of $M=97$ sampled node pairs is sufficient. This result is independent of the total number of nodes in the network, provided that node pairs are sampled uniformly at random. We evaluate the estimator through Monte Carlo simulations on Erdős--Rényi, stochastic block models, and real-world enterprise network datasets, demonstrating accurate estimation. Finally, we discuss applications of the estimator, such as baseline tracking, zero trust assessment, and merger integration.

CRAug 9, 2025
Per-sender neural network classifiers for email authorship validation

Rohit Dube

Business email compromise and lateral spear phishing attacks are among modern organizations' most costly and damaging threats. While inbound phishing defenses have improved significantly, most organizations still trust internal emails by default, leaving themselves vulnerable to attacks from compromised employee accounts. In this work, we define and explore the problem of authorship validation: verifying whether a claimed sender actually authored a given email. Authorship validation is a lightweight, real-time defense that complements traditional detection methods by modeling per-sender writing style. Further, the paper presents a collection of new datasets based on the Enron corpus. These simulate inauthentic messages using both human-written and large language model-generated emails. The paper also evaluates two classifiers -- a Naive Bayes model and a character-level convolutional neural network (Char-CNN) -- for the authorship validation task. Our experiments show that the Char-CNN model achieves high accuracy and F1 scores under various circumstances. Finally, we discuss deployment considerations and show that per-sender authorship classifiers are practical for integrating into existing commercial email security systems with low overhead.

LGSep 4, 2023
Learning for Interval Prediction of Electricity Demand: A Cluster-based Bootstrapping Approach

Rohit Dube, Natarajan Gautam, Amarnath Banerjee et al.

Accurate predictions of electricity demands are necessary for managing operations in a small aggregation load setting like a Microgrid. Due to low aggregation, the electricity demands can be highly stochastic and point estimates would lead to inflated errors. Interval estimation in this scenario, would provide a range of values within which the future values might lie and helps quantify the errors around the point estimates. This paper introduces a residual bootstrap algorithm to generate interval estimates of day-ahead electricity demand. A machine learning algorithm is used to obtain the point estimates of electricity demand and respective residuals on the training set. The obtained residuals are stored in memory and the memory is further partitioned. Days with similar demand patterns are grouped in clusters using an unsupervised learning algorithm and these clusters are used to partition the memory. The point estimates for test day are used to find the closest cluster of similar days and the residuals are bootstrapped from the chosen cluster. This algorithm is evaluated on the real electricity demand data from EULR(End Use Load Research) and is compared to other bootstrapping methods for varying confidence intervals.

CRJul 12, 2021
Understanding the Communist Party of China's Information Operations

Rohit Dube

The Communist Party of China is known to engage in Information Operations to influence public opinion. In this paper, we seek to understand the tactics used by the Communist Party in a recent Information Operation - the one conducted to influence the narrative around the pro-democracy movement in Hong Kong. We use a Twitter dataset containing account information and tweets for the operation. Our research shows that the Hong Kong operation was (at least) partially conducted manually by humans rather than entirely by automated bots. We also show that the Communist Party mixed in personal attacks on Chinese dissidents and messages on COVID-19 with the party's views on the protests during the operation. Finally, we conclude that the Information Operation network in the Twitter dataset was set up to amplify content generated elsewhere rather than to influence the narrative with original content.