6.7 CR May 19
Detecting Data Exfiltration through I2P Anonymity Networks: A Two-Phase Machine Learning Approach
Siddique Abubakr Muntaka, Muntaka Mohammed, Mansuru Mikail Azindo et al.
The Invisible Internet Project (I2P) provides strong anonymity through garlic routing and a distributed network architecture, making it attractive for legitimate privacy needs. Nevertheless, malicious actors can exploit the same properties to steal sensitive information from corporate networks without detection. Current network security measures often fail to detect I2P traffic, and existing literature has focused primarily on protocol-level traffic identification without addressing behavioral threat assessment. This paper proposes a two-phase machine-learning model for I2P traffic analysis using the SafeSurf Darknet 2025 dataset comprising 184,548 network flows. Phase 1 achieved 99.96% accuracy in distinguishing I2P traffic from normal network traffic using a Random Forest classifier, with only 2 false positives among 32,318 normal flows. Phase 2 performed behavioral analysis on traffic identified as I2P, classifying it as either exfiltration or legitimate activity, and achieved 91.11% accuracy using XGBoost. The results show that tree-based ensemble methods substantially outperform deep neural networks and support vector machines for this task. Feature importance analysis indicates that the most discriminative features are packet timing and flow duration. These findings establish that accurate I2P traffic detection and threat prioritization are achievable in operational network environments, enabling security teams to focus resources on high-risk events rather than monitoring all encrypted traffic.
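The two-phase routing described in this abstract can be sketched as a simple cascade, in which the second classifier only sees flows that the first one flags as I2P. This is a minimal illustration and not the authors' implementation: the `TwoPhaseDetector` class, the feature names `mean_iat_ms` and `duration_s`, and the threshold rules are all hypothetical stand-ins for the trained Random Forest and XGBoost models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Flow = Dict[str, float]  # hypothetical feature dict for one network flow


@dataclass
class TwoPhaseDetector:
    # Phase 1: returns True when a flow looks like I2P traffic.
    is_i2p: Callable[[Flow], bool]
    # Phase 2: returns True when an I2P flow looks like exfiltration.
    is_exfiltration: Callable[[Flow], bool]

    def classify(self, flow: Flow) -> str:
        # Phase 2 only runs on flows that Phase 1 flags as I2P.
        if not self.is_i2p(flow):
            return "normal"
        if self.is_exfiltration(flow):
            return "i2p_exfiltration"
        return "i2p_legitimate"


# Toy threshold rules standing in for trained models (illustrative only;
# the feature names and cut-offs are assumptions, not values from the paper).
detector = TwoPhaseDetector(
    is_i2p=lambda f: f["mean_iat_ms"] > 50.0,
    is_exfiltration=lambda f: f["duration_s"] > 300.0,
)

flows: List[Flow] = [
    {"mean_iat_ms": 5.0, "duration_s": 900.0},
    {"mean_iat_ms": 80.0, "duration_s": 20.0},
    {"mean_iat_ms": 120.0, "duration_s": 600.0},
]
labels = [detector.classify(f) for f in flows]
print(labels)
```

In a cascade like this, Phase 1 errors propagate: a flow wrongly labelled normal never reaches the behavioral stage, so the end-to-end detection rate is bounded by Phase 1 recall.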
4.1 DC May 18
iHAC: A Hybrid Cluster Architecture for Enhanced Performance and Resilience
Siddique Abubakr Muntaka, Edward Danso Ansong, Benjamin Yankson et al.
Uninterrupted system availability is a critical requirement for enterprise operations, yet traditional high-availability clusters suffer from limitations such as single points of failure and inefficient resource allocation. This paper introduces and evaluates the Integrated High Availability Cluster (iHAC), a hybrid architecture designed to enhance system resilience and performance. The iHAC integrates the strengths of active-active and active-passive configurations to optimize workload distribution and failover capabilities. We conducted a comparative analysis, simulating iHAC against conventional (legacy) clusters using Riverbed Modeler (OPNET). The results reveal significant performance improvements: iHAC reduced the average HTTP page response time by over 40%, from five seconds in a traditional active-active setup to under three seconds. This was achieved alongside reduced network latency and increased overall throughput. This study validates the iHAC architecture as a superior design for building robust, high-performance systems, offering a practical path to greater operational continuity and resilience.
1.6 LG Apr 5
Conformal PM2.5 Mapping Under Spatial Covariate Shift: Satellite-Reanalysis Fusion for Africa's Green Industrial Transition
Yaw Osei Adjei, Davis Opoku, Ephraim Abotsi et al.
Africa's green industrialization imperative demands reliable infrastructure for monitoring air quality. We present a satellite-reanalysis PM2.5 fusion system trained on 2,068,901 records from 404 monitoring locations in 29 African countries (OpenAQ, 2017-2022), combining LightGBM with leakage-resistant spatial cross-validation and conformal prediction to quantify predictions and their geographic applicability limits. Under 5-fold location-grouped spatial cross-validation, LightGBM achieves RMSE = 30.83 ± 5.07 µg/m³, MAE = 14.54 ± 1.66 µg/m³, R² = 0.134 ± 0.023, and macro F1 = 0.336 ± 0.018. This R² is substantially below random-split benchmarks (>0.90) but reflects true geographic generalisation difficulty rather than model failure. Split conformal prediction targeting 90% marginal coverage reveals severe East Africa degradation (actual PICP = 65.3% vs. nominal 90%), consistent with medium-strength covariate shift (humidity KS = 0.2237, sat_pblh KS = 0.2558). We operationalise these findings through regional reliability flags (High/Medium/Low/Unreliable) and a monitor prioritisation score directing infrastructure expansion toward highest-burden unmonitored populations, directly supporting Africa's green industrial transition and SDGs 3.9, 7.1.2, 9, 11.6.2, and 13.
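The coverage gap reported in this abstract (PICP below the nominal 90%) is the typical signature of split conformal prediction applied outside the region it was calibrated on. The sketch below illustrates the mechanism on synthetic data only; it does not use the paper's data or model. The `predict` function, the quadratic ground truth, and the two input ranges are hypothetical choices made so that the predictor fits well in the calibration region and poorly in the shifted one.

```python
import math
import random
from typing import List, Sequence, Tuple


def conformal_quantile(residuals: Sequence[float], alpha: float) -> float:
    """Finite-sample-corrected (1 - alpha) quantile of calibration residuals."""
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(residuals)[k - 1]


def picp(y_true: Sequence[float], intervals: Sequence[Tuple[float, float]]) -> float:
    """Prediction interval coverage probability: share of targets inside their interval."""
    hits = sum(lo <= y <= hi for y, (lo, hi) in zip(y_true, intervals))
    return hits / len(y_true)


def predict(x: float) -> float:
    # Hypothetical stand-in for a fitted regressor; roughly adequate on [0, 1].
    return x


def simulate(n: int, lo: float, hi: float, rng: random.Random) -> Tuple[List[float], List[float]]:
    # Synthetic ground truth y = x^2 + noise (an assumption for illustration).
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys


rng = random.Random(0)
alpha = 0.10  # target 90% marginal coverage

# Calibrate on held-out data drawn from the "source" covariate range.
x_cal, y_cal = simulate(2000, 0.0, 1.0, rng)
q = conformal_quantile([abs(y - predict(x)) for x, y in zip(x_cal, y_cal)], alpha)


def coverage(xs: Sequence[float], ys: Sequence[float]) -> float:
    # Symmetric intervals of half-width q around each point prediction.
    return picp(ys, [(predict(x) - q, predict(x) + q) for x in xs])


# Same covariate range as calibration: coverage should sit near 90%.
x_in, y_in = simulate(2000, 0.0, 1.0, rng)
# Shifted covariate range: the calibrated half-width no longer fits the errors.
x_out, y_out = simulate(2000, 1.5, 2.5, rng)

picp_in = coverage(x_in, y_in)
picp_out = coverage(x_out, y_out)
print(f"in-range PICP: {picp_in:.3f}, shifted-range PICP: {picp_out:.3f}")
```

The marginal coverage guarantee of split conformal prediction holds only when calibration and test points are exchangeable, which is why comparing per-region PICP against the nominal level works as a reliability check.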