LG | Apr 2, 2025

CO-DEFEND: Continuous Decentralized Federated Learning for Secure DoH-Based Threat Detection

Diego Cajaraville-Aboy, Marta Moure-Garrido, Carlos Beis-Penedo, Carlos Garcia-Rubio, Rebeca P. Díaz-Redondo, Celeste Campo, Ana Fernández-Vilas, Manuel Fernández-Veiga

arXiv:2504.01882v1 | h-index: 16 | Comput. Networks

Originality: Incremental advance

AI Analysis

This work addresses privacy concerns that arise when network security entities collaborate on threat detection, though it is incremental because it adapts existing ML algorithms to a federated setting.

The paper tackles the problem of detecting malicious DNS over HTTPS (DoH) tunnels in network security by proposing CO-DEFEND, a decentralized federated learning framework that enables collaborative model training without sharing data, achieving effective detection as demonstrated on the CIRA-CIC-DoHBrw-2020 dataset.

An attacker's use of DNS over HTTPS (DoH) tunneling to hide malicious activity within encrypted DNS traffic poses a serious threat to network security, as it allows malicious actors to bypass traditional monitoring and intrusion detection systems and to evade conventional traffic analysis techniques. Machine Learning (ML) techniques can detect DoH tunnels; however, their effectiveness relies on large datasets containing both benign and malicious traffic, and sharing such datasets across entities is challenging due to privacy concerns. In this work, we propose CO-DEFEND (Continuous Decentralized Federated Learning for Secure DoH-Based Threat Detection), a Decentralized Federated Learning (DFL) framework that enables multiple entities to collaboratively train a machine learning classification model while preserving data privacy and enhancing resilience against single points of failure. The proposed DFL framework, which is scalable and privacy-preserving, is based on a federation process that allows multiple entities to train their local models online, using incoming DoH flows in real time as each entity processes them. In addition, we adapt four classical machine learning algorithms, Support Vector Machines (SVM), Logistic Regression (LR), Decision Trees (DT), and Random Forest (RF), for federated scenarios and compare their results with more computationally complex alternatives such as neural networks. Using the CIRA-CIC-DoHBrw-2020 dataset, we compare our proposed method with existing machine learning approaches to demonstrate its effectiveness in detecting malicious DoH tunnels and the benefits it brings.
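The idea of online local training combined with server-free federation can be illustrated with a minimal sketch. This is not the paper's implementation: the `Peer` class, the `federate` neighbor-averaging rule, the line topology, and the two synthetic features are all assumptions made for illustration, and logistic regression is used only because it is the simplest of the four algorithms the abstract names. Each peer updates its model one flow at a time and then averages weights with its neighbors, so raw flows never leave the entity and no central aggregator is needed.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class Peer:
    """One entity with a local logistic-regression model (hypothetical class)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * (n_features + 1)  # last entry is the bias term
        self.lr = lr

    def predict_proba(self, x):
        # zip stops at len(x), so the bias is added separately.
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.w[-1]
        return sigmoid(z)

    def partial_fit(self, x, y):
        # One online SGD step on a single flow (y: 0 benign, 1 malicious).
        err = self.predict_proba(x) - y
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * err * xi
        self.w[-1] -= self.lr * err


def federate(peers, neighbors):
    # Assumed aggregation rule: each peer takes the mean of its own weights
    # and its neighbors' weights. Only weights are exchanged, never flows.
    averaged = []
    for i, peer in enumerate(peers):
        group = [peer.w] + [peers[j].w for j in neighbors[i]]
        averaged.append([sum(col) / len(group) for col in zip(*group)])
    for peer, w in zip(peers, averaged):
        peer.w = w


def synthetic_flow(rng):
    # Stand-in for real DoH flow features: two Gaussian clusters.
    y = rng.randint(0, 1)
    center = 1.0 if y else -1.0
    return [rng.gauss(center, 0.5), rng.gauss(center, 0.5)], y


def run_demo(rounds=30, flows_per_round=20, seed=0):
    rng = random.Random(seed)
    peers = [Peer(n_features=2) for _ in range(3)]
    neighbors = {0: [1], 1: [0, 2], 2: [1]}  # line topology, no central server
    for _ in range(rounds):
        for peer in peers:
            for _ in range(flows_per_round):
                x, y = synthetic_flow(rng)
                peer.partial_fit(x, y)
        federate(peers, neighbors)
    test = [synthetic_flow(rng) for _ in range(200)]
    acc = [
        sum((p.predict_proba(x) >= 0.5) == bool(y) for x, y in test) / len(test)
        for p in peers
    ]
    return peers, acc
```

Calling `run_demo()` returns the three peers and their accuracy on held-out synthetic flows. A real deployment would replace `synthetic_flow` with features extracted from live DoH traffic, and would need its own choices for topology, aggregation schedule, and model type.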
