Improving DNS Exfiltration Detection via Transformer Pretraining
This work addresses the need for effective DNS exfiltration detection in network security, but the improvements are incremental over existing transformer-based methods.
The paper investigates whether in-domain pretraining of BERT improves DNS exfiltration detection at low false positive rates, finding significant improvements in the left tail of the ROC curve compared to random initialization.
We study whether in-domain pretraining of a Bidirectional Encoder Representations from Transformers (BERT) model improves subdomain-level detection of DNS exfiltration at low false positive rates. Previous work mostly examines fine-tuned generic Transformers and does not isolate the effect of pretraining on the downstream classification task. To address this gap, we develop a controlled pipeline in which operating points are frozen on the validation set and transferred to the test set, enabling clean ablations across label and pretraining budgets. Our results show significant improvements in the left tail of the Receiver Operating Characteristic (ROC) curve, especially over a randomly initialized baseline. Additionally, among the pretrained model variants, increasing the number of pretraining steps helps most when more labeled data are available for fine-tuning.
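The idea of freezing an operating point on validation data and transferring it to the test set can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `freeze_threshold` and `rates_at_threshold`, the convention that higher scores indicate exfiltration, the label encoding (0 = benign, 1 = exfiltration), and the synthetic scores in the demo are all assumptions made for illustration.

```python
import numpy as np


def freeze_threshold(val_scores, val_labels, target_fpr):
    """Return a score threshold whose validation FPR is at most target_fpr.

    Assumptions (hypothetical, not from the paper): higher score means
    "more likely exfiltration", label 0 is benign, and a sample is flagged
    when score > threshold.
    """
    val_scores = np.asarray(val_scores, dtype=float)
    val_labels = np.asarray(val_labels)
    # Benign validation scores, sorted from highest to lowest.
    benign_desc = np.sort(val_scores[val_labels == 0])[::-1]
    # Number of benign validation samples we are allowed to flag.
    allowed = int(np.floor(target_fpr * benign_desc.size))
    if allowed >= benign_desc.size:
        return float("-inf")  # every benign sample may be flagged
    # At most `allowed` benign scores lie strictly above this value.
    return float(benign_desc[allowed])


def rates_at_threshold(scores, labels, threshold):
    """Apply a frozen threshold and return (TPR, FPR) on the given split.

    Assumes the split contains at least one sample of each class.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    flagged = scores > threshold
    tpr = float(flagged[labels == 1].mean())
    fpr = float(flagged[labels == 0].mean())
    return tpr, fpr


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    def synthetic_split(n_benign, n_exfil):
        # Synthetic stand-in for classifier scores; not real DNS data.
        scores = np.concatenate(
            [rng.normal(0.0, 1.0, n_benign), rng.normal(2.0, 1.0, n_exfil)]
        )
        labels = np.concatenate(
            [np.zeros(n_benign, dtype=int), np.ones(n_exfil, dtype=int)]
        )
        return scores, labels

    val_scores, val_labels = synthetic_split(10_000, 500)
    test_scores, test_labels = synthetic_split(10_000, 500)

    # The threshold is chosen once on validation and never tuned on test.
    threshold = freeze_threshold(val_scores, val_labels, target_fpr=0.001)
    tpr, fpr = rates_at_threshold(test_scores, test_labels, threshold)
    print(f"threshold={threshold:.3f} test TPR={tpr:.3f} test FPR={fpr:.4f}")
```

Because the threshold is fixed before the test set is touched, the test false positive rate may drift from the validation target. Reporting the realized test rates alongside the target keeps comparisons between model variants honest at the low-FPR end of the ROC curve.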