CrossCheck: Input Validation for WAN Control Systems
This work addresses network reliability for large-scale WAN operators by helping to prevent outages caused by invalid controller inputs, though it is an incremental improvement in input validation for SDN controllers.
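The paper tackles the problem of invalid inputs causing network outages in WAN control systems. It introduces CrossCheck, a system that validates inputs to the SDN controller, and reports an analysis at a large-scale WAN operator that identifies invalid inputs as a leading cause of major outages. In a four-week shadow deployment in a production WAN, CrossCheck detected the single real incident of invalid inputs with a 0% false positive rate; in simulation, it detected a wide range of invalid inputs with high accuracy.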
We present CrossCheck, a system that validates inputs to the Software-Defined Networking (SDN) controller in a Wide Area Network (WAN). By detecting incorrect inputs, which often stem from bugs in the SDN control infrastructure, CrossCheck alerts operators before those inputs trigger network outages. Our analysis at a large-scale WAN operator identifies invalid inputs as a leading cause of major outages, and we show how CrossCheck would have prevented those incidents. We deployed CrossCheck as a shadow validation system for four weeks in a production WAN, during which it accurately detected the single incident of invalid inputs that occurred while sustaining a 0% false positive rate under normal operation, hence imposing little additional burden on operators. In addition, we show through simulation that CrossCheck reliably detects a wide range of invalid inputs (e.g., detecting demand perturbations as small as 5% with 100% accuracy) and maintains a near-zero false positive rate for realistic levels of noisy, missing, or buggy telemetry data (e.g., sustaining zero false positives with up to 30% of corrupted telemetry data).