IF-CPS: Influence Functions for Cyber-Physical Systems -- A Unified Framework for Diagnosis, Curation, and Safety Attribution
IF-CPS addresses the lack of data attribution tools for practitioners who deploy neural network controllers in cyber-physical systems by adapting influence functions to the domain.
The paper addresses the problem of tracing controller failures in cyber-physical systems back to training data. It proposes IF-CPS, a framework that adapts influence functions to CPS-specific properties, and reports AUROC 1.00 on the Pendulum benchmark and 0.92 vs. 0.50 on the HVAC benchmark.
Neural network controllers trained via behavior cloning are increasingly deployed in cyber-physical systems (CPS), yet practitioners lack tools to trace controller failures back to training data. Existing data attribution methods assume i.i.d.\ data and standard loss targets, ignoring CPS-specific properties: closed-loop dynamics, safety constraints, and temporal trajectory structure. We propose IF-CPS, a modular influence function framework with three CPS-adapted variants: safety influence (attributing constraint violations), trajectory influence (temporal discounting over trajectories), and propagated influence (tracing effects through plant dynamics). We evaluate IF-CPS on six benchmarks across diagnosis, curation, and safety attribution tasks. IF-CPS improves over standard influence functions in the majority of settings, achieving AUROC $1.00$ in Pendulum (5--10\% poisoning), $0.92$ vs.\ $0.50$ in HVAC (10\%), and the strongest constraint-boundary correlation (Spearman $\rho = 0.55$ in Pendulum).
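For readers unfamiliar with the baseline that IF-CPS builds on, the following is a minimal sketch of the classical (standard) influence function, $-\nabla_\theta L(z_{\text{test}})^\top H^{-1} \nabla_\theta L(z_i)$, applied to a behavior-cloned controller. It is not the IF-CPS method itself. The linear policy, the squared-error loss, the `damping` term, and the name `influence_scores` are all illustrative assumptions that do not come from the paper.

```python
import numpy as np

def influence_scores(X_train, y_train, x_test, y_test, damping=1e-3):
    """Classical influence of each training pair on one test loss.

    Hypothetical setup: a linear policy u = x @ theta cloned from
    (state, action) pairs with loss 0.5 * (x @ theta - y)**2 and an
    L2 penalty of strength `damping`.
    """
    n, d = X_train.shape
    # Hessian of the regularised mean training loss (constant for a linear model).
    H = X_train.T @ X_train / n + damping * np.eye(d)
    # Closed-form minimiser of the regularised loss.
    theta = np.linalg.solve(H, X_train.T @ y_train / n)
    # Per-sample training gradients: (x_i @ theta - y_i) * x_i.
    residuals = X_train @ theta - y_train
    train_grads = residuals[:, None] * X_train
    # Gradient of the test loss at the fitted parameters.
    test_grad = (x_test @ theta - y_test) * x_test
    # Influence of upweighting sample i on the test loss:
    # -test_grad^T H^{-1} train_grad_i (H is symmetric).
    return -train_grads @ np.linalg.solve(H, test_grad)
```

Under this sign convention, a positive score means that upweighting the training sample would increase the test loss, so large positive scores flag candidate harmful (for example, poisoned) samples. The CPS-adapted variants named in the abstract would presumably change the quantity being attributed, such as a constraint-violation measure instead of a test loss, but their exact definitions are not given here.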