Does Symbolic Knowledge Prevent Adversarial Fooling?
This work addresses a potential vulnerability in hybrid AI systems that is relevant to both researchers and practitioners, but it is incremental: it highlights an unintended consequence rather than proposing a new solution.
The paper investigates whether symbolic knowledge in neural architectures can inadvertently propagate adversarial effects, focusing on deep probabilistic graphical models.
Arguments in favor of injecting symbolic knowledge into neural architectures abound. When done right, constraining a sub-symbolic model can substantially improve its performance and sample complexity and prevent it from predicting invalid configurations. Focusing on deep probabilistic (logical) graphical models -- i.e., constrained joint distributions whose parameters are determined (in part) by neural nets based on low-level inputs -- we draw attention to an elementary but unintended consequence of symbolic knowledge: that the resulting constraints can propagate the negative effects of adversarial examples.
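The propagation effect can be illustrated with a toy example. The sketch below is not the paper's model: it assumes two binary variables A and B whose scores stand in for the outputs of two independent neural nets, and a single hypothetical hard constraint "A implies B". The names `marginal_b`, `implies`, and all score values are made up for illustration. Without the constraint the joint factorizes, so perturbing the score of A cannot change the marginal of B; with the constraint, the same perturbation shifts it.

```python
import math
from itertools import product


def marginal_b(score_a, score_b, constraint=None):
    """Return P(B=1) for a joint over binary (A, B).

    Each configuration has weight exp(score_a * a + score_b * b).
    If a constraint is given, configurations that violate it are
    excluded before normalizing (a hard symbolic constraint).
    """
    weights = {}
    for a, b in product((0, 1), repeat=2):
        if constraint is not None and not constraint(a, b):
            continue
        weights[(a, b)] = math.exp(score_a * a + score_b * b)
    z = sum(weights.values())
    return sum(w for (a, b), w in weights.items() if b == 1) / z


def implies(a, b):
    """Hypothetical symbolic rule: A = 1 forces B = 1."""
    return a == 0 or b == 1


# Assumed scores: an adversarial perturbation of the input to A's
# network flips its score, while B's score is left untouched.
clean_a, attacked_a = -3.0, 3.0
score_b = -1.0

free_clean = marginal_b(clean_a, score_b)
free_attacked = marginal_b(attacked_a, score_b)
hard_clean = marginal_b(clean_a, score_b, implies)
hard_attacked = marginal_b(attacked_a, score_b, implies)

print(f"unconstrained P(B=1): {free_clean:.3f} -> {free_attacked:.3f}")
print(f"constrained   P(B=1): {hard_clean:.3f} -> {hard_attacked:.3f}")
```

In this toy setting the unconstrained marginal of B is identical before and after the attack, whereas the constrained marginal moves from below 0.5 to above 0.5, even though only the score of A was perturbed.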