Evaluating Relaxations of Logic for Neural Networks: A Comprehensive Study
This work studies how to integrate symbolic knowledge into neural models to improve performance in low-data settings, and offers practical guidelines for defining loss functions via logic.
The paper asks which logical relaxation is best for incorporating symbolic knowledge into neural networks. In the theoretical analysis, the Lukasiewicz t-norm best preserves tautologies; in the empirical analysis, the product t-norm achieves the best predictive performance on text chunking and digit recognition.
Symbolic knowledge can provide crucial inductive bias for training neural models, especially in low-data regimes. A successful strategy for incorporating such knowledge involves relaxing logical statements into sub-differentiable losses for optimization. In this paper, we study the question of how best to relax logical expressions that represent labeled examples and knowledge about a problem; we focus on sub-differentiable t-norm relaxations of logic. We present theoretical and empirical criteria for characterizing which relaxation would perform best in various scenarios. In our theoretical study, driven by the goal of preserving tautologies, the Lukasiewicz t-norm performs best. However, in our empirical analysis on the text chunking and digit recognition tasks, the product t-norm achieves the best predictive performance. We analyze this apparent discrepancy and conclude with a list of best practices for defining loss functions via logic.
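To make the idea of a t-norm relaxation concrete, the following is a minimal sketch of the two conjunctions named above, applied to truth values in [0, 1]. The helper names (`product_and`, `lukasiewicz_and`, `conjoin`, `logic_loss`), the example probabilities, and the choice of negative log as the loss are illustrative assumptions, not the paper's implementation.

```python
import math

# Standard t-norm definitions of conjunction over truth values in [0, 1].
def product_and(a, b):
    return a * b

def lukasiewicz_and(a, b):
    return max(0.0, a + b - 1.0)

def conjoin(t_norm, values):
    """Fold a binary t-norm over a list of truth values (1.0 is the identity)."""
    result = 1.0
    for v in values:
        result = t_norm(result, v)
    return result

def logic_loss(t_norm, values, eps=1e-12):
    """Assumed loss: negative log of the relaxed truth value, clamped at eps."""
    return -math.log(max(conjoin(t_norm, values), eps))

# Hypothetical model probabilities for the correct labels of three examples.
probs = [0.9, 0.8, 0.7]

print("product truth value:    ", conjoin(product_and, probs))
print("Lukasiewicz truth value:", conjoin(lukasiewicz_and, probs))
print("product loss:           ", logic_loss(product_and, probs))
print("cross-entropy:          ", sum(-math.log(p) for p in probs))
```

Under these assumptions, the product loss equals the sum of the per-example negative log-probabilities, that is, the usual cross-entropy. The Lukasiewicz conjunction clips to 0 once the inputs are jointly low enough, at which point the `max` contributes no gradient.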