A Logic-Driven Framework for Consistency of Neural Models
This work addresses inconsistency in neural models, a problem for AI reliability, but its contribution is incremental: it builds on existing regularization and logic-based methods.
The paper addresses neural models whose internal beliefs are inconsistent across examples. It formalizes inconsistency as a generalization of prediction error and proposes a logic-driven framework that regularizes models with logic rules. Experiments on natural language inference show that this approach helps make predictions both accurate and consistent.
While neural models show remarkable accuracy on individual predictions, their internal beliefs can be inconsistent across examples. In this paper, we formalize such inconsistency as a generalization of prediction error. We propose a learning framework for constraining models using logic rules to regularize them away from inconsistency. Our framework can leverage both labeled and unlabeled examples and is directly compatible with off-the-shelf learning schemes without model redesign. We instantiate our framework on natural language inference, where experiments show that enforcing invariants stated in logic can help make the predictions of neural models both accurate and consistent.
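To make the idea of regularizing a model with a logic rule concrete, the following is a minimal sketch and not the paper's implementation. It assumes one illustrative invariant for natural language inference that the text above does not spell out: if a premise P contradicts a hypothesis H, then H should also contradict P. The rule is relaxed into a hinge on log-probabilities, which is one possible choice among several. All names (`LABELS`, `symmetry_loss`, `total_loss`, `weight`) are hypothetical.

```python
import math

# Assumed label order for a three-way NLI classifier (hypothetical).
LABELS = ("entailment", "contradiction", "neutral")


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def symmetry_loss(probs_ph, probs_hp):
    """Penalty for violating: contradiction(P, H) -> contradiction(H, P).

    probs_ph: label probabilities for the pair (P, H)
    probs_hp: label probabilities for the reversed pair (H, P)
    The penalty is zero when the reversed pair is at least as likely to be
    a contradiction as the original pair. No gold label is needed, so the
    term can be computed on unlabeled pairs as well.
    """
    c = LABELS.index("contradiction")
    return max(0.0, math.log(probs_ph[c]) - math.log(probs_hp[c]))


def total_loss(probs_ph, gold_label, probs_hp, weight=1.0):
    """Cross-entropy on the labeled pair plus the weighted consistency term."""
    cross_entropy = -math.log(probs_ph[LABELS.index(gold_label)])
    return cross_entropy + weight * symmetry_loss(probs_ph, probs_hp)
```

Because the consistency term is simply added to an ordinary loss, a sketch like this would plug into a standard training loop without changing the model architecture. The `weight` parameter, an assumption here, controls how strongly the rule is enforced relative to accuracy on labeled data.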