LG · May 20, 2025

Enforcing Hard Linear Constraints in Deep Learning Models with Decision Rules

Gonzalo E. Constante-Flores, Hao Chen, Can Li

arXiv:2505.13858v120.59 citations · h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses the need for reliable constraint enforcement in safety-critical applications, such as those governed by physical laws or fairness requirements, though the advance is incremental because it builds on existing optimization methods.

The paper tackles the problem of enforcing hard linear constraints in deep learning models for safety-critical tasks, proposing a model-agnostic framework that guarantees constraint satisfaction across the entire input space while maintaining competitive accuracy and low inference latency.

Deep learning models are increasingly deployed in safety-critical tasks where predictions must satisfy hard constraints, such as physical laws, fairness requirements, or safety limits. However, standard architectures lack built-in mechanisms to enforce such constraints, and existing approaches based on regularization or projection are often limited to simple constraints, computationally expensive, or lack feasibility guarantees. This paper proposes a model-agnostic framework for enforcing input-dependent linear equality and inequality constraints on neural network outputs. The architecture combines a task network trained for prediction accuracy with a safe network trained using decision rules from the stochastic and robust optimization literature to ensure feasibility across the entire input space. The final prediction is a convex combination of the two subnetworks, guaranteeing constraint satisfaction during both training and inference without iterative procedures or runtime optimization. We prove that the architecture is a universal approximator of constrained functions and derive computationally tractable formulations based on linear decision rules. Empirical results on benchmark regression tasks show that our method consistently satisfies constraints while maintaining competitive accuracy and low inference latency.
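
The convex-combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes inequality constraints of the form A y <= b only, takes the safe prediction as given (already feasible) rather than producing it with a decision rule, and picks the combination weight with a simple ratio test. The function name `combine` and the toy box constraint are hypothetical.

```python
import numpy as np

def combine(y_task, y_safe, A, b, tol=1e-12):
    """Blend a task prediction with a feasible safe prediction.

    Returns y = y_safe + alpha * (y_task - y_safe), where alpha is the
    largest value in [0, 1] such that A @ y <= b still holds.
    Assumption: y_safe already satisfies A @ y_safe <= b.
    """
    slack = b - A @ y_safe                  # remaining room at the safe point
    if np.any(slack < -tol):
        raise ValueError("y_safe must satisfy A @ y_safe <= b")
    slack = np.maximum(slack, 0.0)
    direction = A @ (y_task - y_safe)       # change in A @ y per unit of alpha
    alpha = 1.0
    blocking = direction > tol              # rows that tighten as alpha grows
    if np.any(blocking):
        alpha = min(1.0, float(np.min(slack[blocking] / direction[blocking])))
    return y_safe + alpha * (y_task - y_safe), alpha

# Toy constraint set: the unit box 0 <= y <= 1 in two dimensions.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])

y_safe = np.array([0.5, 0.5])   # feasible point (stand-in for the safe network)
y_task = np.array([1.5, 0.5])   # infeasible point (stand-in for the task network)

y, alpha = combine(y_task, y_safe, A, b)
print(alpha, y)
```

Because the feasible set is convex, every point on the segment between the safe prediction and the returned output is feasible, so no iterative projection or runtime solver is needed. In the input-dependent setting the abstract describes, `b` (and possibly `A`) would be recomputed for each input before the blend.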
