OC CG LGJul 24, 2024

Mathematical Programming Algorithms for Convex Hull Approximation with a Hyperplane Budget

Michele Barbato, Alberto Ceselli, Rosario Messana

arXiv:2407.17341v2h-index: 23

Originality Incremental advance

AI Analysis

This work addresses a computational geometry problem with applications in constraint learning for Mixed Integer Programs, offering incremental improvements in efficiency and scalability for specific datasets.

The paper tackles the problem of approximating convex hulls with a limited number of hyperplanes to separate positive and negative points, introducing mathematical programming formulations and column generation algorithms that outperform existing methods in computational time and separation error on synthetic datasets, achieving results in minutes for high-dimensional instances where prior algorithms fail.

We consider the following problem in computational geometry: given, in the d-dimensional real space, a set of points marked as positive and a set of points marked as negative, such that the convex hull of the positive set does not intersect the negative set, find K hyperplanes that separate, if possible, all the positive points from the negative ones. That is, we search for a convex polyhedron with at most K faces, containing all the positive points and no negative point. The problem is known in the literature for pure convex polyhedral approximation; our interest stems from its possible applications in constraint learning, where points are feasible or infeasible solutions of a Mixed Integer Program, and the K hyperplanes are linear constraints to be found. We cast the problem as an optimization one, minimizing the number of negative points inside the convex polyhedron, whenever exact separation cannot be achieved. We introduce models inspired by support vector machines and we design two mathematical programming formulations with binary variables. We exploit Dantzig-Wolfe decomposition to obtain extended formulations, and we devise column generation algorithms with ad-hoc pricing routines. We compare computing time and separation error values obtained by all our approaches on synthetic datasets, with number of points from hundreds up to a few thousands, showing our approaches to perform better than existing ones from the literature. Furthermore, we observe that key computational differences arise, depending on whether the budget K is sufficient to completely separate the positive points from the negative ones or not. On 8-dimensional instances (and over), existing convex hull algorithms become computational inapplicable, while our algorithms allow to identify good convex hull approximations in minutes of computation.

View on arXiv PDF

Similar