A framework for large-scale distributed AI search across disconnected heterogeneous infrastructures
This framework addresses the problem of enabling large-scale AI searches for eScience applications in resource-constrained or unreliable environments, though it appears incremental as it builds on existing distributed computing concepts.
The authors address large-scale distributed AI search across disconnected heterogeneous infrastructures with a robust framework that requires neither dedicated machines nor communication between nodes. They demonstrate its applicability by solving a challenging open problem in computational mathematics, scaling to computations that were previously impossible in practice.
We present a framework for large-scale distributed eScience Artificial Intelligence search. Our approach is generic and can be used for many different problems. Unlike many other approaches, we do not require dedicated machines, homogeneous infrastructure, or the ability to communicate between nodes. We give special consideration to the robustness of the framework, minimising the loss of effort even after total loss of infrastructure, and allowing easy verification of every step of the distribution process. In contrast to most eScience applications, the input data and the specification of the problem are very small and can easily be given in a paragraph of text. The unique challenges our framework tackles relate to the combinatorial explosion of the space of possible solutions and to the robustness of long-running computations. Not only is the time required to finish the computations unknown, but the resource requirements may also change during the course of the computation. We demonstrate the applicability of our framework by using it to solve a challenging and hitherto open problem in computational mathematics. The results demonstrate that our approach easily scales to computations of a size that would have been impossible to tackle in practice just a decade ago.
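The abstract does not say how the search is distributed, so the following is only a minimal sketch of one way to obtain the stated properties, not the authors' implementation. It assumes the search space can be split into independent work units by fixing a prefix of decisions, and it uses the N-queens counting problem as a stand-in for the real search. The names `split_units`, `solve_unit`, `run_unit` and `aggregate`, and the one-JSON-file-per-unit layout, are all hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def _safe(prefix, col):
    """True if a queen in the next row at `col` attacks none in `prefix`."""
    row = len(prefix)
    return all(c != col and abs(c - col) != row - r for r, c in enumerate(prefix))


def split_units(n, depth):
    """Enumerate all consistent placements for the first `depth` rows.

    Each placement is an independent work unit (an assumption of this sketch).
    """
    units = [()]
    for _ in range(depth):
        units = [u + (c,) for u in units for c in range(n) if _safe(u, c)]
    return units


def solve_unit(n, prefix):
    """Count completions of a partial placement; needs no other unit's state."""
    if len(prefix) == n:
        return 1
    return sum(solve_unit(n, prefix + (c,)) for c in range(n) if _safe(prefix, c))


def run_unit(n, prefix, out_dir):
    """Process one unit and persist its result; finished units are skipped."""
    out_dir = Path(out_dir)
    target = out_dir / ("unit_" + "_".join(map(str, prefix)) + ".json")
    if target.exists():
        return  # already done, possibly on another machine or in an earlier run
    record = {"n": n, "prefix": list(prefix), "count": solve_unit(n, prefix)}
    fd, tmp = tempfile.mkstemp(dir=out_dir)
    with os.fdopen(fd, "w") as fh:
        json.dump(record, fh)
    os.replace(tmp, target)  # atomic rename, so no half-written result files


def aggregate(out_dir):
    """Combine the per-unit result files into the final answer."""
    return sum(
        json.loads(p.read_text())["count"] for p in Path(out_dir).glob("unit_*.json")
    )
```

Because each unit depends only on its own prefix, units can run on any machine in any order without communication. Because every finished unit leaves a small self-describing file, losing the infrastructure costs at most the units in progress, and any single result can be recomputed to check it.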