BigData Applications from Graph Analytics to Machine Learning by Aggregates in Recursion
The work enables scalable BigData applications in logic-based languages, addressing a bottleneck in declarative programming for domains such as graph analytics and machine learning. The contribution is incremental, since it builds on the existing PreM concept.
The paper tackles the problem of using aggregates in recursion within logic programs, which was previously hindered by semantic issues, and shows that the Pre-mappability (PreM) notion enables efficient, scalable implementations while preserving declarative semantics. It demonstrates that PreM allows algorithms from areas such as graph analytics and machine learning to be expressed concisely in declarative languages like Datalog and SQL, with improved performance.
In the past, the semantic issues raised by the non-monotonic nature of aggregates often prevented their use in the recursive statements of logic programs and deductive databases. However, the recently introduced notion of Pre-mappability (PreM) has shown that, in key applications of interest, aggregates can be used in recursion to optimize the perfect-model semantics of aggregate-stratified programs. Therefore, we can preserve the declarative formal semantics of such programs while achieving a highly efficient operational semantics that is conducive to scalable implementations on parallel and distributed platforms. In this paper, we show that with PreM, a wide spectrum of classical algorithms of practical interest, ranging from graph analytics and dynamic-programming-based optimization problems to data mining and machine learning applications, can be concisely expressed in declarative languages by using aggregates in recursion. Our examples are also used to show that PreM can be checked using simple techniques and templatized verification strategies. A wide range of advanced BigData applications can now be expressed declaratively in logic-based languages, including Datalog, Prolog, and even SQL, while enabling their execution with superior performance and scalability.
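As an illustration of what using an aggregate in recursion means operationally, the following minimal Python sketch contrasts two evaluations of a shortest-path program. The first is aggregate-stratified: it derives every path cost and applies min only at the end. The second pushes min into each step of the fixpoint iteration. The sketch assumes the usual formulation of PreM, namely that a constraint γ is pre-mappable to the immediate-consequence operator T when γ(T(I)) = γ(T(γ(I))). The edge data, the function names, and the rules shown in the comments are hypothetical and are not taken from the paper.

```python
# Hypothetical weighted edges (source, target, cost). The graph is acyclic
# so that the stratified evaluation below terminates.
EDGES = [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "d", 1), ("b", "d", 7)]


def step(paths, edges):
    """One application of the operator T for the assumed rules:
         path(X, Y, C)       <- edge(X, Y, C).
         path(X, Z, C1 + C2) <- path(X, Y, C1), edge(Y, Z, C2).
    """
    out = set(edges)
    for (x, y, c1) in paths:
        for (y2, z, c2) in edges:
            if y == y2:
                out.add((x, z, c1 + c2))
    return out


def min_per_pair(paths):
    """The constraint gamma: keep only the cheapest cost for each (X, Y) pair."""
    best = {}
    for x, y, c in paths:
        if (x, y) not in best or c < best[(x, y)]:
            best[(x, y)] = c
    return {(x, y, c) for (x, y), c in best.items()}


def stratified(edges):
    """Compute all path costs to a fixpoint, then apply min once at the end."""
    paths = set()
    while True:
        new = step(paths, edges)
        if new == paths:
            return min_per_pair(paths)
        paths = new


def min_in_recursion(edges):
    """Apply min after every step, so each iteration keeps one cost per pair."""
    paths = set()
    while True:
        new = min_per_pair(step(paths, edges))
        if new == paths:
            return paths
        paths = new


if __name__ == "__main__":
    print(sorted(stratified(EDGES)))
    print(sorted(min_in_recursion(EDGES)))
```

On this input both evaluations return the same set of minimal costs, while the second one carries far fewer intermediate facts. On graphs with positive-cost cycles, the stratified version in this sketch would not terminate, whereas the version with min inside the iteration still reaches a fixpoint.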