Finding path motifs in large temporal graphs using algebraic fingerprints
This work addresses pattern-detection problems that arise in applications such as tour recommendation and financial fraud detection; it is incremental in that it builds on existing algebraic methods for graph problems.
The authors address the NP-hard problem of detecting specified color patterns in large temporal graphs and develop an algebraic-algorithmic framework that scales to graphs with up to a billion edges for five-color queries. On a real-world dataset with more than six million edges and a ten-color query, it extracts an optimum solution in under eight minutes.
We study a family of pattern-detection problems in vertex-colored temporal graphs. In particular, given a vertex-colored temporal graph and a multiset of colors as a query, we search for temporal paths in the graph that contain the colors specified in the query. These types of problems have several applications, for example in recommending tours for tourists or detecting abnormal behavior in a network of financial transactions. For the family of pattern-detection problems we consider, we establish complexity results and design an algebraic-algorithmic framework based on constrained multilinear sieving. We demonstrate that our solution scales to massive graphs with up to a billion edges for a multiset query with five colors and up to one hundred million edges for a multiset query with ten colors, despite the problems being NP-hard. Our implementation, which is publicly available, exhibits practical edge-linear scalability and is highly optimized. For instance, in a real-world graph dataset with more than six million edges and a multiset query with ten colors, we can extract an optimum solution in less than eight minutes on a Haswell desktop with four cores.
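To make the problem statement concrete, the following is a minimal brute-force sketch of the query described above. It is not the constrained multilinear sieving framework of the paper and takes exponential time in the worst case; it only illustrates what a matching temporal path looks like on a toy input. The function name `find_path_motif`, the edge format `(u, v, t)`, and the conventions that edges are directed, timestamps strictly increase along a path, vertices do not repeat, and the path's colors must match the query multiset exactly are all assumptions made for illustration.

```python
from collections import Counter


def find_path_motif(edges, color, query):
    """Return one temporal path (as a list of vertices) whose vertex colors
    match the multiset `query` exactly, or None if no such path exists.

    Assumed input format (hypothetical, for illustration only):
      edges: iterable of (u, v, t) directed edges with timestamp t
      color: dict mapping every vertex to its color
      query: iterable of colors (a multiset)
    """
    need = Counter(query)
    k = sum(need.values())  # number of vertices on a matching path

    # Outgoing edges per vertex: u -> [(v, t), ...]
    out = {}
    for u, v, t in edges:
        out.setdefault(u, []).append((v, t))

    def extend(path, last_time, remaining):
        # Every vertex added consumes one color from `remaining`,
        # so a path of length k has used the query multiset exactly.
        if len(path) == k:
            return list(path)
        for v, t in out.get(path[-1], []):
            if t > last_time and v not in path and remaining[color[v]] > 0:
                remaining[color[v]] -= 1
                path.append(v)
                found = extend(path, t, remaining)
                if found is not None:
                    return found
                # Backtrack: undo the choice of v.
                path.pop()
                remaining[color[v]] += 1
        return None

    for start in color:
        if need[color[start]] > 0:
            remaining = need.copy()
            remaining[color[start]] -= 1
            found = extend([start], float("-inf"), remaining)
            if found is not None:
                return found
    return None


# Toy temporal graph: a -> b at time 1, b -> c at time 2, c -> d at time 1.
toy_edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 1)]
toy_color = {"a": "red", "b": "blue", "c": "red", "d": "green"}

print(find_path_motif(toy_edges, toy_color, ["red", "blue", "red"]))
print(find_path_motif(toy_edges, toy_color, ["blue", "red", "green"]))
```

In the toy graph, the query {red, blue, red} is matched by the path a, b, c, whereas {blue, red, green} has no match: the only candidate, b, c, d, would need to traverse the edge from c to d at time 1 after arriving at c at time 2, which violates the temporal ordering.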