cgSpan: Closed Graph-Based Substructure Pattern Mining
This is an incremental improvement aimed at graph mining researchers, focused on efficient extraction of closed subgraphs.
The paper addresses frequent subgraph mining by extending gSpan to mine only closed subgraphs. The resulting algorithm, cgSpan, adds Early Termination pruning while leaving the original gSpan steps unchanged.
gSpan is a popular algorithm for mining frequent subgraphs. cgSpan (closed graph-based substructure pattern mining) is a gSpan extension that mines only closed subgraphs. A subgraph g is closed in the graph database if there is no proper frequent supergraph of g that has equivalent occurrence with g. cgSpan adds the Early Termination pruning method to the gSpan pruning methods, while leaving the original gSpan steps unchanged. cgSpan also detects and handles cases in which Early Termination should not be applied. To the best of our knowledge, cgSpan is the first publicly available implementation for closed graph mining.
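
The closedness definition above can be illustrated with a small brute-force sketch. This is not cgSpan and does not reproduce gSpan's DFS-code search or Early Termination. It rests on several simplifying assumptions: vertices carry unique labels, so the subgraph test reduces to edge-set inclusion; "equivalent occurrence" is approximated by equal support count; connectivity of patterns is not enforced. The names `DB`, `support`, `frequent_patterns`, and `closed_patterns` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations

# Toy database (assumption): each graph is a set of edges between uniquely
# labeled vertices, so "pattern is a subgraph of g" becomes "pattern <= g".
DB = [
    frozenset({("A", "B"), ("B", "C"), ("C", "D")}),
    frozenset({("A", "B"), ("B", "C")}),
    frozenset({("A", "B"), ("B", "C"), ("B", "E")}),
]


def support(pattern, db):
    """Number of database graphs that contain every edge of the pattern."""
    return sum(1 for g in db if pattern <= g)


def frequent_patterns(db, min_support):
    """Brute-force enumeration of all edge sets with support >= min_support."""
    edges = sorted(set().union(*db))
    result = {}
    for k in range(1, len(edges) + 1):
        for combo in combinations(edges, k):
            pattern = frozenset(combo)
            s = support(pattern, db)
            if s >= min_support:
                result[pattern] = s
    return result


def closed_patterns(freq):
    """Keep patterns with no proper frequent superset of equal support."""
    return {
        p: s
        for p, s in freq.items()
        if not any(p < q and s == t for q, t in freq.items())
    }


if __name__ == "__main__":
    freq = frequent_patterns(DB, min_support=2)
    for pattern, s in closed_patterns(freq).items():
        print(sorted(pattern), s)
```

With a minimum support of 2, the single edges A-B and B-C are frequent, but each has the same support as the two-edge pattern {A-B, B-C}, so only the latter is reported as closed. A closed miner therefore returns one pattern where a plain frequent miner returns three, without losing support information.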