Jianming Huang

LG
h-index3
8papers
34citations
Novelty51%
AI Score35

8 Papers

LGJul 9, 2022
Wasserstein Graph Distance Based on $L_1$-Approximated Tree Edit Distance between Weisfeiler-Lehman Subtrees

Zhongxi Fang, Jianming Huang, Xun Su et al.

The Weisfeiler-Lehman (WL) test is a widely used algorithm in graph machine learning, including graph kernels, graph metrics, and graph neural networks. However, it focuses only on the consistency of the graph, which means that it is unable to detect slight structural differences. Consequently, this limits its ability to capture structural information, which also limits the performance of existing models that rely on the WL test. This limitation is particularly severe for traditional metrics defined by the WL test, which cannot precisely capture slight structural differences. In this paper, we propose a novel graph metric called the Wasserstein WL Subtree (WWLS) distance to address this problem. Our approach leverages the WL subtree as structural information for node neighborhoods and defines node metrics using the $L_1$-approximated tree edit distance ($L_1$-TED) between WL subtrees of nodes. Subsequently, we combine the Wasserstein distance and the $L_1$-TED to define the WWLS distance, which can capture slight structural differences that may be difficult to detect using conventional metrics. We demonstrate that the proposed WWLS distance outperforms baselines in both metric validation and graph classification experiments.

LGOct 24, 2023
Anchor Space Optimal Transport as a Fast Solution to Multiple Optimal Transport Problems

Jianming Huang, Xun Su, Zhongxi Fang et al.

In machine learning, Optimal Transport (OT) theory is extensively utilized to compare probability distributions across various applications, such as graph data represented by node distributions and image data represented by pixel distributions. In practical scenarios, it is often necessary to solve multiple OT problems. Traditionally, these problems are treated independently, with each OT problem being solved sequentially. However, the computational complexity required to solve a single OT problem is already substantial, making the resolution of multiple OT problems even more challenging. Although many applications of fast solutions to OT are based on the premise of a single OT problem with arbitrary distributions, few efforts handle such multiple OT problems with multiple distributions. Therefore, we propose the anchor space optimal transport (ASOT) problem: an approximate OT problem designed for multiple OT problems. This proposal stems from our finding that in many tasks the mass transport tends to be concentrated in a reduced space from the original feature space. By restricting the mass transport to a learned anchor point space, ASOT avoids pairwise instantiations of cost matrices for multiple OT problems and simplifies the problems by canceling insignificant transports. This simplification greatly reduces its computational costs. We then prove the upper bounds of its $1$-Wasserstein distance error between the proposed ASOT and the original OT problem under different conditions. Building upon this accomplishment, we propose three methods to learn anchor spaces for reducing the approximation error. Furthermore, our proposed methods present great advantages for handling distributions of different sizes with GPU parallelization.

LGAug 17, 2025
Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models

Xun Su, Jianming Huang, Yang Yusen et al.

Inference-time scaling has achieved remarkable success in language models, yet its adaptation to diffusion models remains underexplored. We observe that the efficacy of recent Sequential Monte Carlo (SMC)-based methods largely stems from globally fitting the The reward-tilted distribution, which inherently preserves diversity during multi-modal search. However, current applications of SMC to diffusion models face a fundamental dilemma: early-stage noise samples offer high potential for improvement but are difficult to evaluate accurately, whereas late-stage samples can be reliably assessed but are largely irreversible. To address this exploration-exploitation trade-off, we approach the problem from the perspective of the search algorithm and propose two strategies: Funnel Schedule and Adaptive Temperature. These simple yet effective methods are tailored to the unique generation dynamics and phase-transition behavior of diffusion models. By progressively reducing the number of maintained particles and down-weighting the influence of early-stage rewards, our methods significantly enhance sample quality without increasing the total number of Noise Function Evaluations. Experimental results on multiple benchmarks and state-of-the-art text-to-image diffusion models demonstrate that our approach outperforms previous baselines.

LGFeb 3, 2025
Self-supervised Subgraph Neural Network With Deep Reinforcement Walk Exploration

Jianming Huang, Hiroyuki Kasai

Graph data, with its structurally variable nature, represents complex real-world phenomena like chemical compounds, protein structures, and social networks. Traditional Graph Neural Networks (GNNs) primarily utilize the message-passing mechanism, but their expressive power is limited and their prediction lacks explainability. To address these limitations, researchers have focused on graph substructures. Subgraph neural networks (SGNNs) and GNN explainers have emerged as potential solutions, but each has its limitations. SGNNs computes graph representations based on the bags of subgraphs to enhance the expressive power. However, they often rely on predefined algorithm-based sampling strategies, which is inefficient. GNN explainers adopt data-driven approaches to generate important subgraphs to provide explanation. Nevertheless, their explanation is difficult to be translated into practical improvements on GNNs. To overcome these issues, we propose a novel self-supervised framework that integrates SGNNs with the generation approach of GNN explainers, named the Reinforcement Walk Exploration SGNN (RWE-SGNN). Our approach features a sampling model trained in an explainer fashion, optimizing subgraphs to enhance model performance. To achieve a data-driven sampling approach, unlike traditional subgraph generation approaches, we propose a novel walk exploration process, which efficiently extracts important substructures, simplifying the embedding process and avoiding isomorphism problems. Moreover, we prove that our proposed walk exploration process has equivalent generation capability to the traditional subgraph generation process. Experimental results on various graph datasets validate the effectiveness of our proposed method, demonstrating significant improvements in performance and precision.

CRMar 5, 2021
Update the Root of Integrity Tree in Secure Non-Volatile Memory Systems with Low Overhead

Jianming Huang, Yu Hua

Data integrity is important for non-volatile memory (NVM) systems that maintain data even without power. The data integrity in NVM is possibly compromised by integrity attacks, which can be defended against by integrity verification via integrity trees. After NVM system failures and reboots, the integrity tree root is responsible for providing a trusted execution environment. However, the root often becomes a performance bottleneck, since updating the root requires high latency on the write critical path to propagate the modifications from leaf nodes to the root. The root and leaf nodes have to ensure the crash consistency between each other to avoid any update failures that potentially result in misreporting the attacks after system reboots. In this paper, we propose an efficient and low-latency scheme, called SCUE, to directly update the root on the SGX integrity tree (SIT) by overlooking the updates upon the intermediate tree nodes. The idea behind SCUE explores and exploits the observation that only the persistent leaf nodes and root are useful to ensure the integrity after system failures and reboots, due to the loss of the cached intermediate tree nodes. To achieve the crash consistency between root and leaf nodes, we accurately predict the updates upon the root and pre-update the root before the leaf nodes are modified. Moreover, the SIT root is difficult to be reconstructed from the leaf nodes since updating one tree node needs its parent node as input. We use a counter-summing approach to reconstructing the SIT from leaf nodes. Our evaluation results show that compared with the state-of-the-art integrity tree update schemes, our SCUE scheme delivers high performance while ensuring the system integrity.

LGDec 7, 2020
LCS Graph Kernel Based on Wasserstein Distance in Longest Common Subsequence Metric Space

Jianming Huang, Zhongxi Fang, Hiroyuki Kasai

For graph learning tasks, many existing methods utilize a message-passing mechanism where vertex features are updated iteratively by aggregation of neighbor information. This strategy provides an efficient means for graph features extraction, but obtained features after many iterations might contain too much information from other vertices, and tend to be similar to each other. This makes their representations less expressive. Learning graphs using paths, on the other hand, can be less adversely affected by this problem because it does not involve all vertex neighbors. However, most of them can only compare paths with the same length, which might engender information loss. To resolve this difficulty, we propose a new Graph Kernel based on a Longest Common Subsequence (LCS) similarity. Moreover, we found that the widely-used R-convolution framework is unsuitable for path-based Graph Kernel because a huge number of comparisons between dissimilar paths might deteriorate graph distances calculation. Therefore, we propose a novel metric space by exploiting the proposed LCS-based similarity, and compute a new Wasserstein-based graph distance in this metric space, which emphasizes more the comparison between similar paths. Furthermore, to reduce the computational cost, we propose an adjacent point merging operation to sparsify point clouds in the metric space.

LGOct 28, 2020
Graph embedding using multi-layer adjacent point merging model

Jianming Huang, Hiroyuki Kasai

For graph classification tasks, many traditional kernel methods focus on measuring the similarity between graphs. These methods have achieved great success on resolving graph isomorphism problems. However, in some classification problems, the graph class depends on not only the topological similarity of the whole graph, but also constituent subgraph patterns. To this end, we propose a novel graph embedding method using a multi-layer adjacent point merging model. This embedding method allows us to extract different subgraph patterns from train-data. Then we present a flexible loss function for feature selection which enhances the robustness of our method for different classification problems. Finally, numerical evaluations demonstrate that our proposed method outperforms many state-of-the-art methods.

CRDec 10, 2019
A Write-Friendly and Fast-Recovery Scheme for Security Metadata in NVM

Jianming Huang, Yu Hua

Non-Volatile Memories (NVMs) have attracted the attentions of academia and industry, which is expected to become the next-generation memory. However, due to the nonvolatile property, NVMs become vulnerable to attacks and require security mechanisms, e.g., counter mode encryption and integrity tree, which introduce the security metadata. NVMs promise to recover these security metadata after a system crash, including the counter and integrity tree. However, unlike merkle tree reconstructed from user data, recovering SGX integrity tree (SIT) has to address the challenges from unique top-down hierarchical dependency. Moreover, writing overhead and recovery time are important metrics for evaluating persistent memory system due to the high costs of NVM writes and IT downtime. How to recover the security metadata, i.e., counter blocks and integrity tree nodes, with low write overhead and short recovery time, becomes much important. To provide a fast recovery scheme with low write overhead, we propose STAR, a cost-efficient scheme for recovering counter blocks and SGX integrity tree nodes after crashes. For fast recovery and verification, STAR synergizes the MAC and correct data, uses bitmap lines in ADR to indicate the location of stale node and constructs a cached merkle tree to verify the correctness of the recovery process. Moreover, STAR uses a multi-layer index to speed up the recovery process. STAR also allows different configurations to meet adaptive requirements for write overhead and recovery time. Our evaluation results show that the proposed STAR reduces the number of memory writes by up to 87\% compared with state-of-the-art work, Anubis, which needs extra 1x memory writes. For a 4MB security metadata cache, STAR needs 0.039s/0.023s/0.004s in three different configurations to recover the metadata cache while Anubis needs 0.020s.