Hans Hao-Hsun Hsu

LG
h-index4
6papers
58citations
Novelty51%
AI Score46

6 Papers

LGOct 12, 2022
What Makes Graph Neural Networks Miscalibrated?

Hans Hao-Hsun Hsu, Yuesong Shen, Christian Tomani et al.

Given the importance of getting calibrated predictions and reliable uncertainty estimations, various post-hoc calibration methods have been developed for neural networks on standard multi-class classification tasks. However, these methods are not well suited for calibrating graph neural networks (GNNs), which presents unique challenges such as accounting for the graph structure and the graph-induced correlations between the nodes. In this work, we conduct a systematic study on the calibration qualities of GNN node predictions. In particular, we identify five factors which influence the calibration of GNNs: general under-confident tendency, diversity of nodewise predictive distributions, distance to training nodes, relative confidence level, and neighborhood similarity. Furthermore, based on the insights from this study, we design a novel calibration method named Graph Attention Temperature Scaling (GATS), which is tailored for calibrating graph neural networks. GATS incorporates designs that address all the identified influential factors and produces nodewise temperature scaling using an attention-based architecture. GATS is accuracy-preserving, data-efficient, and expressive at the same time. Our experiments empirically verify the effectiveness of GATS, demonstrating that it can consistently achieve state-of-the-art calibration results on various graph datasets for different GNN backbones.

LGOct 27, 2022
A Graph Is More Than Its Nodes: Towards Structured Uncertainty-Aware Learning on Graphs

Hans Hao-Hsun Hsu, Yuesong Shen, Daniel Cremers

Current graph neural networks (GNNs) that tackle node classification on graphs tend to only focus on nodewise scores and are solely evaluated by nodewise metrics. This limits uncertainty estimation on graphs since nodewise marginals do not fully characterize the joint distribution given the graph structure. In this work, we propose novel edgewise metrics, namely the edgewise expected calibration error (ECE) and the agree/disagree ECEs, which provide criteria for uncertainty estimation on graphs beyond the nodewise setting. Our experiments demonstrate that the proposed edgewise metrics can complement the nodewise results and yield additional insights. Moreover, we show that GNN models which consider the structured prediction problem on graphs tend to have better uncertainty estimations, which illustrates the benefit of going beyond the nodewise setting.

LGFeb 5Code
Accelerated Sequential Flow Matching: A Bayesian Filtering Perspective

Yinan Huang, Hans Hao-Hsun Hsu, Junran Wang et al.

Sequential prediction from streaming observations is a fundamental problem in stochastic dynamical systems, where inherent uncertainty often leads to multiple plausible futures. While diffusion and flow-matching models are capable of modeling complex, multi-modal trajectories, their deployment in real-time streaming environments typically relies on repeated sampling from a non-informative initial distribution, incurring substantial inference latency and potential system backlogs. In this work, we introduce Sequential Flow Matching, a principled framework grounded in Bayesian filtering. By treating streaming inference as learning a probability flow that transports the predictive distribution from one time step to the next, our approach naturally aligns with the recursive structure of Bayesian belief updates. We provide theoretical justification that initializing generation from the previous posterior offers a principled warm start that can accelerate sampling compared to naïve re-sampling. Across a wide range of forecasting, decision-making and state estimation tasks, our method achieves performance competitive with full-step diffusion while requiring only one or very few sampling steps, therefore with faster sampling. It suggests that framing sequential inference via Bayesian filtering provides a new and principled perspective towards efficient real-time deployment of flow-based models. Our code is available at https://github.com/Graph-COM/Sequential\_Flow\_Matching.

90.9SIMar 20Code
Can LLM Agents Simulate Dynamic Networks? A Case Study on Email Networks with Phishing Synthesis

Siqi Miao, Ziyang Chen, Yuhong Luo et al.

While Large Language Model (LLM) multi-agent systems (MAS) offer a transformative approach to simulating human behavior in complex systems, it remains largely unexplored whether these simulations can replicate realistic structural and temporal dynamics from a dynamic network perspective. Our evaluation indicates that existing frameworks excel at generating plausible micro-level interactions but fail to capture the emergent, macroscopic topologies necessary for domains that rely on realistic network dynamics, such as modeling information propagation and cybersecurity threats. To bridge this gap, we introduce two easily integrable extensions to simulation frameworks to ensure they preserve macroscopic network fidelity: 1) augmenting LLM agents with data-driven event triggers to organically sustain long-horizon interactions, and 2) integrating Hawkes processes to accurately model temporal activation dynamics. Our approach allows LLM MAS to capture both plausible micro-level patterns and macroscopic topologies. We further demonstrate the utility of this framework in synthesizing realistic phishing campaigns within evolving communication networks. The study reveals how threats exploit structural vulnerabilities, highlighting the potential of our framework for developing next-generation defenses. Our code is available at https://github.com/Graph-COM/NSL.

LGFeb 25, 2025
Structural Alignment Improves Graph Test-Time Adaptation

Hans Hao-Hsun Hsu, Shikun Liu, Han Zhao et al.

Graph-based learning excels at capturing interaction patterns in diverse domains like recommendation, fraud detection, and particle physics. However, its performance often degrades under distribution shifts, especially those altering network connectivity. Current methods to address these shifts typically require retraining with the source dataset, which is often infeasible due to computational or privacy limitations. We introduce Test-Time Structural Alignment (TSA), a novel algorithm for Graph Test-Time Adaptation (GTTA) that aligns graph structures during inference without accessing the source data. Grounded in a theoretical understanding of graph data distribution shifts, TSA employs three synergistic strategies: uncertainty-aware neighborhood weighting to accommodate neighbor label distribution shifts, adaptive balancing of self-node and aggregated neighborhood representations based on their signal-to-noise ratio, and decision boundary refinement to correct residual label and feature shifts. Extensive experiments on synthetic and real-world datasets demonstrate TSA's consistent outperformance of both non-graph TTA methods and state-of-the-art GTTA baselines.

LGNov 27, 2021
Automated Antenna Testing Using Encoder-Decoder-based Anomaly Detection

Hans Hao-Hsun Hsu, Jiawen Xu, Ravi Sama et al.

We propose a new method for testing antenna arrays that records the radiating electromagnetic (EM) field using an absorbing material and evaluating the resulting thermal image series through an AI using a conditional encoder-decoder model. Given the power and phase of the signals fed into each array element, we are able to reconstruct normal sequences through our trained model and compare it to the real sequences observed by a thermal camera. These thermograms only contain low-level patterns such as blobs of various shapes. A contour-based anomaly detector can then map the reconstruction error matrix to an anomaly score to identify faulty antenna arrays and increase the classification F-measure (F-M) by up to 46%. We show our approach on the time series thermograms collected by our antenna testing system. Conventionally, a variational autoencoder (VAE) learning observation noise may yield better results than a VAE with a constant noise assumption. However, we demonstrate that this is not the case for anomaly detection on such low-level patterns for two reasons. First, the baseline metric reconstruction probability, which incorporates the learned observation noise, fails to differentiate anomalous patterns. Second, the area under the receiver operating characteristic (ROC) curve of a VAE with a lower observation noise assumption achieves 11.83% higher than that of a VAE with learned noise.