Hsing-Huan Chung

LG
h-index44
4papers
5citations
Novelty55%
AI Score46

4 Papers

LGMay 13
MILM: Large Language Models for Multimodal Irregular Time Series with Informative Sampling

Hsing-Huan Chung, Shijun Li, Yoav Wald et al.

Multimodal irregular time series (MITS) consist of asynchronous and irregularly sampled observations from heterogeneous numerical and textual channels. In healthcare, for example, patients' electronic health records (EHR) include irregular lab measurements and clinical notes. The irregular timing and channel patterns of observations carry predictive signal alongside the numerical values and textual content. LLMs are natural candidates for processing such heterogeneous data, given their extensive pretrained knowledge spanning textual and numerical domains. We introduce MILM (Multimodal Irregular time series Language Model), which represents MITS as time-ordered triplets in Extensible Markup Language (XML) format and fine-tunes an LLM through a two-stage strategy for MITS classification. The first stage trains on value-redacted MITS to predict from sampling patterns alone, and the second stage trains on full MITS to jointly model sampling patterns and observed values. Our two-stage model (MILM-2S) and its single-stage counterpart (MILM-Direct) achieve the best and second-best average performance on multiple EHR datasets. Further value redaction evaluations confirm that sampling patterns carry predictive signal and that MILM-2S learns to exploit them. In the value pending evaluation we introduce, where some values are unavailable at prediction time, MILM-2S outperforms MILM-Direct by a larger margin compared to standard evaluation. For MILM-2S, preserving the time and channel of value-pending observations as additional sampling information further improves in-hospital mortality prediction.

LGApr 1, 2024Code
Novel Node Category Detection Under Subpopulation Shift

Hsing-Huan Chung, Shravan Chaudhari, Yoav Wald et al.

In real-world graph data, distribution shifts can manifest in various ways, such as the emergence of new categories and changes in the relative proportions of existing categories. It is often important to detect nodes of novel categories under such distribution shifts for safety or insight discovery purposes. We introduce a new approach, Recall-Constrained Optimization with Selective Link Prediction (RECO-SLIP), to detect nodes belonging to novel categories in attributed graphs under subpopulation shifts. By integrating a recall-constrained learning framework with a sample-efficient link prediction mechanism, RECO-SLIP addresses the dual challenges of resilience against subpopulation shifts and the effective exploitation of graph structure. Our extensive empirical evaluation across multiple graph datasets demonstrates the superior performance of RECO-SLIP over existing methods. The experimental code is available at https://github.com/hsinghuan/novel-node-category-detection.

LGApr 10, 2025
Between Linear and Sinusoidal: Rethinking the Time Encoder in Dynamic Graph Learning

Hsing-Huan Chung, Shravan Chaudhari, Xing Han et al.

Dynamic graph learning is essential for applications involving temporal networks and requires effective modeling of temporal relationships. Seminal attention-based models like TGAT and DyGFormer rely on sinusoidal time encoders to capture temporal dependencies between edge events. Prior work justified sinusoidal encodings because their inner products depend on the time spans between events, which are crucial features for modeling inter-event relations. However, sinusoidal encodings inherently lose temporal information due to their many-to-one nature and therefore require high dimensions. In this paper, we rigorously study a simpler alternative: the linear time encoder, which avoids temporal information loss caused by sinusoidal functions and reduces the need for high-dimensional time encoders. We show that the self-attention mechanism can effectively learn to compute time spans between events from linear time encodings and extract relevant temporal patterns. Through extensive experiments on six dynamic graph datasets, we demonstrate that the linear time encoder improves the performance of TGAT and DyGFormer in most cases. Moreover, the linear time encoder can lead to significant savings in model parameters with minimal performance loss. For example, compared to a 100-dimensional sinusoidal time encoder, TGAT with a 2-dimensional linear time encoder saves 43% of parameters and achieves higher average precision on five datasets. While both encoders can be used simultaneously, our study highlights the often-overlooked advantages of linear time features in modern dynamic graph models. These findings can positively impact the design choices of various dynamic graph learning architectures and eventually benefit temporal network applications such as recommender systems, communication networks, and traffic forecasting.

LGSep 30, 2025
Guiding Mixture-of-Experts with Temporal Multimodal Interactions

Xing Han, Hsing-Huan Chung, Joydeep Ghosh et al.

Mixture-of-Experts (MoE) architectures have become pivotal for large-scale multimodal models. However, their routing mechanisms typically overlook the informative, time-varying interaction dynamics between modalities. This limitation hinders expert specialization, as the model cannot explicitly leverage intrinsic modality relationships for effective reasoning. To address this, we propose a novel framework that guides MoE routing using quantified temporal interaction. A multimodal interaction-aware router learns to dispatch tokens to experts based on the nature of their interactions. This dynamic routing encourages experts to acquire generalizable interaction-processing skills rather than merely learning task-specific features. Our framework builds on a new formulation of temporal multimodal interaction dynamics, which are used to guide expert routing. We first demonstrate that these temporal multimodal interactions reveal meaningful patterns across applications, and then show how they can be leveraged to improve both the design and performance of MoE-based models. Comprehensive experiments on challenging multimodal benchmarks validate our approach, demonstrating both enhanced performance and improved interpretability.