Ziwen Kan

LG
h-index11
7papers
3citations
Novelty45%
AI Score50

7 Papers

AIJun 4
TRACE: A Temporal Conditional Estimation for Multimodal Time Series Foundation Models

Ziwen Kan, Yishuo Chen, Kecheng Li et al.

Time series foundation models (TS-FMs) aim to learn generalizable temporal representations that can be adapted to a wide range of downstream tasks. In real-world multimodal settings, time series are frequently affected by temporal misalignment and partial modality missingness, where different modalities are observed at heterogeneous time scales or are partially absent. Existing approaches typically rely on naive imputation or masking strategies, which fail to account for cross-modal dependencies and often lead to misaligned or degraded representations. We propose TRACE, a conditional estimation paradigm for multimodal time series foundation model pipelines under missingness and irregular sampling, allowing incomplete target modalities to be systematically inferred from available auxiliary modalities. We evaluate TRACE on diverse multimodal benchmarks spanning healthcare and affective computing, including the MIMIC-IV clinical dataset and the CMU-MOSI and CMU-MOSEI benchmarks for multimodal sentiment analysis. Across a range of downstream prediction tasks and missing-modality settings, TRACE consistently outperforms prior multimodal fusion approaches, demonstrating improved robustness to severe modality missingness and more reliable cross-modal representations.

LGJun 4
PAMF: Prior-Aware Multimodal Fusion for Incomplete Time Series Data

Ziwen Kan, Wugeng Zheng, Tianlong Chen et al.

In healthcare, multimodal time series tasks often operate on incomplete observations in practice, for example when ECG segments are lost because electrodes detach or an entire respiratory channel is unavailable during overnight monitoring. Such missingness typically appears in two structurally distinct patterns: within-modality missing, where values are absent within an otherwise observed modality, and modality-level missing, where an entire modality is unavailable. Existing methods typically represent unobserved data implicitly through masks or missing embeddings, without learning instance-specific missing information, and most are designed for only one missingness pattern. A natural approach is to explicitly estimate the missing data; however, existing imputation methods treat missingness uniformly despite their different structural priors, and the imputation process is often isolated from downstream tasks, preventing downstream tasks from guiding imputation toward more informative representations. To address these limitations, we present PAMF, a multimodal time-series framework that explicitly handles different missingness patterns while coupling imputation with downstream prediction through prior-aware flow matching and weight sharing. Specifically, the method initializes the flow-matching source state with type-specific priors to distinguish two missing types. It further connects imputation and classification through architecturally matched encoders with weight sharing, transferring task-relevant representations into the imputation process. Experiments on multiple multimodal healthcare time-series benchmarks show that the proposed method achieves the strongest overall downstream performance across diverse datasets and missing settings compared with existing baselines.

LGApr 25Code
Conditional Imputation for Within-Modality Missingness in Multi-Modal Federated Learning

Wugeng Zheng, Ziwen Kan, Katie Wang et al.

Multimodal Federated Learning (MMFL) enables privacy-preserving collaborative training, but real-world clinical applications often suffer from within-modality missingness caused by sensor intermittency or irregular sampling. Existing methods implicitly represent unobserved data via architectural alignment or missing embeddings, often failing to recover the true distribution and yielding sub-optimal performance. We propose CondI, a federated framework explicitly addressing this missingness using conditional diffusion models. CondI employs a two-phase training pipeline: first, imputing unobserved temporal components using available multimodal context and conditional embeddings; second, optimizing modality-specific extractors and joint embedding spaces. During inference, imputed raw data pass through trained extractors to generate robust features, providing a holistic representation for downstream tasks. Explicit data imputation ensures models operate on complete semantic structures, significantly enhancing resilience against severe data incompleteness. Experiments on three clinical datasets (PTB-XL, SLEEP-EDF, MIMIC-IV) demonstrate CondI achieves comparable results to state-of-the-art baselines. Code: https://github.com/ZhengWugeng/CondI

LGAug 22, 2024
Benchmarking Counterfactual Interpretability in Deep Learning Models for Time Series Classification

Ziwen Kan, Shahbaz Rezaei, Xin Liu

The popularity of deep learning methods in the time series domain boosts interest in interpretability studies, including counterfactual (CF) methods. CF methods identify minimal changes in instances to alter the model predictions. Despite extensive research, no existing work benchmarks CF methods in the time series domain. Additionally, the results reported in the literature are inconclusive due to the limited number of datasets and inadequate metrics. In this work, we redesign quantitative metrics to accurately capture desirable characteristics in CFs. We specifically redesign the metrics for sparsity and plausibility and introduce a new metric for consistency. Combined with validity, generation time, and proximity, we form a comprehensive metric set. We systematically benchmark 6 different CF methods on 20 univariate datasets and 10 multivariate datasets with 3 different classifiers. Results indicate that the performance of CF methods varies across metrics and among different models. Finally, we provide case studies and a guideline for practical usage.

LGMay 13
MuteBench: Modality Unavailability Tolerance Evaluation for Incomplete Multimodal Fusion

Wugeng Zheng, Ziwen Kan, Tianlong Chen et al.

Multimodal physiological data powers clinical AI systems from intensive care units to wearable devices, but sensors routinely fail in practice. Two failure modes are common: modality missing, where an entire channel is absent, and within-modality missing, where a contiguous time segment is lost. No existing benchmark evaluates multiple fusion architectures under both failure modes at controlled severity levels across diverse clinical datasets. We present MuteBench, a benchmark covering 9 datasets from 7 clinical domains, 6 fusion architectures, and 2 missing-data modes over 125,000 samples. Through this benchmark, we find that architecture family is the strongest predictor of robustness, outweighing parameter count. Channel-independent models tolerate modality missing well but can be sensitive to within-modality missing, especially on short sequences. Curriculum modality dropout protects reliably only up to the maximum dropout rate used in training. We also find that channel count, sequence length, and modality alignment jointly determine which failure mode poses the greater threat. Finally, a PTB-XL case study suggests that diffusion-based imputation can improve downstream classification under within-modality missing, with the largest gains for models whose expert routing is most sensitive to corrupted inputs, though broader validation across datasets remains an open direction. MuteBench provides practitioners with concrete guidance for both selecting existing architectures and informing the design of future robust multimodal fusion methods.

LGMay 13, 2025Code
Implet: A Post-hoc Subsequence Explainer for Time Series Models

Fanyu Meng, Ziwen Kan, Shahbaz Rezaei et al.

Explainability in time series models is crucial for fostering trust, facilitating debugging, and ensuring interpretability in real-world applications. In this work, we introduce Implet, a novel post-hoc explainer that generates accurate and concise subsequence-level explanations for time series models. Our approach identifies critical temporal segments that significantly contribute to the model's predictions, providing enhanced interpretability beyond traditional feature-attribution methods. Based on it, we propose a cohort-based (group-level) explanation framework designed to further improve the conciseness and interpretability of our explanations. We evaluate Implet on several standard time-series classification benchmarks, demonstrating its effectiveness in improving interpretability. The code is available at https://github.com/LbzSteven/implet

LGJun 17, 2024Code
ChaosMining: A Benchmark to Evaluate Post-Hoc Local Attribution Methods in Low SNR Environments

Ge Shi, Ziwen Kan, Jason Smucny et al.

In this study, we examine the efficacy of post-hoc local attribution methods in identifying features with predictive power from irrelevant ones in domains characterized by a low signal-to-noise ratio (SNR), a common scenario in real-world machine learning applications. We developed synthetic datasets encompassing symbolic functional, image, and audio data, incorporating a benchmark on the {\it (Model \(\times\) Attribution\(\times\) Noise Condition)} triplet. By rigorously testing various classic models trained from scratch, we gained valuable insights into the performance of these attribution methods in multiple conditions. Based on these findings, we introduce a novel extension to the notable recursive feature elimination (RFE) algorithm, enhancing its applicability for neural networks. Our experiments highlight its strengths in prediction and feature selection, alongside limitations in scalability. Further details and additional minor findings are included in the appendix, with extensive discussions. The codes and resources are available at \href{https://github.com/geshijoker/ChaosMining/}{URL}.