LG · Jul 19, 2024 · Code
SurvReLU: Inherently Interpretable Survival Analysis via Deep ReLU Networks
Xiaotong Sun, Peijie Qiu, Shengfan Zhang
Survival analysis models time-to-event distributions with censorship. Recently, deep survival models using neural networks have dominated due to their representational power and state-of-the-art performance. However, their "black-box" nature hinders interpretability, which is crucial in real-world applications. In contrast, "white-box" tree-based survival models offer better interpretability but struggle to converge to global optima due to greedy expansion. In this paper, we bridge the gap between previous deep survival models and traditional tree-based survival models through deep rectified linear unit (ReLU) networks. We show that a deliberately constructed deep ReLU network (SurvReLU) can combine the interpretability of tree-based structures with the representational power of deep survival models. Empirical studies on both simulated and real survival benchmark datasets show the effectiveness of the proposed SurvReLU in terms of performance and interpretability. The code is available at https://github.com/xs018/SurvReLU.
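The link between ReLU networks and tree-like structures rests on a general property: a ReLU network is piecewise linear, and the on/off state of each hidden unit behaves like an oblique split of the form w·x + b > 0. The sketch below illustrates only that property on a tiny one-hidden-layer network. It is not the SurvReLU construction from the paper; the weights `W`, `b`, `v`, `c` and the helper names are hypothetical, chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical weights for a 2-input, 2-hidden-unit, 1-output ReLU network.
W: List[List[float]] = [[1.0, -1.0], [0.5, 1.0]]  # one row per hidden unit
b: List[float] = [0.0, -1.0]                      # hidden biases
v: List[float] = [2.0, -3.0]                      # output weights
c: float = 0.5                                    # output bias


def dot(a: List[float], x: List[float]) -> float:
    return sum(ai * xi for ai, xi in zip(a, x))


def forward(x: List[float]) -> float:
    """Ordinary forward pass: affine layer, ReLU, affine output."""
    hidden = [max(0.0, dot(w, x) + bi) for w, bi in zip(W, b)]
    return dot(v, hidden) + c


def activation_pattern(x: List[float]) -> Tuple[bool, ...]:
    """Which hidden units are active; each entry is an oblique split on x."""
    return tuple(dot(w, x) + bi > 0 for w, bi in zip(W, b))


def local_affine(pattern: Tuple[bool, ...]) -> Tuple[List[float], float]:
    """Affine function the network computes inside the region ('leaf')
    identified by the given activation pattern."""
    coef = [0.0] * len(W[0])
    bias = c
    for active, w, bi, vj in zip(pattern, W, b, v):
        if active:
            coef = [cj + vj * wj for cj, wj in zip(coef, w)]
            bias += vj * bi
    return coef, bias


x = [2.0, 1.0]
pattern = activation_pattern(x)
coef, bias = local_affine(pattern)
# Inside its region, the network output equals the local affine function.
print(pattern, forward(x), dot(coef, x) + bias)
```

Reading off the activation pattern and the resulting affine function per region is what makes such a network inspectable in a tree-like way: the pattern plays the role of a root-to-leaf path, and the affine function plays the role of the leaf prediction.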
SI · May 9
Substitution or Complement? Uncovering the Interplay between Ride-hailing Services and Public Transit
Zhicheng Jin, Xiaotong Sun, Li Zhen et al.
The literature on transportation network companies (TNCs), also known as ride-hailing services, has often characterized these service providers as predominantly substitutive to public transit (PT). However, as TNC markets expand and mature, the complementary and substitutive relationships with PT may shift. To explore whether such a transformation is occurring, this study collected travel data from 96,716 ride-hailing vehicles during September 2022 in Shanghai, a city characterized by an increasingly saturated TNC market. An enhanced data-driven framework is proposed to classify TNC-PT relationships into four types: first-mile complementary, last-mile complementary, substitutive, and independent. Our findings reveal a substantial increase in the complementary ratio (9.22%) and a relative decline in the substitutive ratio (9.06%) compared to previous studies. Furthermore, to examine the nonlinear impact of various influential factors on these ratios, a machine learning method integrating categorical boosting (CatBoost) and Shapley additive explanations (SHAP) is proposed. The results show significant nonlinear effects in some variables, including the distance to the nearest metro station and the density of bus stops.
ML · Sep 25, 2023
NSOTree: Neural Survival Oblique Tree
Xiaotong Sun, Peijie Qiu
Survival analysis is a statistical method for studying the time until a specific event of interest occurs, known as time-to-event data, which is characterized by censorship. Recently, deep learning-based methods have dominated this field due to their representational capacity and state-of-the-art performance. However, the black-box nature of deep neural networks hinders their interpretability, which is desired in real-world survival applications but has been largely neglected by previous works. In contrast, conventional tree-based methods are advantageous with respect to interpretability, but they consistently fail to approximate the global optima due to greedy expansion. In this paper, we leverage the strengths of both neural networks and tree-based methods, capitalizing on their ability to approximate intricate functions while maintaining interpretability. To this end, we propose a Neural Survival Oblique Tree (NSOTree) for survival analysis. Specifically, the NSOTree is derived from the ReLU network and can be easily incorporated into existing survival models in a plug-and-play fashion. Evaluations on both simulated and real survival datasets demonstrate the effectiveness of the proposed method in terms of performance and interpretability.
IV · Jul 10, 2025 · Code
Cracking Instance Jigsaw Puzzles: An Alternative to Multiple Instance Learning for Whole Slide Image Analysis
Xiwen Chen, Peijie Qiu, Wenhui Zhu et al.
While multiple instance learning (MIL) has been shown to be a promising approach for histopathological whole slide image (WSI) analysis, its reliance on permutation invariance significantly limits its capacity to effectively uncover semantic correlations between instances within WSIs. Based on our empirical and theoretical investigations, we argue that approaches that are not permutation-invariant but better capture spatial correlations between instances can offer more effective solutions. In light of these findings, we propose a novel alternative to existing MIL for WSI analysis by learning to restore the order of instances from their randomly shuffled arrangement. We term this task cracking an instance jigsaw puzzle, in which semantic correlations between instances are uncovered. To tackle the instance jigsaw puzzles, we propose a novel Siamese network solution, which is theoretically justified by optimal transport theory. We validate the proposed method on WSI classification and survival prediction tasks, where it outperforms recent state-of-the-art MIL competitors. The code is available at https://github.com/xiwenc1/MIL-JigsawPuzzles.
LG · Dec 29, 2024
Multimodal Variational Autoencoder: a Barycentric View
Peijie Qiu, Wenhui Zhu, Sayantan Kumar et al.
Multiple signal modalities, such as vision and sound, are naturally present in real-world phenomena. Recently, there has been growing interest in learning generative models, in particular variational autoencoders (VAEs), for multimodal representation learning, especially in the case of missing modalities. The primary goal of these models is to learn modality-invariant and modality-specific representations that characterize information across multiple modalities. Previous attempts at multimodal VAEs approach this mainly through the lens of experts, aggregating unimodal inference distributions with a product of experts (PoE), a mixture of experts (MoE), or a combination of both. In this paper, we provide an alternative generic and theoretical formulation of multimodal VAEs through the lens of barycenters. We first show that PoE and MoE are specific instances of barycenters, derived by minimizing the asymmetric weighted KL divergence to unimodal inference distributions. Our novel formulation extends these two barycenters to a more flexible choice by considering different types of divergences. In particular, we explore the Wasserstein barycenter defined by the 2-Wasserstein distance, which better preserves the geometry of unimodal distributions by capturing both modality-specific and modality-invariant representations compared to KL divergence. Empirical studies on three multimodal benchmarks demonstrate the effectiveness of the proposed method.
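To make the three aggregation rules concrete, the sketch below computes them for one-dimensional Gaussian experts, where each has a closed form. It assumes weights that sum to 1 and uses the standard results that the weighted geometric mean (a weighted PoE) minimizes the weighted sum of KL(q || p_i), the mixture (MoE) minimizes the weighted sum of KL(p_i || q), and the 2-Wasserstein barycenter of 1-D Gaussians averages means and standard deviations. The function names are hypothetical, and the paper's exact parameterization may differ.

```python
import math
from typing import List, Tuple

Gaussian = Tuple[float, float]  # (mean, standard deviation)


def weighted_poe(experts: List[Gaussian], weights: List[float]) -> Gaussian:
    """Weighted geometric mean of Gaussians: precisions add, and the mean
    is the precision-weighted average of the expert means."""
    precision = sum(w / s ** 2 for (_, s), w in zip(experts, weights))
    mean = sum(w * m / s ** 2 for (m, s), w in zip(experts, weights)) / precision
    return mean, math.sqrt(1.0 / precision)


def moe_moments(experts: List[Gaussian], weights: List[float]) -> Gaussian:
    """Mean and standard deviation of the mixture. The mixture itself is
    generally not Gaussian; only its first two moments are returned."""
    mean = sum(w * m for (m, _), w in zip(experts, weights))
    second = sum(w * (s ** 2 + m ** 2) for (m, s), w in zip(experts, weights))
    return mean, math.sqrt(second - mean ** 2)


def wasserstein_barycenter(experts: List[Gaussian], weights: List[float]) -> Gaussian:
    """2-Wasserstein barycenter of 1-D Gaussians: a Gaussian whose mean and
    standard deviation are the weighted averages of the experts' values."""
    mean = sum(w * m for (m, _), w in zip(experts, weights))
    std = sum(w * s for (_, s), w in zip(experts, weights))
    return mean, std


# Two hypothetical unimodal posteriors with equal weights.
experts = [(0.0, 1.0), (2.0, 3.0)]
weights = [0.5, 0.5]
print("PoE:", weighted_poe(experts, weights))
print("MoE moments:", moe_moments(experts, weights))
print("W2 barycenter:", wasserstein_barycenter(experts, weights))
```

In this toy case the PoE result is pulled toward the more confident expert, the mixture spreads out to cover both, and the Wasserstein barycenter interpolates both location and scale, which is one way to read the claim about preserving the geometry of the unimodal distributions.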