Xinya Wang

CV
h-index: 21
5 papers
5 citations
Novelty: 51%
AI Score: 38

5 Papers

CV · Nov 13, 2025
Utility of Pancreas Surface Lobularity as a CT Biomarker for Opportunistic Screening of Type 2 Diabetes

Tejas Sudharshan Mathai, Anisa V. Prasad, Xinya Wang et al.

Type 2 Diabetes Mellitus (T2DM) is a chronic metabolic disease that affects millions of people worldwide. Early detection is crucial because T2DM can alter pancreas function through morphological changes and increased deposition of ectopic fat, eventually leading to organ damage. While studies have shown an association between T2DM and pancreas volume and fat content, the role of increased pancreatic surface lobularity (PSL) in patients with T2DM has not been fully investigated. In this pilot work, we propose a fully automated approach to delineate the pancreas and other abdominal structures, derive CT imaging biomarkers, and opportunistically screen for T2DM. Four deep learning-based models were used to segment the pancreas in an internal dataset of 584 patients (297 males, 437 non-diabetic, age: 45 ± 15 years). PSL was automatically detected and was higher for diabetic patients (4.26 ± 8.32) than for non-diabetic patients (3.19 ± 3.62; p = 0.01). The PancAP model achieved the highest Dice score of 0.79 ± 0.17 and the lowest ASSD error of 1.94 ± 2.63 mm (p < 0.05). For predicting T2DM, a multivariate model trained with CT biomarkers attained 0.90 AUC, 66.7% sensitivity, and 91.9% specificity. Our results suggest that PSL is useful for T2DM screening and could potentially help predict the early onset of T2DM.
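
The Dice score, sensitivity, and specificity reported above follow standard definitions. The sketch below is only an illustration of those definitions on toy data, not the authors' evaluation code; the function names `dice_score` and `sensitivity_specificity` and the flat-list mask representation are assumptions made for the example.

```python
def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention chosen here (an assumption): two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for binary labels (1 = diabetic)."""
    tp = fn = tn = fp = 0
    for p, a in zip(predicted, actual):
        if a and p:
            tp += 1
        elif a and not p:
            fn += 1
        elif not a and p:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy data, not taken from the study.
    print(dice_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
    print(sensitivity_specificity([1, 0, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0]))
```

A sensitivity well below the specificity, as in the reported 66.7% vs. 91.9%, means the classifier rarely flags non-diabetic patients but misses a notable share of diabetic ones.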

CV · Apr 8, 2024
Detecting Every Object from Events

Haitian Zhang, Chang Xu, Xinya Wang et al.

Object detection is critical in autonomous driving, and it is more practical yet challenging to localize objects of unknown categories: an endeavour known as Class-Agnostic Object Detection (CAOD). Existing studies on CAOD predominantly rely on ordinary cameras, but these frame-based sensors usually have high latency and limited dynamic range, leading to safety risks in real-world scenarios. In this study, we turn to a new modality enabled by the so-called event camera, characterized by its sub-millisecond latency and high dynamic range, for robust CAOD. We propose Detecting Every Object in Events (DEOE), an approach tailored for achieving high-speed, class-agnostic open-world object detection in event-based vision. Built upon a fast event-based backbone, the recurrent vision transformer, we jointly consider the spatial and temporal consistencies to identify potential objects. The discovered potential objects are assimilated as soft positive samples to avoid being suppressed as background. Moreover, we introduce a disentangled objectness head to separate the foreground-background classification and novel object discovery tasks, enhancing the model's generalization in localizing novel objects while maintaining a strong ability to filter out the background. Extensive experiments confirm the superiority of our proposed DEOE in comparison with three strong baseline methods that integrate the state-of-the-art event-based object detector with advancements in RGB-based CAOD. Our code is available at https://github.com/Hatins/DEOE.

IV · Jan 23, 2025
Segment-and-Classify: ROI-Guided Generalizable Contrast Phase Classification in CT Using XGBoost

Benjamin Hou, Tejas Sudharshan Mathai, Pritam Mukherjee et al.

Purpose: To automate contrast phase classification in CT using organ-specific features extracted from a widely used segmentation tool with a lightweight decision tree classifier. Materials and Methods: This retrospective study utilized three public CT datasets from separate institutions. The phase prediction model was trained on the WAW-TACE (median age: 66 [60, 73]; 185 males) dataset, and externally validated on the VinDr-Multiphase (146 males; 63 females; 56 unknown) and C4KC-KiTS (median age: 61 [50, 68]; 123 males) datasets. Contrast phase classification was performed using organ-specific features extracted by TotalSegmentator, followed by prediction using a gradient-boosted decision tree classifier. Results: On the VinDr-Multiphase dataset, the phase prediction model achieved the highest or comparable AUCs across all phases (>0.937), with superior F1-scores in the non-contrast (0.994), arterial (0.937), and delayed (0.718) phases. Statistical testing indicated significant performance differences only in the arterial and delayed phases (p<0.05). On the C4KC-KiTS dataset, the phase prediction model achieved the highest AUCs across all phases (>0.991), with superior F1-scores in arterial/venous (0.968) and delayed (0.935) phases. Statistical testing confirmed significant improvements over all baseline models in these two phases (p<0.05). Performance in the non-contrast class, however, was comparable across all models, with no statistically significant differences observed (p>0.05). Conclusion: The lightweight model demonstrated strong performance relative to all baseline models, and exhibited robust generalizability across datasets from different institutions.
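
To make the "organ-specific features" step concrete, here is a minimal sketch of one plausible feature: the mean attenuation (in Hounsfield units) inside each segmented organ. The abstract does not say which features were used, so the choice of mean HU, the function name `organ_features`, and the label IDs are all assumptions for illustration; they do not reflect TotalSegmentator's actual label map or the authors' pipeline.

```python
from statistics import mean


def organ_features(hu_values, labels, organ_ids):
    """Mean HU per organ from a flattened CT volume and its label mask.

    hu_values: voxel intensities in Hounsfield units (flattened).
    labels:    integer organ label per voxel (0 = background).
    organ_ids: hypothetical mapping of organ name -> label ID.
    Organs absent from the mask get None so a downstream tree-based
    classifier could treat them as missing values.
    """
    if len(hu_values) != len(labels):
        raise ValueError("volume and mask must have the same length")
    features = {}
    for name, organ_id in organ_ids.items():
        voxels = [hu for hu, lab in zip(hu_values, labels) if lab == organ_id]
        features[name] = mean(voxels) if voxels else None
    return features


if __name__ == "__main__":
    # Toy volume: bright aorta relative to liver, as one might expect
    # in an arterial-phase scan. Values are invented.
    hu = [40.0, 60.0, 200.0, 220.0, 50.0]
    mask = [1, 1, 2, 2, 0]
    ids = {"liver": 1, "aorta": 2, "kidney": 3}  # hypothetical label IDs
    print(organ_features(hu, mask, ids))
```

A feature dictionary like this, computed per scan, is the kind of tabular input a gradient-boosted decision tree classifier consumes; the relative enhancement of vessels and organs is what distinguishes one contrast phase from another.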

IV · Jan 23, 2025
Leveraging Multiphase CT for Quality Enhancement of Portal Venous CT: Utility for Pancreas Segmentation

Xinya Wang, Tejas Sudharshan Mathai, Boah Kim et al.

Multiphase CT studies are routinely obtained in clinical practice for diagnosis and management of various diseases, such as cancer. However, the CT studies can be acquired with low radiation doses or on different scanners, and are frequently affected by motion and metal artifacts. Prior approaches have targeted the quality improvement of one specific CT phase (e.g., non-contrast CT). In this work, we hypothesized that leveraging multiple CT phases for the quality enhancement of one phase may prove advantageous for downstream tasks, such as segmentation. A 3D progressive fusion and non-local (PFNL) network was developed. It was trained with three degraded (low-quality) phases (non-contrast, arterial, and portal venous) to enhance the quality of the portal venous phase. Then, the effect of scan quality enhancement was evaluated using a proxy task of pancreas segmentation, which is useful for tracking pancreatic cancer. The proposed approach improved the pancreas segmentation by 3% over the corresponding low-quality CT scan. To the best of our knowledge, we are the first to harness multiphase CT for scan quality enhancement and improved pancreas segmentation.

CV · Dec 5, 2024
Frequency-Adaptive Low-Latency Object Detection Using Events and Frames

Haitian Zhang, Xiangyuan Wang, Chang Xu et al.

Fusing Events and RGB images for object detection leverages the robustness of Event cameras in adverse environments and the rich semantic information provided by RGB cameras. However, two critical mismatches significantly hinder high-frequency fusion-based object detection: low-latency Events vs. high-latency RGB frames, and temporally sparse labels in training vs. continuous flow in inference. To address these challenges, we propose the Frequency-Adaptive Low-Latency Object Detector (FAOD). FAOD aligns low-frequency RGB frames with high-frequency Events through an Align Module, which reinforces cross-modal style and spatial proximity to address the Event-RGB Mismatch. We further propose a training strategy, Time Shift, which forces the module to align the prediction from temporally shifted Event-RGB pairs with their original representation, that is, to be consistent with Event-aligned annotations. This strategy enables the network to use high-frequency Event data as the primary reference while treating low-frequency RGB images as supplementary information, retaining the low-latency nature of the Event stream toward high-frequency detection. Furthermore, we observe that these corrected Event-RGB pairs demonstrate better generalization from low training frequency to higher inference frequencies compared to using Event data alone. Extensive experiments on the PKU-DAVIS-SOD and DSEC-Detection datasets demonstrate that our FAOD achieves SOTA performance. Specifically, on the PKU-DAVIS-SOD dataset, FAOD achieves a 9.8-point improvement in mAP on fully paired Event-RGB data with only a quarter of the parameters compared to SODFormer, and even maintains robust performance (only a 3-point drop in mAP) under an 80× Event-RGB frequency mismatch.