IVJul 22, 2025Code
A High Magnifications Histopathology Image Dataset for Oral Squamous Cell Carcinoma Diagnosis and PrognosisJinquan Guan, Junhong Guo, Qi Chen et al.
Oral Squamous Cell Carcinoma (OSCC) is a prevalent and aggressive malignancy where deep learning-based computer-aided diagnosis and prognosis can enhance clinical assessments.However, existing publicly available OSCC datasets often suffer from limited patient cohorts and a restricted focus on either diagnostic or prognostic tasks, limiting the development of comprehensive and generalizable models. To bridge this gap, we introduce Multi-OSCC, a new histopathology image dataset comprising 1,325 OSCC patients, integrating both diagnostic and prognostic information to expand existing public resources. Each patient is represented by six high resolution histopathology images captured at x200, x400, and x1000 magnifications-two per magnification-covering both the core and edge tumor regions.The Multi-OSCC dataset is richly annotated for six critical clinical tasks: recurrence prediction (REC), lymph node metastasis (LNM), tumor differentiation (TD), tumor invasion (TI), cancer embolus (CE), and perineural invasion (PI). To benchmark this dataset, we systematically evaluate the impact of different visual encoders, multi-image fusion techniques, stain normalization, and multi-task learning frameworks. Our analysis yields several key insights: (1) The top-performing models achieve excellent results, with an Area Under the Curve (AUC) of 94.72% for REC and 81.23% for TD, while all tasks surpass 70% AUC; (2) Stain normalization benefits diagnostic tasks but negatively affects recurrence prediction; (3) Multi-task learning incurs a 3.34% average AUC degradation compared to single-task models in our multi-task benchmark, underscoring the challenge of balancing multiple tasks in our dataset. To accelerate future research, we publicly release the Multi-OSCC dataset and baseline models at https://github.com/guanjinquan/OSCC-PathologyImageDataset.
LGJun 13, 2024Code
Advancing Graph Generation through Beta DiffusionXinyang Liu, Yilin He, Bo Chen et al.
Diffusion models have excelled in generating natural images and are now being adapted to a variety of data types, including graphs. However, conventional models often rely on Gaussian or categorical diffusion processes, which can struggle to accommodate the mixed discrete and continuous components characteristic of graph data. Graphs typically feature discrete structures and continuous node attributes that often exhibit rich statistical patterns, including sparsity, bounded ranges, skewed distributions, and long-tailed behavior. To address these challenges, we introduce Graph Beta Diffusion (GBD), a generative model specifically designed to handle the diverse nature of graph data. GBD leverages a beta diffusion process, effectively modeling both continuous and discrete elements. Additionally, we propose a modulation technique that enhances the realism of generated graphs by stabilizing critical graph topology while maintaining flexibility for other components. GBD competes strongly with existing models across multiple general and biochemical graph benchmarks, showcasing its ability to capture the intricate balance between discrete and continuous features inherent in real-world graph data. The PyTorch code is available on GitHub.
CVMay 27, 2025
YOLO-SPCI: Enhancing Remote Sensing Object Detection via Selective-Perspective-Class IntegrationXinyuan Wang, Lian Peng, Xiangcheng Li et al.
Object detection in remote sensing imagery remains a challenging task due to extreme scale variation, dense object distributions, and cluttered backgrounds. While recent detectors such as YOLOv8 have shown promising results, their backbone architectures lack explicit mechanisms to guide multi-scale feature refinement, limiting performance on high-resolution aerial data. In this work, we propose YOLO-SPCI, an attention-enhanced detection framework that introduces a lightweight Selective-Perspective-Class Integration (SPCI) module to improve feature representation. The SPCI module integrates three components: a Selective Stream Gate (SSG) for adaptive regulation of global feature flow, a Perspective Fusion Module (PFM) for context-aware multi-scale integration, and a Class Discrimination Module (CDM) to enhance inter-class separability. We embed two SPCI blocks into the P3 and P5 stages of the YOLOv8 backbone, enabling effective refinement while preserving compatibility with the original neck and head. Experiments on the NWPU VHR-10 dataset demonstrate that YOLO-SPCI achieves superior performance compared to state-of-the-art detectors.
MLFeb 7, 2022
A Variational Edge Partition Model for Supervised Graph Representation LearningYilin He, Chaojie Wang, Hao Zhang et al.
Graph neural networks (GNNs), which propagate the node features through the edges and learn how to transform the aggregated features under label supervision, have achieved great success in supervised feature extraction for both node-level and graph-level classification tasks. However, GNNs typically treat the graph structure as given and ignore how the edges are formed. This paper introduces a graph generative process to model how the observed edges are generated by aggregating the node interactions over a set of overlapping node communities, each of which contributes to the edges via a logical OR mechanism. Based on this generative model, we partition each edge into the summation of multiple community-specific weighted edges and use them to define community-specific GNNs. A variational inference framework is proposed to jointly learn a GNN-based inference network that partitions the edges into different communities, these community-specific GNNs, and a GNN-based predictor that combines community-specific GNNs for the end classification task. Extensive evaluations on real-world graph datasets have verified the effectiveness of the proposed method in learning discriminative representations for both node-level and graph-level classification tasks.