Yiqi Jin

2papers

2 Papers

CVNov 20, 2022
ECM-OPCC: Efficient Context Model for Octree-based Point Cloud Compression

Yiqi Jin, Ziyu Zhu, Tongda Xu et al.

Recently, deep learning methods have shown promising results in point cloud compression. For octree-based point cloud compression, previous works show that the information of ancestor nodes and sibling nodes are equally important for predicting current node. However, those works either adopt insufficient context or bring intolerable decoding complexity (e.g. >600s). To address this problem, we propose a sufficient yet efficient context model and design an efficient deep learning codec for point clouds. Specifically, we first propose a window-constrained multi-group coding strategy to exploit the autoregressive context while maintaining decoding efficiency. Then, we propose a dual transformer architecture to utilize the dependency of current node on its ancestors and siblings. We also propose a random-masking pre-train method to enhance our model. Experimental results show that our approach achieves state-of-the-art performance for both lossy and lossless point cloud compression. Moreover, our multi-group coding strategy saves 98% decoding time compared with previous octree-based compression method.

IVJun 7, 2024Code
MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome

Yixin Huang, Yiqi Jin, Ke Tao et al.

May-Thurner Syndrome (MTS) is a vascular condition that affects over 20\% of the population and significantly increases the risk of iliofemoral deep venous thrombosis. Accurate and early diagnosis of MTS using computed tomography (CT) remains a clinical challenge due to the subtle anatomical compression and variability across patients. In this paper, we propose MTS-Net, an end-to-end 3D deep learning framework designed to capture spatial-temporal patterns from CT volumes for reliable MTS diagnosis. MTS-Net builds upon 3D ResNet-18 by embedding a novel dual-enhanced positional multi-head self-attention (DEP-MHSA) module into the Transformer encoder of the network's final stages. The proposed DEP-MHSA employs multi-scale convolution and integrates positional embeddings into both attention weights and residual paths, enhancing spatial context preservation, which is crucial for identifying venous compression. To validate our approach, we curate the first publicly available dataset for MTS, MTS-CT, containing over 747 gender-balanced subjects with standard and enhanced CT scans. Experimental results demonstrate that MTS-Net achieves average 0.79 accuracy, 0.84 AUC, and 0.78 F1-score, outperforming baseline models including 3D ResNet, DenseNet-BC, and BabyNet. Our work not only introduces a new diagnostic architecture for MTS but also provides a high-quality benchmark dataset to facilitate future research in automated vascular syndrome detection. We make our code and dataset publicly available at:https://github.com/Nutingnon/MTS_dep_mhsa.