IV Jul 5, 2022
MMGL: Multi-Scale Multi-View Global-Local Contrastive learning for Semi-supervised Cardiac Image Segmentation
Ziyuan Zhao, Jinxuan Hu, Zeng Zeng et al.
With large-scale, well-labeled datasets, deep learning has shown significant success in medical image segmentation. However, it is challenging to acquire abundant annotations in clinical practice because labeling demands extensive expertise and is costly. Recently, contrastive learning has shown a strong capacity for visual representation learning on unlabeled data, achieving performance that rivals supervised learning in many domains. In this work, we propose a novel multi-scale multi-view global-local contrastive learning (MMGL) framework that explores global and local features from different scales and views for robust contrastive learning, thereby improving segmentation performance with limited annotations. Extensive experiments on the MM-WHS dataset demonstrate the effectiveness of the MMGL framework on semi-supervised cardiac image segmentation, outperforming state-of-the-art contrastive learning methods by a large margin.
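The contrastive objective behind frameworks of this kind can be illustrated with a generic InfoNCE loss. The sketch below is not the MMGL loss itself, which the abstract does not specify; the function names, the temperature of 0.1, and the toy embeddings are all assumptions made for illustration. The same loss could in principle be applied to image-level (global) or patch-level (local) embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (illustrative, not the paper's loss).

    Returns -log of the softmax probability assigned to the positive
    among the positive and all negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum_exp - logits[0]

# Hypothetical embeddings: two views of the same image should lie close together.
anchor = [1.0, 0.1]
same_image_other_view = [0.9, 0.2]
other_images = [[-1.0, 0.0], [0.0, -1.0]]

loss_aligned = info_nce(anchor, same_image_other_view, other_images)
# Treating an unrelated embedding as the positive should give a larger loss.
loss_misaligned = info_nce(anchor, other_images[0], [same_image_other_view, other_images[1]])
```

The loss is small when the positive pair is more similar than every negative pair, which is the signal that lets unlabeled images shape the encoder before the limited labels are used.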
39.4 RO May 19
KIO-planner: Attention-Guided Single-Stage Motion Planning with Dual Mapping for UAV Navigation
Dexing Yao, Haochen Li, Junhao Wei et al.
Autonomous UAV flight in confined, wall-dense environments requires low-latency and reliable motion planning under strict safety constraints. Traditional optimization-based planners suffer from mapping latency and easily fall into local minima when navigating through dense structural obstacles. Meanwhile, existing end-to-end learning methods struggle to extract fine-grained geometric features from raw depth images and lack hard kinodynamic constraints, leading to unpredictable collisions near walls. To address these issues, we propose KIO-planner, an attention-guided single-stage trajectory planning framework. First, we integrate a Convolutional Block Attention Module (CBAM) into the perception backbone to adaptively focus on critical structural edges and traversable space. Second, we introduce a novel Dual Mapping mechanism, comprising physical bounds activation and a deterministic Geometric Safety Shield in the depth-pixel space, to enforce kinodynamic feasibility and collision-free flight without global map fusion. Extensive high-fidelity simulation experiments demonstrate that KIO-planner enables highly agile navigation at speeds up to 3.0 m/s. Compared to the state-of-the-art baseline, KIO-planner achieves lower inference latency (approximately 24 ms) and generates significantly smoother trajectories, reducing control cost by 28.4%. Most notably, our Dual Mapping substantially increases the worst-case safety margin, measured by minimum distance to obstacles, from 0.48 m to 0.76 m, enabling fast, smooth, and safer navigation in highly constrained environments.
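Two ideas in this abstract lend themselves to a small sketch: keeping network outputs inside physical limits, and measuring the worst-case safety margin as the minimum distance to obstacles. The abstract does not say how the physical bounds activation is implemented, so the scaled tanh below is an assumption, as are the function names, the use of 3.0 m/s as the bound, and all sample points.

```python
import math

def bound_output(raw, limit):
    """Squash an unbounded value into [-limit, limit] with a scaled tanh.

    Assumed form of a bounded activation; not taken from the paper.
    """
    return limit * math.tanh(raw)

def min_obstacle_distance(trajectory, obstacles):
    """Worst-case safety margin: smallest distance from any trajectory
    sample to any obstacle point."""
    return min(math.dist(p, o) for p in trajectory for o in obstacles)

MAX_SPEED = 3.0  # m/s; illustrative bound, borrowed from the top speed in the abstract

# Hypothetical raw network outputs, including values far outside the limit.
raw_velocities = [-10.0, -0.5, 0.0, 0.5, 10.0]
bounded_velocities = [bound_output(r, MAX_SPEED) for r in raw_velocities]

# Hypothetical trajectory samples and obstacle points, as (x, y, z) in metres.
trajectory = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
obstacles = [(1.0, 0.5, 1.0), (2.0, 2.0, 1.0)]
margin = min_obstacle_distance(trajectory, obstacles)
```

Because the bound is built into the activation, every output satisfies the limit by construction rather than by a penalty term, which is one way a learned planner can respect a hard constraint.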
LG Nov 22, 2021
Graph-Based Similarity of Neural Network Representations
Zuohui Chen, Yao Lu, Jinxuan Hu et al.
Understanding the black-box representations in Deep Neural Networks (DNNs) is an essential problem in deep learning. In this work, we propose Graph-Based Similarity (GBS) to measure the similarity of layer features. In contrast to previous works that compute the similarity directly on the feature maps, GBS measures the correlation based on a graph constructed from hidden layer outputs. By treating each input sample as a node and the similarity between the corresponding layer outputs as edges, we construct a graph of DNN representations for each layer. The similarity between the graphs of different layers identifies correspondences between representations of models trained on different datasets and with different initializations. We demonstrate and prove the invariance properties of GBS, including invariance to orthogonal transformation and invariance to isotropic scaling, and compare GBS with CKA. GBS shows state-of-the-art performance in reflecting the similarity and provides insights into the behavior of adversarial samples in the hidden-layer space.
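The graph construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' GBS definition: cosine similarity as the edge weight, Pearson correlation of edge weights as the graph comparison, and every function name and data point are choices made here for the example. With these choices, the two invariances named in the abstract can be checked directly.

```python
import math

def cosine_similarity_graph(features):
    """Adjacency matrix of a graph whose nodes are input samples and whose
    edge weights are cosine similarities between their layer outputs."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in features]
    n = len(features)
    return [
        [
            sum(a * b for a, b in zip(features[i], features[j])) / (norms[i] * norms[j])
            for j in range(n)
        ]
        for i in range(n)
    ]

def graph_similarity(adj_a, adj_b):
    """Pearson correlation between the off-diagonal edge weights of two
    graphs built over the same samples (assumed comparison measure)."""
    n = len(adj_a)
    xs = [adj_a[i][j] for i in range(n) for j in range(i + 1, n)]
    ys = [adj_b[i][j] for i in range(n) for j in range(i + 1, n)]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rotate_and_scale(features, angle, scale):
    """Apply a 2-D rotation (an orthogonal map) followed by isotropic scaling."""
    c, s = math.cos(angle), math.sin(angle)
    return [[scale * (c * x - s * y), scale * (s * x + c * y)] for x, y in features]

# Hypothetical 2-D layer outputs for five input samples.
layer_a = [[1.0, 0.2], [0.9, 0.4], [-0.3, 1.0], [-1.0, -0.5], [0.1, -0.8]]
# The same representation after an orthogonal transform and isotropic scaling.
layer_b = rotate_and_scale(layer_a, angle=0.7, scale=3.0)

graph_a = cosine_similarity_graph(layer_a)
graph_b = cosine_similarity_graph(layer_b)
score = graph_similarity(graph_a, graph_b)
```

Since cosine similarity between samples is unchanged by rotating or uniformly rescaling the feature space, the two graphs coincide up to floating-point error and the score is 1 within rounding.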