Xiaoyu Hao

6.4LGFeb 4, 2024Code

Pruner: A Draft-then-Verify Exploration Mechanism to Accelerate Tensor Program Tuning

Liang Qiao, Jun Shi, Xiaoyu Hao et al.

Tensor program tuning is essential for the efficient deployment of deep neural networks. Search-based approaches have demonstrated scalability and effectiveness in automatically finding high-performance programs for specific hardware. However, the search process is often inefficient, taking hours or even days to discover optimal programs due to the exploration mechanisms guided by an accurate but slow-learned cost model. Meanwhile, the learned cost model trained on one platform cannot seamlessly adapt online to another, which we call cross-platform online unawareness. In this work, we propose Pruner and MoA-Pruner. Pruner is a "Draft-then-Verify" exploration mechanism that accelerates the schedule search process. Instead of applying the complex learned cost model to all explored candidates, Pruner drafts small-scale potential candidates by introducing a naive Symbol-based Analyzer (draft model), then identifies the best candidates by the learned cost model. MoA-Pruner introduces a Momentum online Adaptation strategy to address the cross-platform online unawareness. We incorporate Pruner into the TVM and conduct extensive experiments on three GPU-based platforms. Results show considerable speedup in schedule search time. In online tuning scenarios, Pruner and MoA-Pruner achieve an average speedup of $2.6 \times$ and $4.82 \times$ compared to Ansor. In offline tuning scenarios, Pruner achieves an average speedup of $4.75 \times$ and $4.05\times$ compared to TenSet and TLP, respectively. Furthermore, Pruner achieves an average speedup of $4.08 \times$ compared to MetaSchedule on TensorCore.

4.4IVMay 14, 2021Code

DARNet: Dual-Attention Residual Network for Automatic Diagnosis of COVID-19 via CT Images

Jun Shi, Huite Yi, Shulan Ruan et al.

The ongoing global pandemic of Coronavirus Disease 2019 (COVID-19) poses a serious threat to public health and the economy. Rapid and accurate diagnosis of COVID-19 is crucial to prevent the further spread of the disease and reduce its mortality. Chest Computed tomography (CT) is an effective tool for the early diagnosis of lung diseases including pneumonia. However, detecting COVID-19 from CT is demanding and prone to human errors as some early-stage patients may have negative findings on images. Recently, many deep learning methods have achieved impressive performance in this regard. Despite their effectiveness, most of these methods underestimate the rich spatial information preserved in the 3D structure or suffer from the propagation of errors. To address this problem, we propose a Dual-Attention Residual Network (DARNet) to automatically identify COVID-19 from other common pneumonia (CP) and healthy people using 3D chest CT images. Specifically, we design a dual-attention module consisting of channel-wise attention and depth-wise attention mechanisms. The former is utilized to enhance channel independence, while the latter is developed to recalibrate the depth-level features. Then, we integrate them in a unified manner to extract and refine the features at different levels to further improve the diagnostic performance. We evaluate DARNet on a large public CT dataset and obtain superior performance. Besides, the ablation study and visualization analysis prove the effectiveness and interpretability of the proposed method.

Xiaoyu Hao

2 Papers