Mai Zhu

CV
3 papers
95 citations
Novelty: 53%
AI Score: 25

3 Papers

CV, Dec 4, 2020
Global Context Aware RCNN for Object Detection

Wenchao Zhang, Chong Fu, Haoyu Xie et al.

RoIPool/RoIAlign is an indispensable step in typical two-stage object detection algorithms: it rescales the object proposals cropped from the feature pyramid to generate fixed-size feature maps. However, these cropped feature maps, with their local receptive fields, lose much of the global context information. To tackle this problem, we propose a novel end-to-end trainable framework, called Global Context Aware (GCA) RCNN, which helps the neural network strengthen the spatial correlation between the background and the foreground by fusing global context information. The core component of our GCA framework is a context-aware mechanism, in which a global feature pyramid and attention strategies are used for feature extraction and feature refinement, respectively. Specifically, we leverage dense connections to improve the information flow of the global context at different stages in the top-down process of FPN, and further use an attention mechanism to refine the global context at each level of the feature pyramid. Finally, we also present a lightweight version of our method, which only slightly increases model complexity and computational burden. Experimental results on the COCO benchmark dataset demonstrate the significant advantages of our approach.
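To make the rescaling step concrete, the following is a minimal sketch of RoIAlign-style resampling followed by a very simple form of global context fusion. It is not the GCA RCNN architecture: the function names (`bilinear`, `roi_resample`, `global_mean`, `fuse_with_global_context`), the single sample per output bin, and the use of a global average added to every cell are all assumptions chosen for illustration, and the dense connections and attention refinement described in the abstract are not reproduced.

```python
def bilinear(fmap, y, x):
    """Bilinearly interpolate a 2D feature map at a fractional (y, x) index."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp to the valid index range
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bot = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bot * dy


def roi_resample(fmap, box, out_size):
    """Resample the region box = (y_min, x_min, y_max, x_max) to out_size x out_size.

    Simplification (assumption): one sample at the centre of each output bin.
    """
    y_min, x_min, y_max, x_max = box
    bin_h = (y_max - y_min) / out_size
    bin_w = (x_max - x_min) / out_size
    return [
        [
            bilinear(fmap, y_min + (i + 0.5) * bin_h, x_min + (j + 0.5) * bin_w)
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]


def global_mean(fmap):
    """Global average pooling over the whole feature map (hypothetical context)."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def fuse_with_global_context(roi_feat, context):
    """Add the scalar global context to every cell of the RoI feature."""
    return [[v + context for v in row] for row in roi_feat]


# Toy 4x4 feature map with values 0..15.
feature_map = [[float(4 * i + j) for j in range(4)] for i in range(4)]
roi_feature = roi_resample(feature_map, (0.0, 0.0, 2.0, 2.0), 2)
fused = fuse_with_global_context(roi_feature, global_mean(feature_map))
```

The RoI feature alone depends only on the values inside the box, which is the loss of global context the abstract refers to; the fused feature also depends on the rest of the map.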

CV, Aug 2, 2020
Joint Object Contour Points and Semantics for Instance Segmentation

Wenchao Zhang, Chong Fu, Mai Zhu

The attributes of object contours have great significance for the instance segmentation task. However, most of the currently popular deep neural networks do not pay much attention to object edge information. Inspired by the human annotation process used when making instance segmentation datasets, in this paper we propose Mask Point R-CNN, which aims to promote the neural network's attention to the object boundary. Specifically, we extend the original human keypoint detection task to the contour point detection of any object. Based on this analogy, we add a contour point detection auxiliary task to Mask R-CNN, which can boost the gradient flow between different tasks by effectively using feature fusion strategies and multi-task joint training. As a consequence, the model becomes more sensitive to the edges of the object and can capture more geometric features. Quantitatively, the experimental results show that our approach outperforms vanilla Mask R-CNN by 3.8% on the Cityscapes dataset and 0.8% on the COCO dataset.
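As an illustration of what contour points of an arbitrary object might look like, here is a minimal sketch that extracts boundary pixels from a binary instance mask. The abstract does not say how contour point targets are generated, so the function name `contour_points` and the 4-neighbour boundary rule are assumptions made for this example, not the paper's procedure.

```python
def contour_points(mask):
    """Return (y, x) foreground pixels that touch background or the image border.

    Assumption: a pixel is on the contour if any of its 4 neighbours is
    background or lies outside the mask.
    """
    h, w = len(mask), len(mask[0])
    points = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue  # background pixels are never contour points
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = ny < 0 or ny >= h or nx < 0 or nx >= w
                if outside or not mask[ny][nx]:
                    points.append((y, x))
                    break
    return points


# Toy 5x5 mask containing a 3x3 square object in the centre.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
toy_contour = contour_points(toy_mask)
```

For the toy square, every object pixel except the centre one lies on the contour. In a keypoint-style auxiliary task, a subset of such points could serve as targets, though how they are sampled and encoded is left open here.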

CV, Feb 24, 2018
Convolutional Neural Networks combined with Runge-Kutta Methods

Mai Zhu, Bo Chang, Chong Fu

A convolutional neural network can be constructed using numerical methods for solving dynamical systems, since the forward pass of the network can be regarded as a trajectory of a dynamical system. However, existing models based on numerical solvers cannot avoid the iterations of implicit methods, which makes the models inefficient at inference time. In this paper, we reinterpret the pre-activation Residual Networks (ResNets) and their variants from the dynamical systems view. We consider that the iterations of implicit Runge-Kutta methods are fused into the training of these models. Moreover, we propose a novel approach to constructing network models based on high-order Runge-Kutta methods in order to achieve higher efficiency. Our proposed models are referred to as Runge-Kutta Convolutional Neural Networks (RKCNNs). The RKCNNs are evaluated on multiple benchmark datasets. The experimental results show that RKCNNs are vastly superior to other dynamical system network models: they achieve higher accuracy with far fewer resources. They also expand the family of network models based on numerical methods for dynamical systems.
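The dynamical systems view can be sketched on a scalar example. A residual update of the form x + h * f(x) has the same shape as one forward Euler step for dx/dt = f(x), and a higher-order Runge-Kutta step combines several evaluations of f within one step. The code below is only an illustration of that numerical idea using the classical explicit fourth-order method on a toy decay equation; it is not the RKCNN construction, which the abstract relates to implicit Runge-Kutta methods, and all names here (`euler_step`, `rk4_step`, `integrate`, `decay`) are hypothetical.

```python
import math


def euler_step(f, x, h):
    """Forward Euler step; has the same form as a residual update x + h * f(x)."""
    return x + h * f(x)


def rk4_step(f, x, h):
    """Classical explicit fourth-order Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(step, f, x0, h, n_steps):
    """Apply the same step function n_steps times, like stacking identical blocks."""
    x = x0
    for _ in range(n_steps):
        x = step(f, x, h)
    return x


def decay(x):
    """Toy dynamics dx/dt = -x, whose exact solution is x0 * exp(-t)."""
    return -x


exact = math.exp(-1.0)  # exact value at t = 1 for x0 = 1
euler_err = abs(integrate(euler_step, decay, 1.0, 0.1, 10) - exact)
rk4_err = abs(integrate(rk4_step, decay, 1.0, 0.1, 10) - exact)
```

With the same step size and number of steps, the fourth-order method tracks the exact trajectory far more closely than forward Euler, at the cost of four evaluations of f per step instead of one.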