CV · Mar 18, 2023
Supervision Interpolation via LossMix: Generalizing Mixup for Object Detection and Beyond
Thanh Vu, Baochen Sun, Bodi Yuan et al.
Data mixing augmentations have proven highly successful in image classification. However, these techniques cannot be readily applied to object detection due to challenges such as spatial misalignment, foreground/background distinction, and plurality of instances. To tackle these issues, we first introduce a novel conceptual framework called Supervision Interpolation (SI), which offers a fresh perspective on interpolation-based augmentations by relaxing and generalizing Mixup. Based on SI, we propose LossMix, a simple yet versatile and effective regularization that enhances the performance and robustness of object detectors and more. Our key insight is that we can effectively regularize training on mixed data by interpolating the losses computed against each source's ground truth instead of interpolating the ground truth labels themselves. Empirical results on the PASCAL VOC and MS COCO datasets demonstrate that LossMix can consistently outperform state-of-the-art methods widely adopted for detection. Furthermore, by jointly leveraging LossMix with unsupervised domain adaptation, we successfully improve existing approaches and set a new state of the art for cross-domain object detection.
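The key insight above (interpolating losses rather than labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy linear scorer `toy_model`, the helper names, and the choice of cross-entropy as the loss are all assumptions made for illustration. For a detector, `loss_fn` would be the full detection loss, which is where mixing losses differs from mixing labels.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-probability of the true class index.
    return -math.log(softmax(logits)[label])

def mix_inputs(x1, x2, lam):
    # Mixup-style convex combination of two inputs.
    return [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]

def lossmix_loss(model, loss_fn, x1, y1, x2, y2, lam):
    # One forward pass on the mixed input, then interpolate the two
    # losses, each computed against one source's unmodified target.
    pred = model(mix_inputs(x1, x2, lam))
    return lam * loss_fn(pred, y1) + (1.0 - lam) * loss_fn(pred, y2)

def toy_model(x):
    # Hypothetical fixed linear scorer with two classes (illustration only).
    weights = [[1.0, -1.0], [-1.0, 1.0]]
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

x1, y1 = [1.0, 0.0], 0
x2, y2 = [0.0, 1.0], 1
lam = 0.7  # Mixup usually samples this from a Beta distribution; fixed here.
loss = lossmix_loss(toy_model, cross_entropy, x1, y1, x2, y2, lam)
print(f"mixed loss: {loss:.4f}")
```

With cross-entropy on class labels, this coincides with label-space Mixup because the loss is linear in the target. The formulation becomes useful when targets such as bounding boxes cannot be meaningfully averaged, since the targets are never mixed.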
CV · Aug 20, 2021
Pixel Contrastive-Consistent Semi-Supervised Semantic Segmentation
Yuanyi Zhong, Bodi Yuan, Hong Wu et al.
We present a novel semi-supervised semantic segmentation method which jointly achieves two desiderata of segmentation model regularities: the label-space consistency property between image augmentations and the feature-space contrastive property among different pixels. We leverage the pixel-level L2 loss and the pixel contrastive loss for the two purposes respectively. To address the computational efficiency issue and the false negative noise issue involved in the pixel contrastive loss, we further introduce and investigate several negative sampling techniques. Extensive experiments demonstrate the state-of-the-art performance of our method (PC2Seg) with the DeepLab-v3+ architecture, in several challenging semi-supervised settings derived from the VOC, Cityscapes, and COCO datasets.
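The two loss terms named above can be sketched roughly as follows. This is a simplified illustration, not the PC2Seg code: the function names, the cosine-similarity InfoNCE form, and the temperature value are assumptions, and the negative sampling strategies the abstract mentions are reduced here to a caller-supplied list of negatives.

```python
import math

def l2_consistency(probs_a, probs_b):
    # Label-space consistency: mean squared difference between per-pixel
    # class-probability vectors predicted for two augmented views.
    total = 0.0
    for pa, pb in zip(probs_a, probs_b):
        total += sum((a - b) ** 2 for a, b in zip(pa, pb))
    return total / len(probs_a)

def cosine(u, v):
    # Cosine similarity between two non-zero feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pixel_info_nce(anchor, positive, negatives, temperature=0.1):
    # Feature-space contrast for one pixel: pull the anchor toward the
    # same pixel's feature in the other view, push it away from the
    # features of sampled negative pixels.
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Two pixels, two classes: predictions from two views of one image.
view_a = [[0.9, 0.1], [0.2, 0.8]]
view_b = [[0.8, 0.2], [0.3, 0.7]]
print("consistency:", l2_consistency(view_a, view_b))

# One anchor pixel feature, its positive, and two sampled negatives.
print("contrastive:", pixel_info_nce([1.0, 0.0], [0.9, 0.1],
                                     [[0.0, 1.0], [-1.0, 0.0]]))
```

The cost of the contrastive term grows with the number of negatives, and a negative drawn from the same class as the anchor acts as a false negative, which is why the choice of negative sampling matters.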
CV · Dec 9, 2020
Semantically Robust Unpaired Image Translation for Data with Unmatched Semantics Statistics
Zhiwei Jia, Bodi Yuan, Kangkang Wang et al.
Many applications of unpaired image-to-image translation require the input contents to be preserved semantically during translations. Unaware of the inherently unmatched semantics distributions between source and target domains, existing distribution matching methods (i.e., GAN-based) can give undesired solutions. In particular, although producing visually reasonable outputs, the learned models usually flip the semantics of the inputs. To tackle this without using extra supervision, we propose to enforce the translated outputs to be semantically invariant w.r.t. small perceptual variations of the inputs, a property we call "semantic robustness". By optimizing a robustness loss w.r.t. multi-scale feature space perturbations of the inputs, our method effectively reduces semantics flipping and produces translations that outperform existing methods both quantitatively and qualitatively.
LG · Apr 20, 2019
Model-free Deep Reinforcement Learning for Urban Autonomous Driving
Jianyu Chen, Bodi Yuan, Masayoshi Tomizuka
Decision making for urban autonomous driving is challenging due to complex road geometry and multi-agent interactions. Most current decision-making methods rely on a manually designed driving policy, which can result in sub-optimal solutions and is expensive to develop, generalize, and maintain at scale. With reinforcement learning (RL), on the other hand, a policy can be learned and improved automatically without manual design. However, current RL methods generally do not work well in complex urban scenarios. In this paper, we propose a framework to enable model-free deep reinforcement learning in challenging urban autonomous driving scenarios. We design a specific input representation and use visual encoding to capture the low-dimensional latent states. Several state-of-the-art model-free deep RL algorithms are implemented in our framework, with several tricks to improve their performance. We evaluate our method on a challenging roundabout task with dense surrounding vehicles in a high-definition driving simulator. The results show that our method can solve the task well and is significantly better than the baseline.
RO · Mar 2, 2019
Deep Imitation Learning for Autonomous Driving in Generic Urban Scenarios with Enhanced Safety
Jianyu Chen, Bodi Yuan, Masayoshi Tomizuka
The decision and planning system for autonomous driving in urban environments is hard to design. Most current methods manually design the driving policy, which can be expensive to develop and maintain at scale. With imitation learning, by contrast, we only need to collect data, and the driving policy is learned and improved automatically. However, existing imitation learning methods for autonomous driving rarely perform well in complex urban scenarios. Moreover, safety is not guaranteed when a deep neural network policy is used. In this paper, we propose a framework to learn the driving policy in urban scenarios efficiently given offline connected driving data, with a safety controller incorporated to guarantee safety at test time. The experiments show that our method can achieve high performance in realistic simulations of urban driving scenarios.