Deep Learning of Unified Region, Edge, and Contour Models for Automated Image Segmentation
This work addresses segmentation challenges in applications such as medical imaging and autonomous vehicles, but it appears incremental because it builds on existing CNN and active contour methods.
The thesis tackles automated image segmentation by addressing limitations of CNN-based models, such as poor boundary capture and weak generalization. It develops novel methodologies, including a unified CNN and active contour framework, and reports state-of-the-art results in 2D and 3D segmentation for medical imaging and computer vision.
Image segmentation is a fundamental and challenging problem in computer vision with applications spanning multiple areas, such as medical imaging, remote sensing, and autonomous vehicles. Recently, convolutional neural networks (CNNs) have gained traction in the design of automated segmentation pipelines. Although CNN-based models are adept at learning abstract features from raw image data, their performance depends on the availability and size of suitable training datasets. Additionally, these models often fail to capture the details of object boundaries and generalize poorly to unseen classes. In this thesis, we devise novel methodologies that address these issues and establish robust representation learning frameworks for fully automatic semantic segmentation in medical imaging and mainstream computer vision. In particular, our contributions include (1) state-of-the-art 2D and 3D image segmentation networks for computer vision and medical image analysis, (2) an end-to-end trainable image segmentation framework that unifies CNNs and active contour models with learnable parameters for fast and robust object delineation, (3) a novel approach for disentangling edge and texture processing in segmentation networks, and (4) a novel few-shot learning model, for both supervised and semi-supervised settings, that leverages synergies between latent and image spaces to learn to segment images from limited training data.