SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360 degree Images
This work addresses challenges in processing omni-directional images for computer vision applications and offers a method that can be integrated into existing CNN-based approaches, though it appears incremental because it builds on prior work in non-Euclidean representations.
The paper tackles the problem of applying CNNs to 360-degree images by proposing a spherical polyhedron representation to minimize distortion and loss of continuity, demonstrating feasibility through classification, detection, and semantic segmentation tasks on synthetic and real datasets.
Omni-directional cameras have many advantages over conventional cameras in that they have a much wider field-of-view (FOV). Accordingly, several approaches have been proposed recently to apply convolutional neural networks (CNNs) to omni-directional images for various visual tasks. However, most of them use image representations defined in the Euclidean space after transforming the omni-directional views originally formed in the non-Euclidean space. This transformation leads to shape distortion due to nonuniform spatial resolving power and the loss of continuity. These effects make existing convolution kernels experience difficulties in extracting meaningful information. This paper presents a novel method to resolve such problems of applying CNNs to omni-directional images. The proposed method utilizes a spherical polyhedron to represent omni-directional views. This method minimizes the variance of the spatial resolving power on the sphere surface, and includes new convolution and pooling methods for the proposed representation. The proposed method can also be adopted by any existing CNN-based methods. The feasibility of the proposed method is demonstrated through classification, detection, and semantic segmentation tasks with synthetic and real datasets.
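
To make the phrase "nonuniform spatial resolving power" concrete, the sketch below computes how much of the unit sphere a single pixel covers in each row of an equirectangular image, a common Euclidean representation of omni-directional views. It is an illustration only and not code from the paper; the function name `equirect_row_solid_angles` and the image size are assumptions chosen for the example.

```python
import math


def equirect_row_solid_angles(height, width):
    """Solid angle (in steradians) covered by one pixel in each row of an
    equirectangular image with the given height and width."""
    d_lon = 2.0 * math.pi / width  # longitude span of one pixel
    angles = []
    for row in range(height):
        # Latitude bounds of this row, running from +pi/2 (top) to -pi/2 (bottom).
        lat_top = math.pi / 2 - math.pi * row / height
        lat_bottom = math.pi / 2 - math.pi * (row + 1) / height
        # Exact area of a latitude-longitude cell on the unit sphere.
        angles.append(d_lon * (math.sin(lat_top) - math.sin(lat_bottom)))
    return angles


if __name__ == "__main__":
    height, width = 64, 128  # hypothetical image size
    angles = equirect_row_solid_angles(height, width)
    mean = sum(angles) / height
    variance = sum((a - mean) ** 2 for a in angles) / height
    print("pole pixel / equator pixel:", angles[0] / angles[height // 2])
    print("variance of per-pixel solid angle:", variance)
```

Pixels near the poles cover far less of the sphere than pixels at the equator, so a fixed-size convolution kernel sees differently scaled content depending on the row. The abstract states that the proposed polyhedron representation aims to minimize this kind of variance.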