CV · Mar 30, 2015

Globally Tuned Cascade Pose Regression via Back Propagation with Application in 2D Face Pose Estimation and Heart Segmentation in 3D CT Images

arXiv:1503.08843v1 · 11 citations
Originality: Incremental advance
AI Analysis

This work addresses pose estimation and segmentation tasks in computer vision and medical imaging, offering an incremental improvement by globally tuning an existing method.

The authors tackled the problem of improving Cascade Pose Regression (CPR) by representing it as a neural network using Graph Transformer Networks and training it globally with backpropagation, resulting in empirical performance gains over layer-wise training. They applied this method to 2D face pose estimation and extended it to 3D heart segmentation in CT images, demonstrating its effectiveness.

Recently, a successful pose estimation algorithm called Cascade Pose Regression (CPR) was proposed in the literature. Trained over Pose Index Features, CPR is a regressor ensemble similar to Boosting. In this paper we show how CPR can be represented as a Neural Network. Specifically, we adopt a Graph Transformer Network (GTN) representation and accordingly train CPR with Back Propagation (BP), which permits global tuning. In contrast, previous CPR literature used only layer-wise training without any subsequent fine-tuning. We empirically show that global training with BP outperforms layer-wise (pre-)training. Our CPR-GTN adopts a Multi-Layer Perceptron as the regressor, which uses sparse connections to learn local image feature representations. We tested the proposed CPR-GTN on the 2D face pose estimation problem, as in previous CPR literature. In addition, we investigated the possibility of extending CPR-GTN to 3D pose estimation through experiments on a 3D Computed Tomography dataset for heart segmentation.
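
To make the contrast between layer-wise and global training concrete, here is a minimal toy sketch, not the authors' CPR-GTN implementation. It assumes a scalar pose, a hypothetical `tanh` pose-indexed feature, and a one-weight linear regressor per stage in place of the paper's sparse Multi-Layer Perceptron. All function names and hyperparameters are illustrative assumptions. The cascade is first fitted greedily stage by stage, then all stages are tuned jointly by backpropagating the final pose error through the whole cascade.

```python
import math
import random

def feature(x, theta):
    # Toy pose-indexed feature: depends on the input AND the current pose estimate.
    return math.tanh(x - theta)

def run_cascade(params, x, theta0=0.0):
    """Forward pass: returns the pose estimates theta_0 .. theta_T."""
    thetas = [theta0]
    for w, b in params:
        th = thetas[-1]
        thetas.append(th + w * feature(x, th) + b)  # additive pose update
    return thetas

def mean_loss(params, data):
    return sum((run_cascade(params, x)[-1] - y) ** 2 for x, y in data) / len(data)

def train_layerwise(data, n_stages):
    """Greedy training: each stage is a least-squares fit to the current residual."""
    params, n = [], len(data)
    current = [0.0] * n  # current pose estimate for every sample
    for _ in range(n_stages):
        f = [feature(x, th) for (x, _), th in zip(data, current)]
        r = [y - th for (_, y), th in zip(data, current)]
        mf, mr = sum(f) / n, sum(r) / n
        var = sum((fi - mf) ** 2 for fi in f)
        cov = sum((fi - mf) * (ri - mr) for fi, ri in zip(f, r))
        w = cov / var if var > 1e-12 else 0.0
        b = mr - w * mf
        params.append((w, b))
        current = [th + w * fi + b for th, fi in zip(current, f)]
    return params

def gradients(params, data):
    """Backpropagate the final squared error through every stage of the cascade."""
    grads = [[0.0, 0.0] for _ in params]
    for x, y in data:
        thetas = run_cascade(params, x)
        g = 2.0 * (thetas[-1] - y)  # dL/dtheta_T
        for t in reversed(range(len(params))):
            w, _ = params[t]
            f = feature(x, thetas[t])
            grads[t][0] += g * f             # dL/dw_t
            grads[t][1] += g                 # dL/db_t
            g *= 1.0 - w * (1.0 - f * f)     # chain rule: dtheta_{t+1}/dtheta_t
    n = len(data)
    return [(gw / n, gb / n) for gw, gb in grads]

def train_global(params, data, lr=0.05, steps=200):
    """Joint gradient descent over all stages; keeps the best parameters seen."""
    best, best_loss = list(params), mean_loss(params, data)
    for _ in range(steps):
        grads = gradients(params, data)
        params = [(w - lr * gw, b - lr * gb) for (w, b), (gw, gb) in zip(params, grads)]
        loss = mean_loss(params, data)
        if loss < best_loss:
            best, best_loss = list(params), loss
    return best

# Synthetic data (an assumption): the observation is the true pose plus small noise.
rng = random.Random(0)
poses = [rng.uniform(-3.0, 3.0) for _ in range(200)]
data = [(y + rng.gauss(0.0, 0.1), y) for y in poses]

layerwise = train_layerwise(data, n_stages=3)
tuned = train_global(layerwise, data)
print("layer-wise loss:", mean_loss(layerwise, data))
print("globally tuned loss:", mean_loss(tuned, data))
```

Because `train_global` starts from the greedy solution and keeps the best parameters it encounters, its training loss cannot exceed the layer-wise loss in this sketch. That guarantee is a property of the toy setup and says nothing about the margins reported in the paper.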

Code Implementations: 1 repo