CV · Jul 4, 2022

Real Time Egocentric Segmentation for Video-self Avatar in Mixed Reality

arXiv:2207.01296v1 · 10 citations · h-index: 10
Originality: Synthesis-oriented
AI Analysis

The work lets users see their own bodies in mixed reality, but the contribution is incremental because it builds on existing architectures and datasets.

The authors tackled real-time egocentric body segmentation for mixed reality by developing a shallow network that achieves 66 fps at 640x480 resolution, using a new dataset of nearly 10,000 images from synthetic and real sources.

In this work we present our real-time egocentric body segmentation algorithm. Our algorithm achieves a frame rate of 66 fps for an input resolution of 640x480, thanks to our shallow network inspired by Thundernet's architecture. In addition, we put a strong emphasis on the variability of the training data. More concretely, we describe the creation process of our Egocentric Bodies (EgoBodies) dataset, composed of almost 10,000 images from three datasets, created both from synthetic methods and real capture. We conduct experiments to understand the contribution of the individual datasets, compare the Thundernet model trained with EgoBodies against simpler and more complex previous approaches, and discuss their corresponding performance in a real-life setup in terms of segmentation quality and inference times. The trained semantic segmentation algorithm described here is already integrated in an end-to-end system for Mixed Reality (MR), making it possible for users to see their own bodies while immersed in an MR scene.
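To put the reported speed in perspective, the short Python sketch below converts the abstract's figures (66 fps at 640x480) into a per-frame time budget and a pixel throughput. The function names are hypothetical and chosen only for illustration. The sketch also assumes frames are processed one at a time, so that the frame rate maps directly to a per-frame budget; the abstract does not state this.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def pixels_per_second(width: int, height: int, fps: float) -> float:
    """Number of pixels labelled per second at a given resolution and frame rate."""
    return width * height * fps


if __name__ == "__main__":
    # Figures reported in the abstract; sequential processing is an assumption.
    fps, width, height = 66, 640, 480
    print(f"per-frame budget: {frame_budget_ms(fps):.2f} ms")
    print(f"throughput: {pixels_per_second(width, height, fps):,.0f} pixels/s")
```

Under these assumptions the network has roughly 15 ms to segment each frame, which corresponds to about 20 million pixels labelled per second.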

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
