CVMar 18, 2024

TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models

arXiv:2403.11691v15 citationsh-index: 103DV
Originality Incremental advance
AI Analysis

This addresses the challenge of distribution shifts in 3D semantic segmentation for applications like autonomous driving or robotics, though it is incremental as it extends test-time training to a new domain.

The paper tackles the problem of adapting 3D semantic segmentation models to changing data distributions at test-time by proposing TTT-KD, a method that uses knowledge distillation from foundation models as a self-supervised objective, resulting in gains of up to 13% mIoU for in-distribution and up to 45% mIoU for out-of-distribution test samples.

Test-Time Training (TTT) proposes to adapt a pre-trained network to changing data distributions on-the-fly. In this work, we propose the first TTT method for 3D semantic segmentation, TTT-KD, which models Knowledge Distillation (KD) from foundation models (e.g. DINOv2) as a self-supervised objective for adaptation to distribution shifts at test-time. Given access to paired image-pointcloud (2D-3D) data, we first optimize a 3D segmentation backbone for the main task of semantic segmentation using the pointclouds and the task of 2D $\to$ 3D KD by using an off-the-shelf 2D pre-trained foundation model. At test-time, our TTT-KD updates the 3D segmentation backbone for each test sample, by using the self-supervised task of knowledge distillation, before performing the final prediction. Extensive evaluations on multiple indoor and outdoor 3D segmentation benchmarks show the utility of TTT-KD, as it improves performance for both in-distribution (ID) and out-of-distribution (ODO) test datasets. We achieve a gain of up to 13% mIoU (7% on average) when the train and test distributions are similar and up to 45% (20% on average) when adapting to OOD test samples.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes