SY CV LG RO | Sep 6, 2023

3D Object Positioning Using Differentiable Multimodal Learning

Sean Zanyk-McLean, Krishna Kumar, Paul Navratil

arXiv:2309.03177v1 | 1.2 | h-index: 3

Originality: Incremental advance

AI Analysis

This incremental advance addresses object positioning by fusing image and Lidar sensor inputs, an approach that could help autonomous vehicles locate multiple actors in a scene.

The paper tackles 3D object positioning by introducing a multimodal method that combines simulated Lidar data via ray tracing with image pixel loss using differentiable rendering to optimize object positions via gradient descent. The result shows that adding Lidar as a second modality leads to faster convergence compared to using image loss alone.

This article describes a multimodal method that uses simulated Lidar data (via ray tracing) and image pixel loss (via differentiable rendering) to optimize an object's position with respect to an observer or to reference objects in a computer graphics scene. The position is optimized by gradient descent, with a loss function influenced by both modalities. Object placement optimization typically uses image pixel loss with differentiable rendering alone; this work shows that adding a second modality (Lidar) leads to faster convergence. This way of fusing sensor input is potentially useful for autonomous vehicles, since it can be used to establish the locations of multiple actors in a scene. The article also presents a method for simulating multiple types of data for use in training autonomous vehicles.
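
The idea of descending on a loss influenced by two modalities can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the "renderer" is a Gaussian blob on a small pixel grid rather than a differentiable renderer, the "Lidar" is a set of point-to-object ranges from three hypothetical sensors rather than ray-traced returns, the two losses are simply summed with a weight `w_lidar`, and the gradients are written out by hand.

```python
import math

# Toy stand-ins (assumptions, not the paper's renderer or ray tracer).
N = 16          # image is N x N pixels covering the unit square
SIGMA = 0.2     # width of the soft "object" blob
PIXELS = [((i + 0.5) / N, (j + 0.5) / N) for i in range(N) for j in range(N)]
SENSORS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # hypothetical range sensors


def render(c):
    """Soft-render the object at position c as a Gaussian blob."""
    return [math.exp(-((px - c[0]) ** 2 + (py - c[1]) ** 2) / (2 * SIGMA ** 2))
            for px, py in PIXELS]


def ranges(c):
    """Distance from each sensor to the object centre (crude Lidar proxy)."""
    return [math.hypot(c[0] - sx, c[1] - sy) for sx, sy in SENSORS]


def image_grad(c, target_img):
    """Analytic gradient of the mean squared pixel error w.r.t. c."""
    gx = gy = 0.0
    for (px, py), i_c, i_t in zip(PIXELS, render(c), target_img):
        k = 2.0 * (i_c - i_t) * i_c / SIGMA ** 2
        gx += k * (px - c[0])
        gy += k * (py - c[1])
    return gx / len(PIXELS), gy / len(PIXELS)


def lidar_grad(c, target_ranges):
    """Analytic gradient of the mean squared range error w.r.t. c."""
    gx = gy = 0.0
    for (sx, sy), r_c, r_t in zip(SENSORS, ranges(c), target_ranges):
        k = 2.0 * (r_c - r_t) / r_c
        gx += k * (c[0] - sx)
        gy += k * (c[1] - sy)
    return gx / len(SENSORS), gy / len(SENSORS)


def optimize(start, target, w_lidar, lr=0.1, steps=500):
    """Gradient descent on image loss + w_lidar * range loss."""
    target_img, target_ranges = render(target), ranges(target)
    c = list(start)
    for _ in range(steps):
        gi = image_grad(c, target_img)
        gl = lidar_grad(c, target_ranges)
        c[0] -= lr * (gi[0] + w_lidar * gl[0])
        c[1] -= lr * (gi[1] + w_lidar * gl[1])
    return tuple(c)


if __name__ == "__main__":
    target = (0.7, 0.6)
    print("image only:", optimize((0.3, 0.4), target, w_lidar=0.0))
    print("image + range:", optimize((0.3, 0.4), target, w_lidar=1.0))
```

Setting `w_lidar=0.0` gives the image-only baseline, so the two runs can be compared at a smaller `steps` budget. How much the second term helps in this toy depends on the chosen sensor layout, blob width, and learning rate, so it should not be read as a reproduction of the paper's convergence result.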
