nvTorchCam: An Open-source Library for Camera-Agnostic Differentiable Geometric Vision
nvTorchCam addresses the need for camera-agnostic geometric vision in applications such as automotive and real estate capture, though the contribution is incremental in that it builds on existing PyTorch frameworks.
The authors address the dependence of deep learning algorithms on a specific camera model by introducing nvTorchCam, an open-source library that abstracts camera operations so that an algorithm can be implemented once for diverse camera models such as pinhole, fisheye, and 360 panoramas. As a result, trained models transfer directly across camera types without modification.
We introduce nvTorchCam, an open-source library under the Apache 2.0 license, designed to make deep learning algorithms independent of the camera model. nvTorchCam abstracts critical camera operations such as projection and unprojection, allowing developers to implement algorithms once and apply them across diverse camera models, including pinhole, fisheye, and 360 equirectangular panoramas, which are commonly used in automotive and real estate capture applications. Built on PyTorch, nvTorchCam is fully differentiable and supports GPU acceleration and batching for efficient computation. Furthermore, deep learning models trained for one camera type can be directly transferred to other camera types without additional modification. In this paper, we give an overview of nvTorchCam and its functionality, and present code examples and diagrams that demonstrate its usage. Source code and installation instructions can be found on the nvTorchCam GitHub page at https://github.com/NVlabs/nvTorchCam.
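To make the idea of abstracting projection and unprojection concrete, the following is a minimal sketch in plain PyTorch. It does not use nvTorchCam itself, and the names `Camera`, `Pinhole`, `Equirectangular`, and `reprojection_error` are hypothetical, chosen only for illustration; the library's actual API may look quite different. The sketch assumes a camera frame with z pointing forward and an equirectangular image spanning 360 degrees horizontally and 180 degrees vertically.

```python
import math
import torch


class Camera:
    """Hypothetical camera interface (illustrative only, not nvTorchCam's API)."""

    def project(self, points: torch.Tensor) -> torch.Tensor:
        """Map 3D points (..., 3) in the camera frame to pixels (..., 2)."""
        raise NotImplementedError

    def unproject(self, pixels: torch.Tensor) -> torch.Tensor:
        """Map pixels (..., 2) to unit ray directions (..., 3)."""
        raise NotImplementedError


class Pinhole(Camera):
    def __init__(self, fx: float, fy: float, cx: float, cy: float):
        self.fx, self.fy, self.cx, self.cy = fx, fy, cx, cy

    def project(self, points):
        x, y, z = points.unbind(-1)
        u = self.fx * x / z + self.cx
        v = self.fy * y / z + self.cy
        return torch.stack([u, v], dim=-1)

    def unproject(self, pixels):
        u, v = pixels.unbind(-1)
        x = (u - self.cx) / self.fx
        y = (v - self.cy) / self.fy
        rays = torch.stack([x, y, torch.ones_like(x)], dim=-1)
        return rays / rays.norm(dim=-1, keepdim=True)


class Equirectangular(Camera):
    def __init__(self, width: int, height: int):
        self.width, self.height = width, height

    def project(self, points):
        x, y, z = points.unbind(-1)
        lon = torch.atan2(x, z)                          # longitude in [-pi, pi]
        lat = torch.atan2(y, torch.sqrt(x * x + z * z))  # latitude in [-pi/2, pi/2]
        u = (lon / (2 * math.pi) + 0.5) * self.width
        v = (lat / math.pi + 0.5) * self.height
        return torch.stack([u, v], dim=-1)

    def unproject(self, pixels):
        u, v = pixels.unbind(-1)
        lon = (u / self.width - 0.5) * 2 * math.pi
        lat = (v / self.height - 0.5) * math.pi
        x = torch.cos(lat) * torch.sin(lon)
        y = torch.sin(lat)
        z = torch.cos(lat) * torch.cos(lon)
        return torch.stack([x, y, z], dim=-1)


def reprojection_error(camera: Camera, points: torch.Tensor, pixels: torch.Tensor) -> torch.Tensor:
    """Camera-agnostic loss: mean pixel distance between projections and observations."""
    return (camera.project(points) - pixels).norm(dim=-1).mean()
```

Because `reprojection_error` touches the camera only through `project`, the same function works unchanged for either camera class, operates on batched tensors, and remains differentiable with respect to `points` through PyTorch autograd. This is the property the abstract describes, shown here under the simplified assumptions stated above.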