Annette Stahl

h-index12

7papers

19citations

Novelty45%

AI Score52

Ranked #34,855 of 205,806 authors (top 17%)#13,842 in CV (top 23%)

7 Papers

82.4ROMay 28Code

Caspar: CUDA Accelerator for Symbolic Programming with Adaptive Reordering

Emil Martens, Aaron Miller, Matias Varnum et al.

We present Caspar, a library that makes the power of modern GPUs more accessible in robotics and provides a state-of-the-art nonlinear GPU solver that can be applied to a wide range of different optimization problems. Caspar bridges the gap between expressive symbolic programming in Python and high-performance GPU runtimes in C++ by automatically generating optimized CUDA kernels from symbolic expressions. Building on the SymForce library, users can easily define and combine symbolic expressions, including Lie group operations, to generate custom CUDA kernels. To use Caspar as a solver, users need only define the symbolic residual functions; Caspar then uses symbolic differentiation to generate the necessary GPU kernels and interfaces to perform nonlinear optimization. In this paper, we present the core components of Caspar and showcase its performance by performing bundle adjustment on the Bundle Adjustment in the Large (BAL) dataset. We benchmark Caspar against other state-of-the-art bundle adjusters and show that it is 5 to 20 times faster than the best alternative, requires less memory, and achieves similar accuracy. This illustrates the benefit of our symbolic GPU programming approach. Caspar is released as part of SymForce and is freely available at https://github.com/symforce-org/symforce

CVJul 7, 2023Code

RGB-D Mapping and Tracking in a Plenoxel Radiance Field

Andreas L. Teigen, Yeonsoo Park, Annette Stahl et al.

The widespread adoption of Neural Radiance Fields (NeRFs) have ensured significant advances in the domain of novel view synthesis in recent years. These models capture a volumetric radiance field of a scene, creating highly convincing, dense, photorealistic models through the use of simple, differentiable rendering equations. Despite their popularity, these algorithms suffer from severe ambiguities in visual data inherent to the RGB sensor, which means that although images generated with view synthesis can visually appear very believable, the underlying 3D model will often be wrong. This considerably limits the usefulness of these models in practical applications like Robotics and Extended Reality (XR), where an accurate dense 3D reconstruction otherwise would be of significant value. In this paper, we present the vital differences between view synthesis models and 3D reconstruction models. We also comment on why a depth sensor is essential for modeling accurate geometry in general outward-facing scenes using the current paradigm of novel view synthesis methods. Focusing on the structure-from-motion task, we practically demonstrate this need by extending the Plenoxel radiance field model: Presenting an analytical differential approach for dense mapping and tracking with radiance fields based on RGB-D data without a neural network. Our method achieves state-of-the-art results in both mapping and tracking tasks, while also being faster than competing neural network-based approaches. The code is available at: https://github.com/ysus33/RGB-D_Plenoxel_Mapping_Tracking.git

13.1CVMay 18Code

Patch Ensembles for Robust Salmon Re-Identification with Weak Trajectory Labels

Espen Uri Høgstedt, Christian Schellewald, Annette Stahl et al.

Salmon re-identification in commercial net-pens is challenging due to large populations, which impose strict accuracy requirements and make large-scale labeled data acquisition infeasible. Trajectory IDs can be used as proxy labels, but this introduces trajectory-ID bias. To address these challenges, we propose a patch-based re-identification framework that fuses patch-level predictions into a salmon identity decision. A key component is the prediction of the salmon's lateral line, enabling extraction of texture-anchored patches and patch slices. To enable realistic evaluation, we introduce an experimental setup using multiple cameras placed 6 m apart, allowing the same fish to be recorded in different trajectories. This enables the construction of a cross-camera test set through manual match confirmation. Our ensemble approach outperforms the full-image baseline in same-trajectory validation (0.932 to 0.965 mAP) and cross-camera testing (0.609 to 0.860 mAP). The substantial improvements in the cross-camera setting demonstrate improved generalizability and robustness. Code and data: https://github.com/espenbh/salmon-reid-patch-ensemble.

CVNov 17, 2023

Removing Adverse Volumetric Effects From Trained Neural Radiance Fields

Andreas L. Teigen, Mauhing Yip, Victor P. Hamran et al.

While the use of neural radiance fields (NeRFs) in different challenging settings has been explored, only very recently have there been any contributions that focus on the use of NeRF in foggy environments. We argue that the traditional NeRF models are able to replicate scenes filled with fog and propose a method to remove the fog when synthesizing novel views. By calculating the global contrast of a scene, we can estimate a density threshold that, when applied, removes all visible fog. This makes it possible to use NeRF as a way of rendering clear views of objects of interest located in fog-filled environments. Additionally, to benchmark performance on such scenes, we introduce a new dataset that expands some of the original synthetic NeRF scenes through the addition of fog and natural environments. The code, dataset, and video results can be found on our project page: https://vegardskui.com/fognerf/

CVSep 30, 2025Code

A Multi-purpose Tracking Framework for Salmon Welfare Monitoring in Challenging Environments

Espen Uri Høgstedt, Christian Schellewald, Annette Stahl et al.

Computer Vision (CV)-based continuous, automated and precise salmon welfare monitoring is a key step toward reduced salmon mortality and improved salmon welfare in industrial aquaculture net pens. Available CV methods for determining welfare indicators focus on single indicators and rely on object detectors and trackers from other application areas to aid their welfare indicator calculation algorithm. This comes with a high resource demand for real-world applications, since each indicator must be calculated separately. In addition, the methods are vulnerable to difficulties in underwater salmon scenes, such as object occlusion, similar object appearance, and similar object motion. To address these challenges, we propose a flexible tracking framework that uses a pose estimation network to extract bounding boxes around salmon and their corresponding body parts, and exploits information about the body parts, through specialized modules, to tackle challenges specific to underwater salmon scenes. Subsequently, the high-detail body part tracks are employed to calculate welfare indicators. We construct two novel datasets assessing two salmon tracking challenges: salmon ID transfers in crowded scenes and salmon ID switches during turning. Our method outperforms the current state-of-the-art pedestrian tracker, BoostTrack, for both salmon tracking challenges. Additionally, we create a dataset for calculating salmon tail beat wavelength, demonstrating that our body part tracking method is well-suited for automated welfare monitoring based on tail beat analysis. Datasets and code are available at https://github.com/espenbh/BoostCompTrack.

CVFeb 25, 2025

Near-Shore Mapping for Detection and Tracking of Vessels

Nicholas Dalhaug, Annette Stahl, Rudolf Mester et al.

For an autonomous surface vessel (ASV) to dock, it must track other vessels close to the docking area. Kayaks present a particular challenge due to their proximity to the dock and relatively small size. Maritime target tracking has typically employed land masking to filter out land and the dock. However, imprecise land masking makes it difficult to track close-to-dock objects. Our approach uses Light Detection And Ranging (LiDAR) data and maps the docking area offline. The precise 3D measurements allow for precise map creation. However, the mapping could result in static, yet potentially moving, objects being mapped. We detect and filter out potentially moving objects from the LiDAR data by utilizing image data. The visual vessel detection and segmentation method is a neural network that is trained on our labeled data. Close-to-shore tracking improves with an accurate map and is demonstrated on a recently gathered real-world dataset. The dataset contains multiple sequences of a kayak and a day cruiser moving close to the dock, in a collision path with an autonomous ferry prototype.

CVFeb 15

Differential pose optimization in descriptor space -- Combining Geometric and Photometric Methods for Motion Estimation

Andreas L. Teigen, Annette Stahl, Rudolf Mester

One of the fundamental problems in computer vision is the two-frame relative pose optimization problem. Primarily, two different kinds of error values are used: photometric error and re-projection error. The selection of error value is usually directly dependent on the selection of feature paradigm, photometric features, or geometric features. It is a trade-off between accuracy, robustness, and the possibility of loop closing. We investigate a third method that combines the strengths of both paradigms into a unified approach. Using densely sampled geometric feature descriptors, we replace the photometric error with a descriptor residual from a dense set of descriptors, thereby enabling the employment of sub-pixel accuracy in differential photometric methods, along with the expressiveness of the geometric feature descriptor. Experiments show that although the proposed strategy is an interesting approach that results in accurate tracking, it ultimately does not outperform pose optimization strategies based on re-projection error despite utilizing more information. We proceed to analyze the underlying reason for this discrepancy and present the hypothesis that the descriptor similarity metric is too slowly varying and does not necessarily correspond strictly to keypoint placement accuracy.