Kasra Khosoussi

RO
h-index: 15
13 papers
255 citations
Novelty: 56%
AI Score: 41

13 Papers

CV | Dec 2, 2025 | Code
TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction

Fengyi Zhang, Tianjun Zhang, Kasra Khosoussi et al.

3D vision foundation models have shown strong generalization in reconstructing key 3D attributes from uncalibrated images through a single feed-forward pass. However, when deployed in online settings such as driving scenarios, predictions are made over temporal windows, making it non-trivial to maintain consistency across time. Recent strategies align consecutive predictions by solving for a global transformation, yet our analysis reveals their fundamental limitations in assumption validity, local alignment scope, and robustness under noisy geometry. In this work, we propose a higher-DOF and long-term alignment framework based on Thin Plate Spline, leveraging globally propagated control points to correct spatially varying inconsistencies. In addition, we adopt a point-agnostic submap registration design that is inherently robust to noisy geometry predictions. The proposed framework is fully plug-and-play, compatible with diverse 3D foundation models and camera configurations (e.g., monocular or surround-view). Extensive experiments demonstrate that our method consistently yields more coherent geometry and lower trajectory errors across multiple datasets, backbone models, and camera setups, highlighting its robustness and generality. Code is publicly available at https://github.com/Xian-Bei/TALO.

RO | Oct 2, 2021
Incremental Non-Gaussian Inference for SLAM Using Normalizing Flows

Qiangqiang Huang, Can Pu, Kasra Khosoussi et al.

This paper presents normalizing flows for incremental smoothing and mapping (NF-iSAM), a novel algorithm for inferring the full posterior distribution in SLAM problems with nonlinear measurement models and non-Gaussian factors. NF-iSAM exploits the expressive power of neural networks, and trains normalizing flows to model and sample the full posterior. By leveraging the Bayes tree, NF-iSAM enables efficient incremental updates similar to iSAM2, albeit in the more challenging non-Gaussian setting. We demonstrate the advantages of NF-iSAM over state-of-the-art point and distribution estimation algorithms using range-only SLAM problems with data association ambiguity. NF-iSAM presents superior accuracy in describing the posterior beliefs of continuous variables (e.g., position) and discrete variables (e.g., data association).

RO | May 11, 2021
NF-iSAM: Incremental Smoothing and Mapping via Normalizing Flows

Qiangqiang Huang, Can Pu, Dehann Fourie et al.

This paper presents a novel non-Gaussian inference algorithm, Normalizing Flow iSAM (NF-iSAM), for solving SLAM problems with non-Gaussian factors and/or non-linear measurement models. NF-iSAM exploits the expressive power of neural networks, and trains normalizing flows to draw samples from the joint posterior of non-Gaussian factor graphs. By leveraging the Bayes tree, NF-iSAM is able to exploit the sparsity structure of SLAM, thus enabling efficient incremental updates similar to iSAM2, albeit in the more challenging non-Gaussian setting. We demonstrate the performance of NF-iSAM and compare it against state-of-the-art algorithms such as iSAM2 (Gaussian) and mm-iSAM (non-Gaussian) on synthetic and real range-only SLAM datasets.

RO | Mar 27, 2021
Multi-Robot Distributed Semantic Mapping in Unfamiliar Environments through Online Matching of Learned Representations

Stewart Jamieson, Kaveh Fathian, Kasra Khosoussi et al.

We present a solution to multi-robot distributed semantic mapping of novel and unfamiliar environments. Most state-of-the-art semantic mapping systems are based on supervised learning algorithms that cannot classify novel observations online. While unsupervised learning algorithms can invent labels for novel observations, approaches to detect when multiple robots have independently developed their own labels for the same new class are prone to erroneous or inconsistent matches. These issues worsen as the number of robots in the system increases and prevent fusing the local maps produced by each robot into a consistent global map, which is crucial for cooperative planning and joint mission summarization. Our proposed solution overcomes these obstacles by having each robot learn an unsupervised semantic scene model online and use a multiway matching algorithm to identify consistent sets of matches between learned semantic labels belonging to different robots. Compared to the state of the art, the proposed solution produces 20-60% higher quality global maps that do not degrade even as many more local maps are fused.

RO | Jan 26, 2021
Non-Monotone Energy-Aware Information Gathering for Heterogeneous Robot Teams

Xiaoyi Cai, Brent Schlotfeldt, Kasra Khosoussi et al.

This paper considers the problem of planning trajectories for a team of sensor-equipped robots to reduce uncertainty about a dynamical process. Optimizing the trade-off between information gain and energy cost (e.g., control effort, distance travelled) is desirable but leads to a non-monotone objective function in the set of robot trajectories. Therefore, common multi-robot planning algorithms based on techniques such as coordinate descent lose their performance guarantees. Methods based on local search provide performance guarantees for optimizing a non-monotone submodular function, but require access to all robots' trajectories, making them unsuitable for distributed execution. This work proposes a distributed planning approach based on local search and shows how lazy/greedy methods can be adopted to reduce the computation and communication of the approach. We demonstrate the efficacy of the proposed method by coordinating robot teams composed of both ground and aerial vehicles with different sensing/control profiles and evaluate the algorithm's performance in two target tracking scenarios. Compared to the naive distributed execution of local search, our approach saves up to 60% communication and 80-92% computation on average when coordinating up to 10 robots, while outperforming the coordinate descent based algorithm in achieving a desirable trade-off between sensing and energy cost.

OC | Nov 9, 2019
Distributed Certifiably Correct Pose-Graph Optimization

Yulun Tian, Kasra Khosoussi, David M. Rosen et al.

This paper presents the first certifiably correct algorithm for distributed pose-graph optimization (PGO), the backbone of modern collaborative simultaneous localization and mapping (CSLAM) and camera network localization (CNL) systems. Our method is based upon a sparse semidefinite relaxation that we prove provides globally-optimal PGO solutions under moderate measurement noise (matching the guarantees enjoyed by state-of-the-art centralized methods), but is amenable to distributed optimization using the low-rank Riemannian Staircase framework. To implement the Riemannian Staircase in the distributed setting, we develop Riemannian block coordinate descent (RBCD), a novel method for (locally) minimizing a function over a product of Riemannian manifolds. We also propose the first distributed solution verification and saddle escape methods to certify the global optimality of critical points recovered via RBCD, and to descend from suboptimal critical points (if necessary). All components of our approach are inherently decentralized: they require only local communication, provide privacy protection, and are easily parallelizable. Extensive evaluations on synthetic and real-world datasets demonstrate that the proposed method correctly recovers globally optimal solutions under moderate noise, and outperforms alternative distributed techniques in terms of solution precision and convergence speed.

RO | Jul 10, 2019
A Resource-Aware Approach to Collaborative Loop Closure Detection with Provable Performance Guarantees

Yulun Tian, Kasra Khosoussi, Jonathan P. How

This paper presents resource-aware algorithms for distributed inter-robot loop closure detection for applications such as collaborative simultaneous localization and mapping (CSLAM) and distributed image retrieval. In real-world scenarios, this process is resource-intensive as it involves exchanging many observations and geometrically verifying a large number of potential matches. This poses severe challenges for small-size and low-cost robots with various operational and resource constraints that limit, e.g., energy consumption, communication bandwidth, and computation capacity. This paper proposes a framework in which robots first exchange compact queries to identify a set of potential loop closures. We then seek to select a subset of potential inter-robot loop closures for geometric verification that maximizes a monotone submodular performance metric without exceeding budgets on computation (number of geometric verifications) and communication (amount of exchanged data for geometric verification). We demonstrate that this problem is in general NP-hard, and present efficient approximation algorithms with provable performance guarantees. The proposed framework is extensively evaluated on real and synthetic datasets. A natural convex relaxation scheme is also presented to certify the near-optimal performance of the proposed framework in practice.

OC | Mar 2, 2019
Block-Coordinate Minimization for Large SDPs with Block-Diagonal Constraints

Yulun Tian, Kasra Khosoussi, Jonathan P. How

The so-called Burer-Monteiro method is a well-studied technique for solving large-scale semidefinite programs (SDPs) via low-rank factorization. The main idea is to solve rank-restricted, albeit non-convex, surrogates instead of the SDP. Recent works have shown that, in an important class of SDPs with elegant geometric structure, one can find globally optimal solutions to the SDP by finding rank-deficient second-order critical points of an unconstrained Riemannian optimization problem. Hence, in such problems, the Burer-Monteiro approach can provide a scalable and reliable alternative to interior-point methods that scale poorly. Among various Riemannian optimization methods proposed, block-coordinate minimization (BCM) is of particular interest due to its simplicity. Erdogdu et al. in their recent work proposed BCM for problems over the Cartesian product of unit spheres and provided global convergence rate estimates for the algorithm. This report extends the BCM algorithm and the global convergence rate analysis of Erdogdu et al. from problems over the Cartesian product of unit spheres to the Cartesian product of Stiefel manifolds. The latter more general setting has important applications such as synchronization over the special orthogonal (SO) and special Euclidean (SE) groups.
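
The block-coordinate idea over a product of unit spheres can be illustrated with a small sketch. It is not the report's algorithm or analysis: the cost matrix `C`, the rank `r = 3`, and the helper names `objective`, `bcm_sweep`, and `random_unit_rows` are all assumptions made for illustration. The sketch minimizes f(X) = sum_{i,j} C[i][j] <x_i, x_j> for a symmetric `C`, where each row x_i of the low-rank factor is constrained to the unit sphere. With the other rows fixed, f depends on x_i only through a linear term, so the exact block minimizer is the negated, normalized gradient direction.

```python
import math
import random


def objective(C, X):
    """f(X) = sum_{i,j} C[i][j] * <x_i, x_j>, with x_i the rows of X."""
    n = len(X)
    return sum(C[i][j] * sum(a * b for a, b in zip(X[i], X[j]))
               for i in range(n) for j in range(n))


def bcm_sweep(C, X):
    """One pass over all blocks; C is assumed symmetric.

    Each x_i is replaced by the exact minimizer of f over the unit sphere
    with every other block held fixed.
    """
    n, r = len(X), len(X[0])
    for i in range(n):
        # Restricted to x_i, f equals 2 * <x_i, g> + const,
        # where g = sum_{j != i} C[i][j] * x_j.
        g = [sum(C[i][j] * X[j][k] for j in range(n) if j != i)
             for k in range(r)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 1e-12:  # if g vanishes, every unit vector is optimal
            X[i] = [-v / norm for v in g]
    return X


def random_unit_rows(n, r, seed=0):
    """Random starting point: n unit vectors in R^r."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(r)]
        s = math.sqrt(sum(a * a for a in v))
        rows.append([a / s for a in v])
    return rows


# Hypothetical example: adjacency matrix of a 4-cycle as the cost matrix.
C = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
X = random_unit_rows(4, 3)
history = [objective(C, X)]
for _ in range(20):
    X = bcm_sweep(C, X)
    history.append(objective(C, X))
```

Because every block update is an exact minimization, the values in `history` never increase. Extending the same scheme from unit spheres to Stiefel manifolds would replace the normalization step with a projection onto matrices with orthonormal columns, which this sketch does not attempt.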

RO | Feb 6, 2019
CLEAR: A Consistent Lifting, Embedding, and Alignment Rectification Algorithm for Multi-View Data Association

Kaveh Fathian, Kasra Khosoussi, Yulun Tian et al.

Many robotics applications require alignment and fusion of observations obtained at multiple views to form a global model of the environment. Multi-way data association methods provide a mechanism to improve alignment accuracy of pairwise associations and ensure their consistency. However, existing methods that solve this computationally challenging problem are often too slow for real-time applications. Furthermore, some of the existing techniques can violate the cycle consistency principle, thus drastically reducing the fusion accuracy. This work presents the CLEAR (Consistent Lifting, Embedding, and Alignment Rectification) algorithm to address these issues. By leveraging insights from the multi-way matching and spectral graph clustering literature, CLEAR provides cycle consistent and accurate solutions in a computationally efficient manner. Numerical experiments on both synthetic and real datasets are carried out to demonstrate the scalability and superior performance of our algorithm in real-world problems. This algorithmic framework can provide significant improvement in the accuracy and efficiency of existing discrete assignment problems, which traditionally use pairwise (but potentially inconsistent) correspondences. An implementation of CLEAR is made publicly available online.
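
The cycle consistency principle mentioned above can be made concrete with a short check. This is only a sketch of the property being tested, not the CLEAR algorithm: the dictionary representation of pairwise associations and the helper names `with_inverses` and `is_cycle_consistent` are assumptions for illustration. Associations are cycle consistent when composing the matches from view i to view j and from view j to view k never contradicts the direct matches from view i to view k.

```python
from itertools import permutations


def with_inverses(matches):
    """Add the reverse association (j, i) for every given pair (i, j)."""
    full = dict(matches)
    for (i, j), m in matches.items():
        full[(j, i)] = {b: a for a, b in m.items()}
    return full


def is_cycle_consistent(matches, views):
    """Return True if i->j followed by j->k always agrees with i->k.

    matches[(i, j)] maps an item index in view i to its match in view j.
    Partial associations are allowed: unmatched items are simply absent.
    """
    for i, j, k in permutations(views, 3):
        for a, b in matches.get((i, j), {}).items():
            c = matches.get((j, k), {}).get(b)
            if c is not None and matches.get((i, k), {}).get(a) != c:
                return False
    return True


# Hypothetical example with three views of two items each.
views = [0, 1, 2]
good = with_inverses({
    (0, 1): {0: 1, 1: 0},
    (1, 2): {0: 0, 1: 1},
    (0, 2): {0: 1, 1: 0},
})
# Same pairwise matches, except view 0 -> view 2 now contradicts the chain.
bad = with_inverses({
    (0, 1): {0: 1, 1: 0},
    (1, 2): {0: 0, 1: 1},
    (0, 2): {0: 0, 1: 1},
})
```

Here `is_cycle_consistent(good, views)` holds while `is_cycle_consistent(bad, views)` does not, because item 0 of view 0 reaches item 1 of view 2 through view 1 but is matched directly to item 0.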

RO | Jan 17, 2019
Resource-Aware Algorithms for Distributed Loop Closure Detection with Provable Performance Guarantees

Yulun Tian, Kasra Khosoussi, Jonathan P. How

Inter-robot loop closure detection, e.g., for collaborative simultaneous localization and mapping (CSLAM), is a fundamental capability for many multirobot applications in GPS-denied regimes. In real-world scenarios, this is a resource-intensive process that involves exchanging observations and verifying potential matches. This poses severe challenges especially for small-size and low-cost robots with various operational and resource constraints that limit, e.g., energy consumption, communication bandwidth, and computation capacity. This paper presents resource-aware algorithms for distributed inter-robot loop closure detection. In particular, we seek to select a subset of potential inter-robot loop closures that maximizes a monotone submodular performance metric without exceeding computation and communication budgets. We demonstrate that this problem is in general NP-hard, and present efficient approximation algorithms with provable performance guarantees. A convex relaxation scheme is used to certify near-optimal performance of the proposed framework in real and synthetic SLAM benchmarks.

RO | Jun 1, 2018
Near-Optimal Budgeted Data Exchange for Distributed Loop Closure Detection

Yulun Tian, Kasra Khosoussi, Matthew Giamou et al.

Inter-robot loop closure detection is a core problem in collaborative SLAM (CSLAM). Establishing inter-robot loop closures is a resource-demanding process, during which robots must consume a substantial amount of mission-critical resources (e.g., battery and bandwidth) to exchange sensory data. However, even with the most resource-efficient techniques, the resources available onboard may be insufficient for verifying every potential loop closure. This work addresses this critical challenge by proposing a resource-adaptive framework for distributed loop closure detection. We seek to maximize task-oriented objectives subject to a budget constraint on total data transmission. This problem is in general NP-hard. We approach this problem from different perspectives and leverage existing results on monotone submodular maximization to provide efficient approximation algorithms with performance guarantees. The proposed approach is extensively evaluated using the KITTI odometry benchmark dataset and synthetic Manhattan-like datasets.
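
Budgeted maximization of a monotone submodular objective can be sketched as follows. This is a generic cost-benefit greedy heuristic shown for illustration, not the paper's algorithms or objectives. The candidate names, the coverage objective, the costs, and the function names `coverage` and `budgeted_greedy` are all hypothetical. Each candidate stands for a data exchange that would allow a set of potential loop closures to be verified, and its cost stands for the amount of data transmitted.

```python
def coverage(selected, covers):
    """Monotone submodular objective: number of distinct items covered."""
    covered = set()
    for c in selected:
        covered |= covers[c]
    return len(covered)


def budgeted_greedy(covers, cost, budget):
    """Cost-benefit greedy under a single budget constraint.

    Repeatedly adds the affordable candidate with the largest marginal gain
    per unit cost, then compares the result with the best single candidate.
    """
    selected, spent = [], 0.0
    remaining = set(covers)
    while True:
        base = coverage(selected, covers)
        best, best_ratio = None, 0.0
        for c in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + cost[c] > budget:
                continue
            gain = coverage(selected + [c], covers) - base
            ratio = gain / cost[c]
            if ratio > best_ratio:
                best, best_ratio = c, ratio
        if best is None:
            break
        selected.append(best)
        spent += cost[best]
        remaining.remove(best)
    # Safeguard: one expensive candidate may beat the whole greedy set.
    singles = [c for c in covers if cost[c] <= budget]
    if singles:
        top = max(singles, key=lambda c: coverage([c], covers))
        if coverage([top], covers) > coverage(selected, covers):
            return [top]
    return selected


# Hypothetical instance: four candidate exchanges and a budget of 4 units.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2, 3, 4, 5, 6}}
cost = {"a": 2.0, "b": 1.0, "c": 1.0, "d": 10.0}
plan = budgeted_greedy(covers, cost, budget=4.0)
```

In this instance candidate `d` covers everything but exceeds the budget, so the heuristic selects `b`, `a`, and `c`, covering five of the six items at a total cost of exactly 4 units.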

RO | Sep 19, 2017
Talk Resource-Efficiently to Me: Optimal Communication Planning for Distributed Loop Closure Detection

Matthew Giamou, Kasra Khosoussi, Jonathan P. How

Due to the distributed nature of cooperative simultaneous localization and mapping (CSLAM), detecting inter-robot loop closures necessitates sharing sensory data with other robots. A naïve approach to data sharing can easily lead to a waste of mission-critical resources. This paper investigates the logistical aspects of CSLAM. Particularly, we present a general resource-efficient communication planning framework that takes into account both the total amount of exchanged data and the induced division of labor between the participating robots. Compared to other state-of-the-art approaches, our framework is able to verify the same set of potential inter-robot loop closures while exchanging considerably less data and influencing the induced workloads. We develop a fast algorithm for finding globally optimal communication policies, and present theoretical analysis to characterize the necessary and sufficient conditions under which simpler strategies are optimal. The proposed framework is extensively evaluated with data from the KITTI odometry benchmark datasets.

RO | Nov 3, 2016
Designing Sparse Reliable Pose-Graph SLAM: A Graph-Theoretic Approach

Kasra Khosoussi, Gaurav S. Sukhatme, Shoudong Huang et al.

In this paper, we aim to design sparse D-optimal (determinant-optimal) pose-graph SLAM problems through the synthesis of sparse graphs with the maximum weighted number of spanning trees. Characterizing graphs with the maximum number of spanning trees is an open problem in general. To tackle this problem, several new theoretical results are established in this paper, including the monotone log-submodularity of the weighted number of spanning trees. By exploiting these structures, we design a complementary pair of near-optimal efficient approximation algorithms with provable guarantees. Our theoretical results are validated using random graphs and a publicly available pose-graph SLAM dataset.
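
The weighted number of spanning trees can be computed with Kirchhoff's matrix-tree theorem, which gives a feel for the quantity being maximized. The sketch below is an illustration under stated assumptions and not the paper's algorithms: the function names `weighted_spanning_trees` and `greedy_edge_selection`, the base path graph, and the candidate edges are hypothetical, and the greedy rule is one simple way to exploit a monotone objective. Exact rational arithmetic is used so that the small examples give exact counts.

```python
from fractions import Fraction


def weighted_spanning_trees(n, edges):
    """Weighted number of spanning trees of a graph on vertices 0..n-1.

    By the matrix-tree theorem this is the determinant of the weighted
    Laplacian with one row and column removed. edges holds (u, v, w), w > 0.
    """
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        w = Fraction(w)
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    M = [row[1:] for row in L[1:]]  # remove the row and column of vertex 0
    m = n - 1
    det = Fraction(1)
    for c in range(m):  # Gaussian elimination with exact fractions
        pivot = next((r for r in range(c, m) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular: the graph is disconnected
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m):
                M[r][k] -= f * M[c][k]
    return det


def greedy_edge_selection(n, base_edges, candidates, k):
    """Add up to k candidate edges, each time taking the one that yields
    the largest weighted tree count (first candidate wins ties)."""
    chosen, pool = [], list(candidates)
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda e: weighted_spanning_trees(
            n, base_edges + chosen + [e]))
        chosen.append(best)
        pool.remove(best)
    return chosen


# Hypothetical example: a 4-vertex path plus three candidate extra edges.
path = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]
candidates = [(0, 2, 1), (1, 3, 1), (0, 3, 1)]
picked = greedy_edge_selection(4, path, candidates, k=1)
```

With unit weights the path has a single spanning tree. Closing it into a 4-cycle with edge (0, 3) gives four spanning trees, while either of the other candidates gives three, so the greedy step picks (0, 3).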