Ivan Petrović

h-index32

26papers

301citations

Novelty47%

AI Score42

Ranked #60,083 of 194,257 authors (top 31%)#1,739 in RO (top 26%)

26 Papers

5.5DCApr 28

Performance and Energy Trade-Off Analysis of Hierarchical Federated Learning for Plant Disease Classification

Athanasios Papanikolaou, Athanasios Tziouvaras, Pavlos Stoikos et al.

Early detection of plant diseases is critical for improving crop productivity, while it also facilitates the foundations of precision agriculture. Recent advances in distributed deep learning have enabled plant disease classification models to be trained across geographically distributed agricultural sensing infrastructures. However, deploying such systems in large-scale Internet of Things (IoT) environments, introduces significant challenges related to computational cost, energy consumption, and system efficiency. In this paper, we present a design-space exploration of hierarchical federated learning architectures for plant disease classification, with a particular focus on the trade-offs between predictive performance and energy efficiency. We further introduce a power- and energy-aware optimization framework that enables the systematic evaluation and selection of model-aggregator configurations under varying deployment constraints. The hierarchical federated architecture organizes distributed clients through intermediate aggregation layers, reducing communication and computational overhead. We evaluate multiple convolutional neural network architectures, including EfficientNet-B0, ResNet-50, and MobileNetV3-Large, in combination with different federated aggregation strategies such as FedAvg, FedProx, and FedAvgM. Experimental results demonstrate that different model-aggregator combinations exhibit distinct performance-energy trade-offs. Consequently, we highlight configurations that achieve competitive diagnostic accuracy and significantly reduce system resource requirements.

5.0ROJan 5, 2023

A Distance-Geometric Method for Recovering Robot Joint Angles From an RGB Image

Ivan Bilić, Filip Marić, Ivan Marković et al.

Autonomous manipulation systems operating in domains where human intervention is difficult or impossible (e.g., underwater, extraterrestrial or hazardous environments) require a high degree of robustness to sensing and communication failures. Crucially, motion planning and control algorithms require a stream of accurate joint angle data provided by joint encoders, the failure of which may result in an unrecoverable loss of functionality. In this paper, we present a novel method for retrieving the joint angles of a robot manipulator using only a single RGB image of its current configuration, opening up an avenue for recovering system functionality when conventional proprioceptive sensing is unavailable. Our approach, based on a distance-geometric representation of the configuration space, exploits the knowledge of a robot's kinematic model with the goal of training a shallow neural network that performs a 2D-to-3D regression of distances associated with detected structural keypoints. It is shown that the resulting Euclidean distance matrix uniquely corresponds to the observed configuration, where joint angles can be recovered via multidimensional scaling and a simple inverse kinematics procedure. We evaluate the performance of our approach on real RGB images of a Franka Emika Panda manipulator, showing that the proposed method is efficient and exhibits solid generalization ability. Furthermore, we show that our method can be easily combined with a dense refinement technique to obtain superior results.

4.1ROMay 8, 2024

GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation

Ivan Bilić, Filip Marić, Fabio Bonsignorio et al.

In autonomous robotics, measurement of the robot's internal state and perception of its environment, including interaction with other agents such as collaborative robots, are essential. Estimating the pose of the robot arm from a single view has the potential to replace classical eye-to-hand calibration approaches and is particularly attractive for online estimation and dynamic environments. In addition to its pose, recovering the robot configuration provides a complete spatial understanding of the observed robot that can be used to anticipate the actions of other agents in advanced robotics use cases. Furthermore, this additional redundancy enables the planning and execution of recovery protocols in case of sensor failures or external disturbances. We introduce GISR - a deep configuration and robot-to-camera pose estimation method that prioritizes execution in real-time. GISR consists of two modules: (i) a geometric initialization module that efficiently computes an approximate robot pose and configuration, and (ii) a deep iterative silhouette-based refinement module that arrives at a final solution in just a few iterations. We evaluate GISR on publicly available data and show that it outperforms existing methods of the same class in terms of both speed and accuracy, and can compete with approaches that rely on ground-truth proprioception and recover only the pose.

3.6CVMay 6, 2025Code

An Active Inference Model of Covert and Overt Visual Attention

Tin Mišić, Karlo Koledić, Fabio Bonsignorio et al.

The ability to selectively attend to relevant stimuli while filtering out distractions is essential for agents that process complex, high-dimensional sensory input. This paper introduces a model of covert and overt visual attention through the framework of active inference, utilizing dynamic optimization of sensory precisions to minimize free-energy. The model determines visual sensory precisions based on both current environmental beliefs and sensory input, influencing attentional allocation in both covert and overt modalities. To test the effectiveness of the model, we analyze its behavior in the Posner cueing task and a simple target focus task using two-dimensional(2D) visual data. Reaction times are measured to investigate the interplay between exogenous and endogenous attention, as well as valid and invalid cueing. The results show that exogenous and valid cues generally lead to faster reaction times compared to endogenous and invalid cues. Furthermore, the model exhibits behavior similar to inhibition of return, where previously attended locations become suppressed after a specific cue-target onset asynchrony interval. Lastly, we investigate different aspects of overt attention and show that involuntary, reflexive saccades occur faster than intentional ones, but at the expense of adaptability.

5.2CVDec 8, 2024

GVDepth: Zero-Shot Monocular Depth Estimation for Ground Vehicles based on Probabilistic Cue Fusion

Karlo Koledić, Luka Petrović, Ivan Marković et al.

Generalizing metric monocular depth estimation presents a significant challenge due to its ill-posed nature, while the entanglement between camera parameters and depth amplifies issues further, hindering multi-dataset training and zero-shot accuracy. This challenge is particularly evident in autonomous vehicles and mobile robotics, where data is collected with fixed camera setups, limiting the geometric diversity. Yet, this context also presents an opportunity: the fixed relationship between the camera and the ground plane imposes additional perspective geometry constraints, enabling depth regression via vertical image positions of objects. However, this cue is highly susceptible to overfitting, thus we propose a novel canonical representation that maintains consistency across varied camera setups, effectively disentangling depth from specific parameters and enhancing generalization across datasets. We also propose a novel architecture that adaptively and probabilistically fuses depths estimated via object size and vertical image position cues. A comprehensive evaluation demonstrates the effectiveness of the proposed approach on five autonomous driving datasets, achieving accurate metric depth estimation for varying resolutions, aspect ratios and camera setups. Notably, we achieve comparable accuracy to existing zero-shot methods, despite training on a single dataset with a single-camera setup. Project website: https://unizgfer-lamor.github.io/gvdepth/

3.9CVDec 10, 2023

GenDepth: Generalizing Monocular Depth Estimation for Arbitrary Camera Parameters via Ground Plane Embedding

Karlo Koledić, Luka Petrović, Ivan Petrović et al.

Learning-based monocular depth estimation leverages geometric priors present in the training data to enable metric depth perception from a single image, a traditionally ill-posed problem. However, these priors are often specific to a particular domain, leading to limited generalization performance on unseen data. Apart from the well studied environmental domain gap, monocular depth estimation is also sensitive to the domain gap induced by varying camera parameters, an aspect that is often overlooked in current state-of-the-art approaches. This issue is particularly evident in autonomous driving scenarios, where datasets are typically collected with a single vehicle-camera setup, leading to a bias in the training data due to a fixed perspective geometry. In this paper, we challenge this trend and introduce GenDepth, a novel model capable of performing metric depth estimation for arbitrary vehicle-camera setups. To address the lack of data with sufficiently diverse camera parameters, we first create a bespoke synthetic dataset collected with different vehicle-camera systems. Then, we design GenDepth to simultaneously optimize two objectives: (i) equivariance to the camera parameter variations on synthetic data, (ii) transferring the learned equivariance to real-world environmental features using a single real-world dataset with a fixed vehicle-camera system. To achieve this, we propose a novel embedding of camera parameters as the ground plane depth and present a novel architecture that integrates these embeddings with adversarial domain alignment. We validate GenDepth on several autonomous driving datasets, demonstrating its state-of-the-art generalization capability for different vehicle-camera systems.

14.7ROSep 8, 2021

Recalibrating the KITTI Dataset Camera Setup for Improved Odometry Accuracy

Igor Cvišić, Ivan Marković, Ivan Petrović

Over the last decade, one of the most relevant public datasets for evaluating odometry accuracy is the KITTI dataset. Beside the quality and rich sensor setup, its success is also due to the online evaluation tool, which enables researchers to benchmark and compare algorithms. The results are evaluated on the test subset solely, without any knowledge about the ground truth, yielding unbiased, overfit free and therefore relevant validation for robot localization based on cameras, 3D laser or combination of both. However, as any sensor setup, it requires prior calibration and rectified stereo images are provided, introducing dependence on the default calibration parameters. Given that, a natural question arises if a better set of calibration parameters can be found that would yield higher odometry accuracy. In this paper, we propose a new approach for one shot calibration of the KITTI dataset multiple camera setup. The approach yields better calibration parameters, both in the sense of lower calibration reprojection errors and lower visual odometry error. We conducted experiments where we show for three different odometry algorithms, namely SOFT2, ORB-SLAM2 and VISO2, that odometry accuracy is significantly improved with the proposed calibration parameters. Moreover, our odometry, SOFT2, in conjunction with the proposed calibration method achieved the highest accuracy on the official KITTI scoreboard with 0.53% translational and 0.0009 deg/m rotational error, outperforming even 3D laser-based methods.

10.4ROSep 8, 2021Code

Convex Iteration for Distance-Geometric Inverse Kinematics

Matthew Giamou, Filip Marić, David M. Rosen et al.

Inverse kinematics (IK) is the problem of finding robot joint configurations that satisfy constraints on the position or pose of one or more end-effectors. For robots with redundant degrees of freedom, there is often an infinite, nonconvex set of solutions. The IK problem is further complicated when collision avoidance constraints are imposed by obstacles in the workspace. In general, closed-form expressions yielding feasible configurations do not exist, motivating the use of numerical solution methods. However, these approaches rely on local optimization of nonconvex problems, often requiring an accurate initialization or numerous re-initializations to converge to a valid solution. In this work, we first formulate inverse kinematics with complex workspace constraints as a convex feasibility problem whose low-rank feasible points provide exact IK solutions. We then present \texttt{CIDGIK} (Convex Iteration for Distance-Geometric Inverse Kinematics), an algorithm that solves this feasibility problem with a sequence of semidefinite programs whose objectives are designed to encourage low-rank minimizers. Our problem formulation elegantly unifies the configuration space and workspace constraints of a robot: intrinsic robot geometry and obstacle avoidance are both expressed as simple linear matrix equations and inequalities. Our experimental results for a variety of popular manipulator models demonstrate faster and more accurate convergence than a conventional nonlinear optimization-based approach, especially in environments with many obstacles.

12.8ROAug 31, 2021Code

Riemannian Optimization for Distance-Geometric Inverse Kinematics

Filip Marić, Matthew Giamou, Adam W. Hall et al.

Solving the inverse kinematics problem is a fundamental challenge in motion planning, control, and calibration for articulated robots. Kinematic models for these robots are typically parametrized by joint angles, generating a complicated mapping between the robot configuration and the end-effector pose. Alternatively, the kinematic model and task constraints can be represented using invariant distances between points attached to the robot. In this paper, we formalize the equivalence of distance-based inverse kinematics and the distance geometry problem for a large class of articulated robots and task constraints. Unlike previous approaches, we use the connection between distance geometry and low-rank matrix completion to find inverse kinematics solutions by completing a partial Euclidean distance matrix through local optimization. Furthermore, we parametrize the space of Euclidean distance matrices with the Riemannian manifold of fixed-rank Gram matrices, allowing us to leverage a variety of mature Riemannian optimization methods. Finally, we show that bound smoothing can be used to generate informed initializations without significant computational overhead, improving convergence. We demonstrate that our inverse kinematics solver achieves higher success rates than traditional techniques, and substantially outperforms them on problems that involve many workspace constraints.

16.4ROJul 10, 2021

Feature-based Event Stereo Visual Odometry

Antea Hadviger, Igor Cvišić, Ivan Marković et al.

Event-based cameras are biologically inspired sensors that output events, i.e., asynchronous pixel-wise brightness changes in the scene. Their high dynamic range and temporal resolution of a microsecond makes them more reliable than standard cameras in environments of challenging illumination and in high-speed scenarios, thus developing odometry algorithms based solely on event cameras offers exciting new possibilities for autonomous systems and robots. In this paper, we propose a novel stereo visual odometry method for event cameras based on feature detection and matching with careful feature management, while pose estimation is done by reprojection error minimization. We evaluate the performance of the proposed method on two publicly available datasets: MVSEC sequences captured by an indoor flying drone and DSEC outdoor driving sequences. MVSEC offers accurate ground truth from motion capture, while for DSEC, which does not offer ground truth, in order to obtain a reference trajectory on the standard camera frames we used our SOFT visual odometry, one of the highest ranking algorithms on the KITTI scoreboards. We compared our method to the ESVO method, which is the first and still the only stereo event odometry method, showing on par performance on the MVSEC sequences, while on the DSEC dataset ESVO, unlike our method, was unable to handle outdoor driving scenario with default parameters. Furthermore, two important advantages of our method over ESVO are that it adapts tracking frequency to the asynchronous event rate and does not require initialization.

12.8ROMar 12, 2021

A Continuous-Time Approach for 3D Radar-to-Camera Extrinsic Calibration

Emmett Wise, Juraj Peršić, Christopher Grebe et al.

Reliable operation in inclement weather is essential to the deployment of safe autonomous vehicles (AVs). Robustness and reliability can be achieved by fusing data from the standard AV sensor suite (i.e., lidars, cameras) with weather robust sensors, such as millimetre-wavelength radar. Critically, accurate sensor data fusion requires knowledge of the rigid-body transform between sensor pairs, which can be determined through the process of extrinsic calibration. A number of extrinsic calibration algorithms have been designed for 2D (planar) radar sensors - however, recently-developed, low-cost 3D millimetre-wavelength radars are set to displace their 2D counterparts in many applications. In this paper, we present a continuous-time 3D radar-to-camera extrinsic calibration algorithm that utilizes radar velocity measurements and, unlike the majority of existing techniques, does not require specialized radar retroreflectors to be present in the environment. We derive the observability properties of our formulation and demonstrate the efficacy of our algorithm through synthetic and real-world experiments.

8.9ROMar 9, 2021

A Riemannian Metric for Geometry-Aware Singularity Avoidance by Articulated Robots

Filip Marić, Luka Petrović, Marko Guberina et al.

Articulated robots such as manipulators increasingly must operate in uncertain and dynamic environments where interaction (with human coworkers, for example) is necessary. In these situations, the capacity to quickly adapt to unexpected changes in operational space constraints is essential. At certain points in a manipulator's configuration space, termed singularities, the robot loses one or more degrees of freedom (DoF) and is unable to move in specific operational space directions. The inability to move in arbitrary directions in operational space compromises adaptivity and, potentially, safety. We introduce a geometry-aware singularity index, defined using a Riemannian metric on the manifold of symmetric positive definite matrices, to provide a measure of proximity to singular configurations. We demonstrate that our index avoids some of the failure modes and difficulties inherent to other common indices. Further, we show that this index can be differentiated easily, making it compatible with local optimization approaches used for operational space control. Our experimental results establish that, for reaching and path following tasks, optimization based on our index outperforms a common manipulability maximization technique and ensures singularity-robust motions.

3.0ROJan 14, 2021

Ensemble of LSTMs and feature selection for human action prediction

Tomislav Petković, Luka Petrović, Ivan Marković et al.

As robots are becoming more and more ubiquitous in human environments, it will be necessary for robotic systems to better understand and predict human actions. However, this is not an easy task, at times not even for us humans, but based on a relatively structured set of possible actions, appropriate cues, and the right model, this problem can be computationally tackled. In this paper, we propose to use an ensemble of long-short term memory (LSTM) networks for human action prediction. To train and evaluate models, we used the MoGaze dataset - currently the most comprehensive dataset capturing poses of human joints and the human gaze. We have thoroughly analyzed the MoGaze dataset and selected a reduced set of cues for this task. Our model can predict (i) which of the labeled objects the human is going to grasp, and (ii) which of the macro locations the human is going to visit (such as table or shelf). We have exhaustively evaluated the proposed method and compared it to individual cue baselines. The results suggest that our LSTM model slightly outperforms the gaze baseline in single object picking accuracy, but achieves better accuracy in macro object prediction. Furthermore, we have also analyzed the prediction accuracy when the gaze is not used, and in this case, the LSTM model considerably outperformed the best single cue baseline

2.2RONov 10, 2020

Inverse Kinematics as Low-Rank Euclidean Distance Matrix Completion

Filip Marić, Matthew Giamou, Ivan Petrović et al.

The majority of inverse kinematics (IK) algorithms search for solutions in a configuration space defined by joint angles. However, the kinematics of many robots can also be described in terms of distances between rigidly-attached points, which collectively form a Euclidean distance matrix. This alternative geometric description of the kinematics reveals an elegant equivalence between IK and the problem of low-rank matrix completion. We use this connection to implement a novel Riemannian optimization-based solution to IK for various articulated robots with symmetric joint angle constraints.

5.7ROMay 22, 2020

Human Intention Recognition for Human Aware Planning in Integrated Warehouse Systems

Tomislav Petković, Jakub Hvězda, Tomáš Rybecký et al.

With the substantial growth of logistics businesses the need for larger and more automated warehouses increases, thus giving rise to fully robotized shop-floors with mobile robots in charge of transporting and distributing goods. However, even in fully automatized warehouse systems the need for human intervention frequently arises, whether because of maintenance or because of fulfilling specific orders, thus bringing mobile robots and humans ever closer in an integrated warehouse environment. In order to ensure smooth and efficient operation of such a warehouse, paths of both robots and humans need to be carefully planned; however, due to the possibility of humans deviating from the assigned path, this becomes an even more challenging task. Given that, the supervising system should be able to recognize human intentions and its alternative paths in real-time. In this paper, we propose a framework for human deviation detection and intention recognition which outputs the most probable paths of the humans workers and the planner that acts accordingly by replanning for robots to move out of the human's path. Experimental results demonstrate that the proposed framework increases total number of deliveries, especially human deliveries, and reduces human-robot encounters.

6.2ROAug 8, 2019

Fast Manipulability Maximization Using Continuous-Time Trajectory Optimization

Filip Marić, Oliver Limoyo, Luka Petrović et al.

A significant challenge in manipulation motion planning is to ensure agility in the face of unpredictable changes during task execution. This requires the identification and possible modification of suitable joint-space trajectories, since the joint velocities required to achieve a specific endeffector motion vary with manipulator configuration. For a given manipulator configuration, the joint space-to-task space velocity mapping is characterized by a quantity known as the manipulability index. In contrast to previous control-based approaches, we examine the maximization of manipulability during planning as a way of achieving adaptable and safe joint space-to-task space motion mappings in various scenarios. By representing the manipulator trajectory as a continuous-time Gaussian process (GP), we are able to leverage recent advances in trajectory optimization to maximize the manipulability index during trajectory generation. Moreover, the sparsity of our chosen representation reduces the typically large computational cost associated with maximizing manipulability when additional constraints exist. Results from simulation studies and experiments with a real manipulator demonstrate increases in manipulability, while maintaining smooth trajectories with more dexterous (and therefore more agile) arm configurations.

3.5ROJul 17, 2019

Stereo Event Lifetime and Disparity Estimation for Dynamic Vision Sensors

Antea Hadviger, Ivan Marković, Ivan Petrović

Event-based cameras are biologically inspired sensors that output asynchronous pixel-wise brightness changes in the scene called events. They have a high dynamic range and temporal resolution of a microsecond, opposed to standard cameras that output frames at fixed frame rates and suffer from motion blur. Forming stereo pairs of such cameras can open novel application possibilities, since for each event depth can be readily estimated; however, to fully exploit asynchronous nature of the sensor and avoid fixed time interval event accumulation, stereo event lifetime estimation should be employed. In this paper, we propose a novel method for event lifetime estimation of stereo event-cameras, allowing generation of sharp gradient images of events that serve as input to disparity estimation methods. Since a single brightness change triggers events in both event-camera sensors, we propose a method for single shot event lifetime and disparity estimation, with association via stereo matching. The proposed method is approximately twice as fast and more accurate than if lifetimes were estimated separately for each sensor and then stereo matched. Results are validated on real-world data through multiple stereo event-camera experiments.

1.8CVJul 16, 2019

Pedestrian Tracking by Probabilistic Data Association and Correspondence Embeddings

Borna Bićanić, Marin Oršić, Ivan Marković et al.

This paper studies the interplay between kinematics (position and velocity) and appearance cues for establishing correspondences in multi-target pedestrian tracking. We investigate tracking-by-detection approaches based on a deep learning detector, joint integrated probabilistic data association (JIPDA), and appearance-based tracking of deep correspondence embeddings. We first addressed the fixed-camera setup by fine-tuning a convolutional detector for accurate pedestrian detection and combining it with kinematic-only JIPDA. The resulting submission ranked first on the 3DMOT2015 benchmark. However, in sequences with a moving camera and unknown ego-motion, we achieved the best results by replacing kinematic cues with global nearest neighbor tracking of deep correspondence embeddings. We trained the embeddings by fine-tuning features from the second block of ResNet-18 using angular loss extended by a margin term. We note that integrating deep correspondence embeddings directly in JIPDA did not bring significant improvement. It appears that geometry of deep correspondence embeddings for soft data association needs further investigation in order to obtain the best from both worlds.

7.3ROApr 8, 2019

Spatio-Temporal Multisensor Calibration Based on Gaussian Processes Moving Object Tracking

Juraj Peršić, Luka Petrović, Ivan Marković et al.

Perception is one of the key abilities of autonomous mobile robotic systems, which often relies on fusion of heterogeneous sensors. Although this heterogeneity presents a challenge for sensor calibration, it is also the main prospect for reliability and robustness of autonomous systems. In this paper, we propose a method for multisensor calibration based on Gaussian processes (GPs) estimated moving object trajectories, resulting with temporal and extrinsic parameters. The appealing properties of the proposed temporal calibration method are: coordinate frame invariance, thus avoiding prior extrinsic calibration, theoretically grounded batch state estimation and interpolation using GPs, computational efficiency with O(n) complexity, leveraging data already available in autonomous robot platforms, and the end result enabling 3D point-to-point extrinsic multisensor calibration. The proposed method is validated both in simulations and real-world experiments. For real-world experiment we evaluated the method on two multisensor systems: an externally triggered stereo camera, thus having temporal ground truth readily available, and a heterogeneous combination of a camera and motion capture system. The results show that the estimated time delays are accurate up to a fraction of the fastest sensor sampling time.

2.9ROSep 21, 2018

Computationally efficient dense moving object detection based on reduced space disparity estimation

Goran Popović, Antea Hadviger, Ivan Marković et al.

Computationally efficient moving object detection and depth estimation from a stereo camera is an extremely useful tool for many computer vision applications, including robotics and autonomous driving. In this paper we show how moving objects can be densely detected by estimating disparity using an algorithm that improves complexity and accuracy of stereo matching by relying on information from previous frames. The main idea behind this approach is that by using the ego-motion estimation and the disparity map of the previous frame, we can set a prior base that enables us to reduce the complexity of the current frame disparity estimation, subsequently also detecting moving objects in the scene. For each pixel we run a Kalman filter that recursively fuses the disparity prediction and reduced space semi-global matching (SGM) measurements. The proposed algorithm has been implemented and optimized using streaming single instruction multiple data instruction set and multi-threading. Furthermore, in order to estimate the process and measurement noise as reliably as possible, we conduct extensive experiments on the KITTI suite using the ground truth obtained by the 3D laser range sensor. Concerning disparity estimation, compared to the OpenCV SGM implementation, the proposed method yields improvement on the KITTI dataset sequences in terms of both speed and accuracy.

4.2ROApr 5, 2018

Human Intention Recognition in Flexible Robotized Warehouses based on Markov Decision Processes

Tomislav Petković, Ivan Marković, Ivan Petrović

The rapid growth of e-commerce increases the need for larger warehouses and their automation, thus using robots as assistants to human workers becomes a priority. In order to operate efficiently and safely, robot assistants or the supervising system should recognize human intentions. Theory of mind (ToM) is an intuitive conception of other agents' mental state, i.e., beliefs and desires, and how they cause behavior. In this paper we present a ToM-based algorithm for human intention recognition in flexible robotized warehouses. We have placed the warehouse worker in a simulated 2D environment with three potential goals. We observe agent's actions and validate them with respect to the goal locations using a Markov decision process framework. Those observations are then processed by the proposed hidden Markov model framework which estimated agent's desires. We demonstrate that the proposed framework predicts human warehouse worker's desires in an intuitive manner and in the end we discuss the simulation results.

5.3ROMar 26, 2018

Manipulability Maximization Using Continuous-Time Gaussian Processes

Filip Marić, Oliver Limoyo, Luka Petrović et al.

A significant challenge in motion planning is to avoid being in or near \emph{singular configurations} (\textit{singularities}), that is, joint configurations that result in the loss of the ability to move in certain directions in task space. A robotic system's capacity for motion is reduced even in regions that are in close proximity to (i.e., neighbouring) a singularity. In this work we examine singularity avoidance in a motion planning context, finding trajectories which minimize proximity to singular regions, subject to constraints. We define a manipulability-based likelihood associated with singularity avoidance over a continuous trajectory representation, which we then maximize using a \textit{maximum a posteriori} (MAP) estimator. Viewing the MAP problem as inference on a factor graph, we use gradient information from interpolated states to maximize the trajectory's overall manipulability. Both qualitative and quantitative analyses of experimental data show increases in manipulability that result in smooth trajectories with visibly more dexterous arm configurations.

3.2ROAug 21, 2017

Dense Disparity Estimation in Ego-motion Reduced Search Space

Luka Fućek, Ivan Marković, Igor Cvišić et al.

Depth estimation from stereo images remains a challenge even though studied for decades. The KITTI benchmark shows that the state-of-the-art solutions offer accurate depth estimation, but are still computationally complex and often require a GPU or FPGA implementation. In this paper we aim at increasing the accuracy of depth map estimation and reducing the computational complexity by using information from previous frames. We propose to transform the disparity map of the previous frame into the current frame, relying on the estimated ego-motion, and use this map as the prediction for the Kalman filter in the disparity space. Then, we update the predicted disparity map using the newly matched one. This way we reduce disparity search space and flickering between consecutive frames, thus increasing the computational efficiency of the algorithm. In the end, we validate the proposed approach on real-world data from the KITTI benchmark suite and show that the proposed algorithm yields more accurate results, while at the same time reducing the disparity search space.

4.5ROAug 18, 2017

On wrapping the Kalman filter and estimating with the SO(2) group

Ivan Markovic, Josip Cesic, Ivan Petrovic

This paper analyzes directional tracking in 2D with the extended Kalman filter on Lie groups (LG-EKF). The study stems from the problem of tracking objects moving in 2D Euclidean space, with the observer measuring direction only, thus rendering the measurement space and object position on the circle---a non-Euclidean geometry. The problem is further inconvenienced if we need to include higher-order dynamics in the state space, like angular velocity which is a Euclidean variables. The LG-EKF offers a solution to this issue by modeling the state space as a Lie group or combination thereof, e.g., SO(2) or its combinations with Rn. In the present paper, we first derive the LG-EKF on SO(2) and subsequently show that this derivation, based on the mathematically grounded framework of filtering on Lie groups, yields the same result as heuristically wrapping the angular variable within the EKF framework. This result applies only to the SO(2) and SO(2)xRn LG-EKFs and is not intended to be extended to other Lie groups or combinations thereof. In the end, we showcase the SO(2)xR2 LG-EKF, as an example of a constant angular acceleration model, on the problem of speaker tracking with a microphone array for which real-world experiments are conducted and accuracy is evaluated with ground truth data obtained by a motion capture system.

4.5ROAug 18, 2017

Moving object tracking employing rigid body motion on matrix Lie groups

Josip Cesic, Ivan Markovic, Ivan Petrovic

In this paper we propose a novel method for estimating rigid body motion by modeling the object state directly in the space of the rigid body motion group SE(2). It has been recently observed that a noisy manoeuvring object in SE(2) exhibits banana-shaped probability density contours in its pose. For this reason, we propose and investigate two state space models for moving object tracking: (i) a direct product SE(2)xR3 and (ii) a direct product of the two rigid body motion groups SE(2)xSE(2). The first term within these two state space constructions describes the current pose of the rigid body, while the second one employs its second order dynamics, i.e., the velocities. By this, we gain the flexibility of tracking omnidirectional motion in the vein of a constant velocity model, but also accounting for the dynamics in the rotation component. Since the SE(2) group is a matrix Lie group, we solve this problem by using the extended Kalman filter on matrix Lie groups and provide a detailed derivation of the proposed filters. We analyze the performance of the filters on a large number of synthetic trajectories and compare them with (i) the extended Kalman filter based constant velocity and turn rate model and (ii) the linear Kalman filter based constant velocity model. The results show that the proposed filters outperform the other two filters on a wide spectrum of types of motion.

3.1CVOct 1, 2013

Global Localization Based on 3D Planar Surface Segments

Robert Cupec, Emmanuel Karlo Nyarko, Damir Filko et al.

Global localization of a mobile robot using planar surface segments extracted from depth images is considered. The robot's environment is represented by a topological map consisting of local models, each representing a particular location modeled by a set of planar surface segments. The discussed localization approach segments a depth image acquired by a 3D camera into planar surface segments which are then matched to model surface segments. The robot pose is estimated by the Extended Kalman Filter using surface segment pairs as measurements. The reliability and accuracy of the considered approach are experimentally evaluated using a mobile robot equipped by a Microsoft Kinect sensor.