MM · Mar 3, 2022
Audio-Visual Object Classification for Human-Robot Collaboration
A. Xompero, Y. L. Pang, T. Patten et al.
Human-robot collaboration requires the contactless estimation of the physical properties of containers manipulated by a person, for example while pouring content into a cup or moving a food box. Acoustic and visual signals can be used to estimate the physical properties of such objects, which may vary substantially in shape, material and size, and may also be occluded by the hands of the person. To facilitate comparisons and stimulate progress in solving this problem, we present the CORSMAL challenge and a dataset to assess the performance of algorithms through a set of well-defined performance scores. The tasks of the challenge are the estimation of the mass, capacity, and dimensions of the object (container), and the classification of the type and amount of its content. A novel feature of the challenge is our real-to-simulation framework for visualising and assessing the impact of estimation errors in human-to-robot handovers.
RO · Nov 20, 2020
Probabilistic Radio-Visual Active Sensing for Search and Tracking
L. Varotto, A. Cenedese, A. Cavallaro
Active Search and Tracking for search and rescue missions or collaborative mobile robotics relies on the actuation of a sensing platform to detect and localize a target. In this paper we focus on visually detecting a radio-emitting target with an aerial robot equipped with a radio receiver and a camera. Vision-based tracking provides high accuracy, but the directionality of the sensing domain may require long search times before detecting the target. Conversely, radio signals have larger coverage, but lower tracking accuracy. Thus, we design a Recursive Bayesian Estimation scheme that uses camera observations to refine radio measurements. To regulate the camera pose, we design an optimal controller whose cost function is built upon a probabilistic map. Theoretical results support the proposed algorithm, while numerical analyses show higher robustness and efficiency with respect to visual-only and radio-only baselines.
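The radio-visual fusion idea in this abstract can be illustrated with a minimal grid-based recursive Bayesian filter. This is only a sketch under stated assumptions, not the authors' method: the 1-D grid, the random-walk motion model, the Gaussian likelihoods, and every name and parameter (`predict`, `radio_update`, `camera_update`, `stay_prob`, `p_detect`, the `sigma` values) are hypothetical choices made for illustration. The wide radio likelihood stands in for "larger coverage, lower accuracy", and the camera likelihood is sharp but only informative inside its field of view.

```python
import math


def normalize(belief):
    """Rescale a non-negative list so that it sums to 1."""
    total = sum(belief)
    if total <= 0.0:
        raise ValueError("belief has no probability mass")
    return [b / total for b in belief]


def predict(belief, stay_prob=0.8):
    """Prediction step with an assumed random-walk motion model.

    The target stays in its cell with probability stay_prob, otherwise it
    moves one cell left or right (clamped at the grid borders).
    """
    n = len(belief)
    move = (1.0 - stay_prob) / 2.0
    out = [0.0] * n
    for i, b in enumerate(belief):
        out[i] += stay_prob * b
        out[max(i - 1, 0)] += move * b
        out[min(i + 1, n - 1)] += move * b
    return out


def radio_update(belief, measured_cell, sigma=8.0):
    """Radio update: wide Gaussian likelihood (large coverage, low accuracy)."""
    lik = [math.exp(-0.5 * ((i - measured_cell) / sigma) ** 2)
           for i in range(len(belief))]
    return normalize([b * l for b, l in zip(belief, lik)])


def camera_update(belief, fov, detected_cell=None,
                  p_detect=0.9, sigma=1.0, false_alarm=1e-6):
    """Camera update: sharp likelihood, informative only inside the FOV.

    fov is an inclusive (lo, hi) range of cells seen by the camera.
    With a detection, cells in the FOV get a narrow Gaussian likelihood
    around the detected cell; cells outside could only explain it as a
    false alarm. Without a detection, cells in the FOV are down-weighted
    by the miss probability and cells outside are left unchanged.
    """
    lo, hi = fov
    lik = []
    for i in range(len(belief)):
        in_fov = lo <= i <= hi
        if detected_cell is not None:
            g = math.exp(-0.5 * ((i - detected_cell) / sigma) ** 2)
            lik.append(p_detect * g if in_fov else false_alarm)
        else:
            lik.append(1.0 - p_detect if in_fov else 1.0)
    return normalize([b * l for b, l in zip(belief, lik)])


def argmax(belief):
    """Index of the most probable cell."""
    return max(range(len(belief)), key=belief.__getitem__)
```

A single cycle might start from a uniform belief, call `predict`, apply `radio_update` with the radio-derived cell, point the camera at a window around `argmax(belief)`, and then call `camera_update` with the detection result. Centring the window on the belief peak is a simple greedy stand-in; the abstract describes an optimal controller with a cost function built on the probabilistic map, whose details are not given here.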