Matteo Bianchi

RO
h-index: 3
Papers: 7
Citations: 145
Novelty: 39%
AI Score: 43

7 Papers

11.2 · CV · May 19 · Code
A Framework for Evaluating Zero-Shot Image Generation in Concept-based Explainability

Giacomo Astolfi, Matteo Bianchi, Riccardo Campi et al.

Concept-based Explainable Artificial Intelligence (XAI) interprets deep learning models using human-understandable visual features (e.g., textures or object parts) by linking internal representations to class predictions, thereby bridging the gap between low-level image data and high-level semantics. A major challenge, however, is the reliance on large sets of labeled images to represent each concept, which limits scalability. In this work, we investigate the use of zero-shot Text-to-Image (T2I) generative models as a source of synthetic concept datasets for concept-based XAI methods. Specifically, we generate concepts using predefined prompts and evaluate their faithfulness to real ones through four complementary analyses: (1) comparing synthetic vs. real concept images via concept representation similarity; (2) evaluating their intra-similarity by comparing pairs of subsets of the same concept with progressively increasing size; (3) evaluating their performance for downstream explanation tasks using relevant class images; (4) evaluating how removing a concept from tested class images affects explanations of generated concepts. While current T2I generative models promise a shortcut to concept-based XAI, our study highlights challenges and raises open questions about the use of synthetic data generated by zero-shot pipelines in model analyses. The resulting dataset is available at https://github.com/DataSciencePolimi/ZeroShot-T2I-Concepts.

CV · Nov 8, 2024 · Code
Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image Classification

Antonio De Santis, Riccardo Campi, Matteo Bianchi et al.

Convolutional Neural Networks (CNNs) have seen significant performance improvements in recent years. However, due to their size and complexity, they function as black boxes, leading to transparency concerns. State-of-the-art saliency methods generate local explanations that highlight the area in the input image where a class is identified but cannot explain how a concept of interest contributes to the prediction, which is essential for bias mitigation. On the other hand, concept-based methods, such as TCAV (Testing with Concept Activation Vectors), provide insights into how sensitive the network is to a concept, but cannot compute its attribution in a specific prediction nor show its location within the input image. This paper introduces a novel post-hoc explainability framework, Visual-TCAV, which aims to bridge the gap between these methods by providing both local and global explanations for CNN-based image classification. Visual-TCAV uses Concept Activation Vectors (CAVs) to generate saliency maps that show where concepts are recognized by the network. Moreover, it can estimate the attribution of these concepts to the output of any class using a generalization of Integrated Gradients. This framework is evaluated on popular CNN architectures, with its validity further confirmed via experiments where ground truth for explanations is known, and a comparison with TCAV. Our code is available at https://github.com/DataSciencePolimi/Visual-TCAV.
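The abstract does not say how the CAVs or the concept saliency maps are computed, so the following is only a rough sketch of the general idea, not the Visual-TCAV procedure. It assumes a CAV taken as the normalized difference between mean concept activations and mean random activations (a simplification; TCAV itself fits a linear classifier), and a concept map formed by projecting each spatial location of a layer's feature maps onto that vector. The function names `difference_of_means_cav` and `concept_map` are hypothetical.

```python
import numpy as np

def difference_of_means_cav(concept_acts, random_acts):
    """Assumed CAV: unit vector from mean random to mean concept activation.

    concept_acts, random_acts: arrays of shape (n_images, n_channels),
    e.g. spatially pooled activations of one convolutional layer.
    """
    v = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    return v / np.linalg.norm(v)

def concept_map(feature_maps, cav):
    """Project every spatial location onto the CAV and keep positive evidence.

    feature_maps: array of shape (height, width, n_channels).
    Returns an array of shape (height, width).
    """
    return np.maximum(feature_maps @ cav, 0.0)

# Toy data: the concept only excites channel 0.
concept_acts = np.tile([2.0, 0.0, 0.0], (5, 1))
random_acts = np.zeros((5, 3))
cav = difference_of_means_cav(concept_acts, random_acts)

feature_maps = np.zeros((2, 2, 3))
feature_maps[1, 1, 0] = 3.0  # channel 0 fires in the bottom-right cell
print(concept_map(feature_maps, cav))
```

In this toy case the map is non-zero only where the concept channel is active, which is the behaviour a concept saliency map is meant to show.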

LG · May 6, 2024
Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification

Matteo Bianchi, Antonio De Santis, Andrea Tocchetti et al.

Transparency and explainability in image classification are essential for establishing trust in machine learning models and detecting biases and errors. State-of-the-art explainability methods generate saliency maps to show where a specific class is identified, without providing a detailed explanation of the model's decision process. Striving to address such a need, we introduce a post-hoc method that explains the entire feature extraction process of a Convolutional Neural Network. These explanations include a layer-wise representation of the features the model extracts from the input. Such features are represented as saliency maps generated by clustering and merging similar feature maps, to which we associate a weight derived by generalizing Grad-CAM for the proposed methodology. To further enhance these explanations, we include a set of textual labels collected through a gamified crowdsourcing activity and processed using NLP techniques and Sentence-BERT. Finally, we show an approach to generate global explanations by aggregating labels across multiple images.

RO · Feb 5, 2021
Towards integrated tactile sensorimotor control in anthropomorphic soft robotic hands

Nathan F. Lepora, Andrew Stinchcombe, Chris Ford et al.

In this work, we report on the integrated sensorimotor control of the Pisa/IIT SoftHand, an anthropomorphic soft robot hand designed around the principle of adaptive synergies, with the BRL tactile fingertip (TacTip), a soft biomimetic optical tactile sensor based on the human sense of touch. Our focus is how a sense of touch can be used to control an anthropomorphic hand with one degree of actuation, based on an integration that respects the hand's mechanical functionality. We consider: (i) closed-loop tactile control to establish a light contact on an unknown held object, based on the structural similarity with an undeformed tactile image; and (ii) controlling the estimated pose of an edge feature of a held object, using a convolutional neural network approach developed for controlling other sensors in the TacTip family. Overall, this gives a foundation to endow soft robotic hands with human-like touch, with implications for autonomous grasping, manipulation, human-robot interaction and prosthetics. Supplemental video: https://youtu.be/ndsxj659bkQ

RO · Sep 8, 2016
Latest Datasets and Technologies Presented in the Workshop on Grasping and Manipulation Datasets

Matteo Bianchi, Jeannette Bohg, Yu Sun

This paper reports the activities and outcomes of the Workshop on Grasping and Manipulation Datasets, which was organized under the International Conference on Robotics and Automation (ICRA) 2016. The half-day workshop was packed with nine invited talks, 12 interactive presentations, and one panel discussion with ten panelists. This paper summarizes all the talks and presentations and recaps what was discussed in the panel session. This summary serves as a review of recent developments in data collection in grasping and manipulation. Many of the presentations describe ongoing efforts or explorations that could be completed and fully available in a year or two. The panel discussion not only commented on the current approaches, but also indicated new directions and focuses. The workshop clearly displayed the importance of quality datasets in robotics and in the robotic grasping and manipulation field. Hopefully the workshop could motivate larger efforts to create big datasets that are comparable with big datasets in other communities such as computer vision.

RO · Jun 4, 2012
Synergy-Based Hand Pose Sensing: Optimal Glove Design

Matteo Bianchi, Paolo Salaris, Antonio Bicchi

In this paper we study the problem of improving the performance of human hand pose sensing devices by exploiting knowledge of how humans most frequently use their hands in grasping tasks. In a companion paper we studied the problem of maximizing the reconstruction accuracy of the hand pose from partial and noisy data provided by any given pose sensing device (a sensorized "glove"), taking into account statistical a priori information. In this paper we consider the dual problem of how to design pose sensing devices, i.e. how and where to place sensors on a glove, to get maximum information about the actual hand posture. We study the continuous case, where individual sensing elements in the glove measure a linear combination of joint angles; the discrete case, where each measurement corresponds to a single joint angle; and the most general hybrid case, where both continuous and discrete sensing elements are available. The objective is to provide, for given a priori information and a fixed number of measurements, the optimal design that minimizes the reconstruction error on average. Solutions relying on the geometrical synergy definition as well as gradient flow-based techniques are provided. Simulations of reconstruction performance show the effectiveness of the proposed optimal design.
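To make the discrete design problem concrete, here is a minimal brute-force sketch. It is not the synergy-based or gradient-flow solution the abstract mentions. It assumes a Gaussian prior on joint angles with covariance `P`, independent sensor noise of variance `noise_var`, and uses the trace of the posterior covariance as the average squared reconstruction error. All names are hypothetical.

```python
from itertools import combinations

import numpy as np

def posterior_trace(P, joints, noise_var):
    """Expected squared reconstruction error when only `joints` are measured."""
    H = np.eye(P.shape[0])[list(joints)]           # one row per measured joint
    S = H @ P @ H.T + noise_var * np.eye(len(joints))
    P_post = P - P @ H.T @ np.linalg.solve(S, H @ P)
    return float(np.trace(P_post))

def best_discrete_design(P, n_sensors, noise_var=1e-3):
    """Exhaustively pick the joint subset with the lowest expected error."""
    candidates = combinations(range(P.shape[0]), n_sensors)
    return min(candidates, key=lambda j: posterior_trace(P, j, noise_var))

# Toy prior: three uncorrelated joints with very different variances.
P = np.diag([4.0, 1.0, 0.25])
print(best_discrete_design(P, n_sensors=1))
print(best_discrete_design(P, n_sensors=2))
```

Exhaustive search is only practical for a handful of joints, since the number of subsets grows combinatorially. That cost is one reason to look for structured solutions.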

RO · Jun 4, 2012
Synergy-based Hand Pose Sensing: Reconstruction Enhancement

Matteo Bianchi, Paolo Salaris, Antonio Bicchi

Low-cost sensing gloves for posture reconstruction provide measurements that are limited in several respects. They are generated through an imperfectly known model, are subject to noise, and may be fewer than the number of Degrees of Freedom (DoFs) of the hand. Under these conditions, direct reconstruction of the hand posture is an ill-posed problem, and performance can be very poor. This paper examines the problem of estimating the posture of a human hand using (low-cost) sensing gloves, and how to improve their performance by exploiting knowledge of how humans most frequently use their hands. To increase the accuracy of pose reconstruction without modifying the glove hardware, hence basically at no extra cost, we propose to collect, organize, and exploit information on the probabilistic distribution of human hand poses in common tasks. We discuss how a database of such a priori information can be built, represented in a hierarchy of correlation patterns or postural synergies, and fused with glove data in a consistent way, so as to provide a good hand pose reconstruction in spite of insufficient and inaccurate sensing data. Simulations and experiments on a low-cost glove are reported which demonstrate the effectiveness of the proposed techniques.
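The abstract does not spell out the estimator, so the sketch below shows only one standard way to fuse prior pose statistics with glove data. It assumes a Gaussian prior on joint angles (mean `mu`, covariance `P`) and a linear glove model `y = H x + noise` with noise covariance `R`, and it computes the linear minimum-variance estimate. The function name and the toy numbers are hypothetical.

```python
import numpy as np

def fuse_prior_with_glove(y, H, mu, P, R):
    """Linear minimum-variance estimate of joint angles from glove readings.

    y: measurements (m,), H: measurement matrix (m, n),
    mu, P: prior mean (n,) and covariance (n, n), R: noise covariance (m, m).
    """
    S = H @ P @ H.T + R                   # covariance of the measurements
    K = np.linalg.solve(S, H @ P).T       # gain, equal to P H^T S^{-1}
    x_hat = mu + K @ (y - H @ mu)         # correct the prior mean with the data
    P_post = P - K @ H @ P                # remaining uncertainty
    return x_hat, P_post

# Toy hand: three strongly correlated joints, but only joint 0 is sensed.
mu = np.zeros(3)
P = np.array([[1.0, 0.9, 0.8],
              [0.9, 1.0, 0.9],
              [0.8, 0.9, 1.0]])
H = np.array([[1.0, 0.0, 0.0]])
R = np.array([[1e-6]])

x_hat, P_post = fuse_prior_with_glove(np.array([0.5]), H, mu, P, R)
print(x_hat)
```

Because the prior correlates the joints, a single sensor on joint 0 also moves the estimates of the unmeasured joints. This is the sense in which prior knowledge compensates for having fewer measurements than DoFs.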