Align-Deform-Subtract: An Interventional Framework for Explaining Object Differences
This work addresses the need for interpretable explanations of object differences in computer vision, though the contribution appears incremental, since it builds on existing interventional and alignment methods.
The paper tackles the problem of explaining differences between two object images in terms of underlying object properties. It proposes the Align-Deform-Subtract (ADS) framework, which iteratively quantifies and removes these differences to produce disentangled error measures. Experiments on real and synthetic data demonstrate its efficacy.
Given two object images, how can we explain their differences in terms of the underlying object properties? To address this question, we propose Align-Deform-Subtract (ADS) -- an interventional framework for explaining object differences. By leveraging semantic alignments in image-space as counterfactual interventions on the underlying object properties, ADS iteratively quantifies and removes differences in object properties. The result is a set of "disentangled" error measures which explain object differences in terms of the underlying properties. Experiments on real and synthetic data illustrate the efficacy of the framework.
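To make the iterative "quantify, then remove" idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the two properties chosen (position and brightness), the centroid-based alignment, and the names `align_position`, `align_brightness`, and `explain_difference` are all assumptions made for illustration. Each step measures how far the source image differs from the target in one property, applies an image-space alignment that removes that difference, and passes the result to the next step. Whatever remains after the final subtraction is reported as a residual.

```python
import numpy as np

def centroid(img):
    """Intensity-weighted centroid (row, col) of a non-negative image."""
    rows, cols = np.indices(img.shape)
    total = img.sum()
    return np.array([(rows * img).sum() / total, (cols * img).sum() / total])

def align_position(src, tgt):
    """Hypothetical step: shift src so its centroid matches tgt's.

    Returns the shifted image and the shift magnitude in pixels.
    Uses np.roll, so content wraps around the border (toy assumption).
    """
    shift = np.rint(centroid(tgt) - centroid(src)).astype(int)
    moved = np.roll(src, shift=tuple(shift), axis=(0, 1))
    return moved, float(np.linalg.norm(shift))

def align_brightness(src, tgt):
    """Hypothetical step: rescale src so its total intensity matches tgt's.

    Returns the rescaled image and the absolute log-gain.
    """
    gain = tgt.sum() / src.sum()
    return src * gain, float(abs(np.log(gain)))

def explain_difference(src, tgt):
    """Quantify and remove one property difference at a time, then subtract."""
    errors = {}
    current = src.astype(float)
    steps = [("position", align_position), ("brightness", align_brightness)]
    for name, step in steps:
        current, errors[name] = step(current, tgt)
    # Whatever the alignments could not explain is left in the residual.
    errors["residual"] = float(np.abs(current - tgt).mean())
    return errors

# Usage: a dim square offset from a bright square of the same size.
tgt = np.zeros((16, 16))
tgt[4:8, 4:8] = 1.0
src = np.zeros((16, 16))
src[6:10, 7:11] = 0.5
errors = explain_difference(src, tgt)
print(errors)
```

In this toy case the two images differ only in position and brightness, so each difference shows up in its own error term and the residual vanishes. The order of the steps is itself a design choice in this sketch. Brightness is matched after position here, and a different ordering or a richer set of properties could attribute the differences differently.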