Is the U-Net Directional-Relationship Aware?
This work addresses the limits of contextual reasoning in CNNs, a question of interest to computer vision researchers; the contribution is incremental, as it builds on the existing U-Net architecture.
The study investigates whether a standard U-Net can learn directional relationships between objects in a segmentation task, and finds that, given enough training data and a sufficiently large receptive field, the network learns these relationships and uses them to reason.
CNNs are often assumed to be capable of using contextual information about distinct objects (such as their directional relations) inside their receptive field. However, the nature and limits of this capacity have never been explored in full. We explore one specific type of relationship, the directional one, using a standard U-Net trained to optimize a cross-entropy loss function for segmentation. We train this network on a pretext segmentation task that can only be solved by reasoning about directional relations, and show that, with enough data and a sufficiently large receptive field, it succeeds in learning the proposed task. We further explore what the network has learned by analysing scenarios where the directional relationships are perturbed, and show that the network has learned to reason using these relationships.
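To make the receptive field condition concrete, the following minimal sketch computes the theoretical receptive field of a bottleneck unit in a U-Net-style encoder. It rests on assumptions that are not taken from this work: two unpadded-or-padded 3x3 convolutions per level, 2x2 max pooling with stride 2, and no dilation. The function name `encoder_receptive_field` and its parameters are hypothetical, chosen for illustration only.

```python
def encoder_receptive_field(depth, convs_per_level=2, kernel=3, pool=2):
    """Theoretical receptive field, in input pixels along one axis, of a
    unit at the bottleneck of a U-Net-style encoder.

    Assumed layout (illustrative, not the configuration used in the paper):
    `depth` levels of `convs_per_level` stride-1 convolutions followed by a
    pooling layer, then `convs_per_level` convolutions at the bottleneck.
    """
    # Describe every layer as a (kernel size, stride) pair.
    layers = []
    for _ in range(depth):
        layers += [(kernel, 1)] * convs_per_level
        layers.append((pool, pool))
    layers += [(kernel, 1)] * convs_per_level  # bottleneck convolutions

    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent units
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the field by (k - 1) jumps
        jump *= s             # striding spreads adjacent units further apart
    return rf


if __name__ == "__main__":
    for d in range(5):
        print(f"depth={d}: receptive field = {encoder_receptive_field(d)} px")
```

Under these assumptions, two objects can only be related by a bottleneck unit if both fall within this window, which is one way to read the requirement of a "sufficiently large receptive field". The effective receptive field of a trained network is typically smaller than this theoretical bound.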