LG AI Jun 5, 2023

Learning Causal Mechanisms through Orthogonal Neural Networks

Peyman Sheikholharam Mashhadi, Slawomir Nowaczyk

arXiv:2306.03938v12.0 h-index: 22 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of modularized inference in machine intelligence. It is an incremental advance over existing disentanglement methods.

The paper tackles the problem of unsupervised learning of independent generative mechanisms from distorted data by proposing an adversarial method with orthogonalization and expert relocation, achieving significantly better separability and faster convergence.

A fundamental feature of human intelligence is the ability to infer high-level abstractions from low-level sensory data. An essential component of such inference is the ability to discover modularized generative mechanisms. Despite many efforts to use statistical learning and pattern recognition for finding disentangled factors, human intelligence arguably remains unmatched in this area. In this paper, we investigate the problem of learning, in a fully unsupervised manner, the inverses of a set of independent mechanisms from distorted data points. We postulate, and justify this claim with experimental results, that an important weakness of existing machine learning solutions lies in the insufficiency of cross-module diversification. Addressing this crucial discrepancy between human and machine intelligence is an important challenge for pattern recognition systems. To this end, our work proposes an unsupervised method that discovers and disentangles a set of independent mechanisms from unlabeled data, and learns how to invert them. A number of experts compete against each other for individual data points in an adversarial setting: the one that best inverts the (unknown) generative mechanism is the winner. We demonstrate that introducing an orthogonalization layer into the expert architectures enforces additional diversity in the outputs, leading to significantly better separability. Moreover, we propose a procedure for relocating data points between experts to further prevent any single expert from claiming multiple mechanisms. We experimentally illustrate that these techniques allow discovery and modularization of much less pronounced transformations, in addition to considerably faster convergence.
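The abstract does not spell out how the orthogonalization layer or the competition is implemented, so the following is only a minimal sketch of one plausible reading. It assumes each expert produces a flattened output vector for the same input, orthogonalizes those vectors with a QR-based Gram-Schmidt step, and assigns each data point to the expert with the highest discriminator score. The names `orthogonalize` and `assign_winners`, the array shapes, and the use of QR are illustrative assumptions, not the authors' code.

```python
import numpy as np


def orthogonalize(outputs):
    """Make the expert outputs mutually orthogonal (Gram-Schmidt via QR).

    outputs: array of shape (n_experts, dim), one flattened output per
    expert for the same input. Assumes dim >= n_experts.
    Hypothetical helper; the paper's actual layer may differ.
    """
    # QR of the transpose: columns of q form an orthonormal basis,
    # and outputs.T == q @ r with r upper triangular.
    q, r = np.linalg.qr(outputs.T)
    # r[j, j] * q[:, j] is the part of output j that is orthogonal to
    # outputs 0..j-1, i.e. the unnormalized Gram-Schmidt vector.
    return (q * np.diag(r)).T


def assign_winners(scores):
    """Winner-take-all assignment of data points to experts.

    scores: array of shape (n_points, n_experts); a higher score is assumed
    to mean the discriminator judged that expert's output more realistic.
    Returns the index of the winning expert for each data point.
    """
    return np.argmax(scores, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    expert_outputs = rng.normal(size=(3, 8))  # 3 experts, 8-dim outputs
    ortho = orthogonalize(expert_outputs)
    # Off-diagonal entries of the Gram matrix should be numerically zero.
    print(np.round(ortho @ ortho.T, 6))
    print(assign_winners(np.array([[0.1, 0.9], [0.7, 0.2]])))
```

In this sketch the first expert's output is left unchanged and each later output keeps only its component orthogonal to the earlier ones, which is one simple way to force diversity across experts. A trainable version would need a differentiable equivalent inside the network.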
