Antonio Rangel

CV
h-index15
3papers
15citations
Novelty57%
AI Score28

3 Papers

LGJul 15, 2024
Disentangling Representations through Multi-task Learning

Pantelis Vafidis, Aman Bhargava, Antonio Rangel

Intelligent perception and interaction with the world hinges on internal representations that capture its underlying structure (''disentangled'' or ''abstract'' representations). Disentangled representations serve as world models, isolating latent factors of variation in the world along approximately orthogonal directions, thus facilitating feature-based generalization. We provide experimental and theoretical results guaranteeing the emergence of disentangled representations in agents that optimally solve multi-task evidence accumulation classification tasks, canonical in the neuroscience literature. The key conceptual finding is that, by producing accurate multi-task classification estimates, a system implicitly represents a set of coordinates specifying a disentangled representation of the underlying latent state of the data it receives. The theory provides conditions for the emergence of these representations in terms of noise, number of tasks, and evidence accumulation time. We experimentally validate these predictions in RNNs trained to multi-task, which learn disentangled representations in the form of continuous attractors, leading to zero-shot out-of-distribution (OOD) generalization in predicting latent factors. We demonstrate the robustness of our framework across autoregressive architectures, decision boundary geometries and in tasks requiring classification confidence estimation. We find that transformers are particularly suited for disentangling representations, which might explain their unique world understanding abilities. Overall, our framework establishes a formal link between competence at multiple tasks and the formation of disentangled, interpretable world models in both biological and artificial systems, and helps explain why ANNs often arrive at human-interpretable concepts, and how they both may acquire exceptional zero-shot generalization capabilities.

NCSep 20, 2024
Stimulus-to-Stimulus Learning in RNNs with Cortical Inductive Biases

Pantelis Vafidis, Antonio Rangel

Animals learn to predict external contingencies from experience through a process of conditioning. A natural mechanism for conditioning is stimulus substitution, whereby the neuronal response to a stimulus with no prior behavioral significance becomes increasingly identical to that generated by a behaviorally significant stimulus it reliably predicts. We propose a recurrent neural network model of stimulus substitution which leverages two forms of inductive bias pervasive in the cortex: representational inductive bias in the form of mixed stimulus representations, and architectural inductive bias in the form of two-compartment pyramidal neurons that have been shown to serve as a fundamental unit of cortical associative learning. The properties of these neurons allow for a biologically plausible learning rule that implements stimulus substitution, utilizing only information available locally at the synapses. We show that the model generates a wide array of conditioning phenomena, and can learn large numbers of associations with an amount of training commensurate with animal experiments, without relying on parameter fine-tuning for each individual experimental task. In contrast, we show that commonly used Hebbian rules fail to learn generic stimulus-stimulus associations with mixed selectivity, and require task-specific parameter fine-tuning. Our framework highlights the importance of multi-compartment neuronal processing in the cortex, and showcases how it might confer cortical animals the evolutionary edge.

CVJan 17, 2024
Land Cover Image Classification

Antonio Rangel, Juan Terven, Diana M. Cordova-Esparza et al.

Land Cover (LC) image classification has become increasingly significant in understanding environmental changes, urban planning, and disaster management. However, traditional LC methods are often labor-intensive and prone to human error. This paper explores state-of-the-art deep learning models for enhanced accuracy and efficiency in LC analysis. We compare convolutional neural networks (CNN) against transformer-based methods, showcasing their applications and advantages in LC studies. We used EuroSAT, a patch-based LC classification data set based on Sentinel-2 satellite images and achieved state-of-the-art results using current transformer models.