Rodrigo Ramele

LG
h-index3
3papers
2citations
Novelty35%
AI Score36

3 Papers

LGJul 31, 2024Code
Black Box Meta-Learning Intrinsic Rewards

Octavio Pappalardo, Rodrigo Ramele, Juan Miguel Santos

The broader application of reinforcement learning (RL) is limited by challenges including data efficiency, generalization capability, and ability to learn in sparse-reward environments. Meta-learning has emerged as a promising approach to address these issues by optimizing components of the learning algorithm to meet desired characteristics. Additionally, a different line of work has extensively studied the use of intrinsic rewards to enhance the exploration capabilities of algorithms. This work investigates how meta-learning can improve the training signal received by RL agents. We introduce a method to learn intrinsic rewards within a reinforcement learning framework that bypasses the typical computation of meta-gradients through an optimization process by treating policy updates as black boxes. We validate our approach against training with extrinsic rewards, demonstrating its effectiveness, and additionally compare it to the use of a meta-learned advantage function. Experiments are carried out on distributions of continuous control tasks with both parametric and non-parametric variations. Furthermore, only sparse rewards are used during evaluation. Code is available at: https: //github.com/Octavio-Pappalardo/Meta-learning-rewards

LGApr 13
KL Divergence Between Gaussians: A Step-by-Step Derivation for the Variational Autoencoder Objective

Andrés Muñoz, Rodrigo Ramele

Kullback-Leibler (KL) divergence is a fundamental concept in information theory that quantifies the discrepancy between two probability distributions. In the context of Variational Autoencoders (VAEs), it serves as a central regularization term, imposing structure on the latent space and thereby enabling the model to exhibit generative capabilities. In this work, we present a detailed derivation of the closed-form expression for the KL divergence between Gaussian distributions, a case of particular importance in practical VAE implementations. Starting from the general definition for continuous random variables, we derive the expression for the univariate case and extend it to the multivariate setting under the assumption of diagonal covariance. Finally, we discuss the interpretation of each term in the resulting expression and its impact on the training dynamics of the model.

CVJan 14, 2024
A Strong Inductive Bias: Gzip for binary image classification

Marco Scilipoti, Marina Fuster, Rodrigo Ramele

Deep learning networks have become the de-facto standard in Computer Vision for industry and research. However, recent developments in their cousin, Natural Language Processing (NLP), have shown that there are areas where parameter-less models with strong inductive biases can serve as computationally cheaper and simpler alternatives. We propose such a model for binary image classification: a nearest neighbor classifier combined with a general purpose compressor like Gzip. We test and compare it against popular deep learning networks like Resnet, EfficientNet and Mobilenet and show that it achieves better accuracy and utilizes significantly less space, more than two order of magnitude, within a few-shot setting. As a result, we believe that this underlines the untapped potential of models with stronger inductive biases in few-shot scenarios.