CLMar 15, 2023
Cross-domain Sentiment Classification in SpanishLautaro Estienne, Matias Vera, Leonardo Rey Vega
Sentiment Classification is a fundamental task in the field of Natural Language Processing, and has very important academic and commercial applications. It aims to automatically predict the degree of sentiment present in a text that contains opinions and subjectivity at some level, like product and movie reviews, or tweets. This can be really difficult to accomplish, in part, because different domains of text contains different words and expressions. In addition, this difficulty increases when text is written in a non-English language due to the lack of databases and resources. As a consequence, several cross-domain and cross-language techniques are often applied to this task in order to improve the results. In this work we perform a study on the ability of a classification system trained with a large database of product reviews to generalize to different Spanish domains. Reviews were collected from the MercadoLibre website from seven Latin American countries, allowing the creation of a large and balanced dataset. Results suggest that generalization across domains is feasible though very challenging when trained with these product reviews, and can be improved by pre-training and fine-tuning the classification model.
62.5AIMay 19
\ECUAS{n}: A family of metrics for principled evaluation of uncertainty-augmented systemsLautaro Estienne, Erik Ernst, Matías Vera et al.
In high-stakes automated decision-making, access to predictive uncertainty is essential for enabling users -- human or downstream systems -- to accept or reject predictions based on application-specific cost trade-offs. Such uncertainty-augmented (UA) systems -- i.e., systems that output both predictions and uncertainty scores -- are currently being assessed in the literature in a variety of ways, using separate metrics to evaluate the predictions and the uncertainty scores, setting a cost function with a fixed rejection cost or integrating over a coverage-risk curve. We argue that these evaluation approaches are inadequate for assessing overall performance of the UA system for decision making under uncertainty and propose a novel family of metrics, \ECUAS{n}, formulated as proper scoring rules for the task of interest. The parameter $n$ controls the trade-off between the cost of incorrect predictions and imperfect uncertainties depending on the needs of the use-case. We demonstrate the advantages of the \ECUAS{n} metrics both theoretically and empirically, through experiments on diverse classification and generation datasets, including a manually annotated subset of TriviaQA.
LGJan 5, 2024Code
On the Stability of a non-hyperbolic nonlinear map with non-bounded set of non-isolated fixed points with applications to Machine LearningRoberta Hansen, Matias Vera, Lautaro Estienne et al.
This paper deals with the convergence analysis of the SUCPA (Semi Unsupervised Calibration through Prior Adaptation) algorithm, defined from a first-order non-linear difference equations, first developed to correct the scores output by a supervised machine learning classifier. The convergence analysis is addressed as a dynamical system problem, by studying the local and global stability of the nonlinear map derived from the algorithm. This map, which is defined by a composition of exponential and rational functions, turns out to be non-hyperbolic with a non-bounded set of non-isolated fixed points. Hence, a non-standard method for solving the convergence analysis is used consisting of an ad-hoc geometrical approach. For a binary classification problem (two-dimensional map), we rigorously prove that the map is globally asymptotically stable. Numerical experiments on real-world application are performed to support the theoretical results by means of two different classification problems: Sentiment Polarity performed with a Large Language Model and Cat-Dog Image classification. For a greater number of classes, the numerical evidence shows the same behavior of the algorithm, and this is illustrated with a Natural Language Inference example. The experiment codes are publicly accessible online at the following repository: https://github.com/LautaroEst/sucpa-convergence
CLJul 13, 2023
Unsupervised Calibration through Prior Adaptation for Text Classification using Large Language ModelsLautaro Estienne, Luciana Ferrer, Matías Vera et al.
A wide variety of natural language tasks are currently being addressed with large-scale language models (LLMs). These models are usually trained with a very large amount of unsupervised text data and adapted to perform a downstream natural language task using methods like fine-tuning, calibration or in-context learning. In this work, we propose an approach to adapt the prior class distribution to perform text classification tasks without the need for labelled samples and only few in-domain sample queries. The proposed approach treats the LLM as a black box, adding a stage where the model posteriors are calibrated to the task. Results show that these methods outperform the un-adapted model for different number of training shots in the prompt and a previous approach were calibration is performed without using any adaptation data.
NCMar 15, 2023
Towards an Hybrid Hodgkin-Huxley Action Potential Generation ModelLautaro Estienne
Mathematical models for the generation of the action potential can improve the understanding of physiological mechanisms that are consequence of the electrical activity in neurons. In such models, some equations involving empirically obtained functions of the membrane potential are usually defined. The best known of these models, the Hodgkin-Huxley model, is an example of this paradigm since it defines the conductances of ion channels in terms of the opening and closing rates of each type of gate present in the channels. These functions need to be derived from laboratory measurements that are often very expensive and produce little data because they involve a time-space-independent measurement of the voltage in a single channel of the cell membrane. In this work, we investigate the possibility of finding the Hodgkin-Huxley model's parametric functions using only two simple measurements (the membrane voltage as a function of time and the injected current that triggered that voltage) and applying Deep Learning methods to estimate these functions. This would result in an hybrid model of the action potential generation composed by the original Hodgkin-Huxley equations and an Artificial Neural Network that requires a small set of easy-to-perform measurements to be trained. Experiments were carried out using data generated from the original Hodgkin-Huxley model, and results show that a simple two-layer artificial neural network (ANN) architecture trained on a minimal amount of data can learn to model some of the fundamental proprieties of the action potential generation by estimating the model's rate functions.
CLJul 18, 2025
Collaborative Rational Speech Act: Pragmatic Reasoning for Multi-Turn DialogLautaro Estienne, Gabriel Ben Zenou, Nona Naderi et al.
As AI systems take on collaborative roles, they must reason about shared goals and beliefs-not just generate fluent language. The Rational Speech Act (RSA) framework offers a principled approach to pragmatic reasoning, but existing extensions face challenges in scaling to multi-turn, collaborative scenarios. In this paper, we introduce Collaborative Rational Speech Act (CRSA), an information-theoretic (IT) extension of RSA that models multi-turn dialog by optimizing a gain function adapted from rate-distortion theory. This gain is an extension of the gain model that is maximized in the original RSA model but takes into account the scenario in which both agents in a conversation have private information and produce utterances conditioned on the dialog. We demonstrate the effectiveness of CRSA on referential games and template-based doctor-patient dialogs in the medical domain. Empirical results show that CRSA yields more consistent, interpretable, and collaborative behavior than existing baselines-paving the way for more pragmatic and socially aware language agents.