Carla Vairetti

LG
h-index: 7
Papers: 5
Citations: 57
Novelty: 50%
AI Score: 27

5 Papers

LG, Oct 9, 2023
Efficient Hybrid Oversampling and Intelligent Undersampling for Imbalanced Big Data Classification

Carla Vairetti, José Luis Assadi, Sebastián Maldonado

Imbalanced classification is a well-known challenge faced by many real-world applications. This issue occurs when the distribution of the target variable is skewed, leading to a prediction bias toward the majority class. With the arrival of the Big Data era, there is a pressing need for efficient solutions to this problem. In this work, we present a novel resampling method called SMOTENN that combines intelligent undersampling and oversampling using a MapReduce framework. Both procedures are performed in the same pass over the data, which makes the technique efficient. The SMOTENN method is complemented with an efficient implementation of the neighborhoods related to the minority samples. Our experimental results show the virtues of this approach, outperforming alternative resampling techniques for small- and medium-sized datasets while achieving positive results on large datasets with reduced running times.
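The idea of pairing oversampling with neighborhood-based undersampling can be illustrated with a minimal single-machine sketch. This is not the authors' SMOTENN or its MapReduce implementation: it assumes textbook SMOTE-style interpolation between minority neighbors and a simple nearest-neighbor cleaning rule for the majority class, and the function names `smote_oversample` and `clean_majority` are hypothetical.

```python
import math
import random


def nearest(point, candidates, k):
    """Return the k candidates closest to point (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbor = rng.choice(nearest(base, others, k))
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def clean_majority(majority, minority, k=3):
    """Assumed cleaning rule: drop majority points whose k nearest
    neighbors are mostly minority samples."""
    kept = []
    for p in majority:
        pool = [(m, 1) for m in minority] + [(q, 0) for q in majority if q is not p]
        closest = sorted(pool, key=lambda item: math.dist(p, item[0]))[:k]
        if sum(label for _, label in closest) <= k // 2:
            kept.append(p)
    return kept


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0), (0.2, 0.2)]
synthetic = smote_oversample(minority, n_new=5)
cleaned = clean_majority(majority, minority)
```

In this toy data the majority point sitting inside the minority region is removed, while the synthetic points stay on segments between existing minority samples.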

LG, Oct 10, 2023
A predict-and-optimize approach to profit-driven churn prevention

Nuria Gómez-Vargas, Sebastián Maldonado, Carla Vairetti

In this paper, we introduce a novel predict-and-optimize method for profit-driven churn prevention. We frame the task of targeting customers for a retention campaign as a regret minimization problem. The main objective is to leverage individual customer lifetime values (CLVs) to ensure that only the most valuable customers are targeted. In contrast, many profit-driven strategies focus on churn probabilities while considering average CLVs. This often results in significant information loss due to data aggregation. Our proposed model aligns with the guidelines of Predict-and-Optimize (PnO) frameworks and can be efficiently solved using stochastic gradient descent methods. Results from 12 churn prediction datasets underscore the effectiveness of our approach, which achieves the best average performance compared to other well-established strategies in terms of average profit.
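The information loss from using an average CLV can be made concrete with a small regret calculation. The sketch below is an assumption-laden toy, not the paper's model: it assumes the campaign targets a fixed number of customers `k`, that the value of targeting a customer is churn probability times individual CLV, and it ignores campaign costs and retention success rates. All names and numbers are hypothetical.

```python
def top_k(scores, k):
    """Indices of the k highest-scoring customers."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def decision_value(chosen, true_values):
    """Total true value captured by targeting the chosen customers."""
    return sum(true_values[i] for i in chosen)


def regret(predicted_values, true_values, k):
    """Value lost by deciding with predictions instead of the true values."""
    best = decision_value(top_k(true_values, k), true_values)
    achieved = decision_value(top_k(predicted_values, k), true_values)
    return best - achieved


# Hypothetical customers: churn probability and individual CLV.
churn = [0.9, 0.5, 0.2]
clv = [10.0, 100.0, 400.0]
true_values = [p * v for p, v in zip(churn, clv)]

# Scoring with the average CLV ranks customers by churn probability alone.
average_clv = sum(clv) / len(clv)
scores_average = [p * average_clv for p in churn]

regret_average = regret(scores_average, true_values, k=1)
regret_individual = regret(true_values, true_values, k=1)
```

Under these assumptions, ranking with the average CLV selects the customer most likely to churn but least valuable, so its regret is positive, while ranking with individual CLVs has zero regret.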

LG, May 10, 2024
Scalable Property Valuation Models via Graph-based Deep Learning

Enrique Riveros, Carla Vairetti, Christian Wegmann et al.

This paper aims to enrich the capabilities of existing deep learning-based automated valuation models through an efficient graph representation of peer dependencies, thus capturing intricate spatial relationships. In particular, we develop two novel graph neural network models that effectively identify sequences of neighboring houses with similar features, employing different message passing algorithms. The first strategy considers standard spatial graph convolutions, while the second one utilizes transformer graph convolutions. This approach confers scalability to the modeling process. The experimental evaluation is conducted using a proprietary dataset comprising approximately 200,000 houses located in Santiago, Chile. We show that employing tailored graph neural networks significantly improves the accuracy of house price prediction, especially when utilizing transformer convolutional message passing layers.
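As a rough illustration of message passing over a graph of neighboring houses, the sketch below performs one round of mean aggregation, in which each house's feature vector is averaged with those of its neighbors. It is a simplified stand-in, not either of the paper's models: it has no learned weights, nonlinearities, or attention, and the graph, feature values, and function name are hypothetical.

```python
def message_passing_step(features, neighbors):
    """One round of mean aggregation: each node's new features are the
    average of its own features and those of its neighbors."""
    updated = {}
    for node, own in features.items():
        group = [own] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated


# Hypothetical houses with two features each (e.g. floor area, rooms).
features = {
    "a": [100.0, 2.0],
    "b": [200.0, 4.0],
    "c": [300.0, 6.0],
}
# Hypothetical adjacency: b is next to both a and c.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

smoothed = message_passing_step(features, neighbors)
```

A trainable graph convolution would additionally multiply the aggregated vector by a learned weight matrix, and a transformer-style layer would replace the plain average with attention weights over the neighbors.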

LG, May 30, 2023
OWAdapt: An adaptive loss function for deep learning using OWA operators

Sebastián Maldonado, Carla Vairetti, Katherine Jara et al.

In this paper, we propose a fuzzy adaptive loss function for enhancing deep learning performance in classification tasks. Specifically, we redefine the cross-entropy loss to effectively address class-level noise conditions, including the challenging problem of class imbalance. Our approach introduces aggregation operators, leveraging the power of fuzzy logic to improve classification accuracy. The rationale behind our proposed method lies in the iterative up-weighting of class-level components within the loss function, focusing on those with larger errors. To achieve this, we employ the ordered weighted average (OWA) operator and combine it with an adaptive scheme for gradient-based learning. Through extensive experimentation, our method outperforms other commonly used loss functions, such as the standard cross-entropy or focal loss, across various binary and multiclass classification tasks. Furthermore, we explore the influence of hyperparameters associated with the OWA operators and present a default configuration that performs well across different experimental settings.
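The role of the OWA operator can be sketched as follows: class-level loss components are sorted from largest to smallest and combined with a fixed weight vector, so that weights placed on the first positions up-weight the classes with larger errors. This is a minimal illustration of the standard OWA definition applied to per-class cross-entropy; it does not reproduce the paper's adaptive scheme or its default weights, and the example weights and function names are assumptions.

```python
import math


def class_level_losses(probabilities, labels, n_classes):
    """Average cross-entropy of the samples belonging to each class."""
    totals = [0.0] * n_classes
    counts = [0] * n_classes
    for probs, label in zip(probabilities, labels):
        totals[label] += -math.log(probs[label])
        counts[label] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def owa(values, weights):
    """Ordered weighted average: weights apply to the values after
    sorting them in descending order, not to fixed positions."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical predicted class probabilities for three samples.
probabilities = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]]
labels = [0, 0, 1]  # class 1 is the minority and is predicted poorly

losses = class_level_losses(probabilities, labels, n_classes=2)
plain_mean = owa(losses, [0.5, 0.5])   # uniform weights recover the mean
owa_loss = owa(losses, [0.75, 0.25])   # assumed weights favoring the worst class
```

With uniform weights the OWA reduces to the ordinary average of the class losses; shifting weight toward the first position moves the aggregate toward the worst-performing class.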

LG, May 16, 2023
One-step learning algorithm selection for classification via convolutional neural networks

Sebastián Maldonado, Carla Vairetti, Ignacio Figueroa

As with any task, the process of building machine learning models can benefit from prior experience. Meta-learning for classifier selection leverages knowledge about the characteristics of different datasets and/or the past performance of machine learning techniques to inform better decisions in the current modeling process. Traditional meta-learning approaches first collect metadata that describe this prior experience and then use it as input for an algorithm selection model. In this paper, however, a one-step scheme is proposed in which convolutional neural networks are trained directly on tabular datasets for binary classification. The aim is to learn the underlying structure of the data without the need to explicitly identify meta-features. Experiments with simulated datasets show that the proposed approach achieves near-perfect performance in identifying both linear and nonlinear patterns, outperforming the conventional two-step method based on meta-features. The method is further applied to real-world datasets, providing recommendations on the most suitable classifiers based on the data's inherent structure.