LG · Aug 12, 2024
Finding Patterns in Ambiguity: Interpretable Stress Testing in the Decision Boundary
Inês Gomes, Luís F. Teixeira, Jan N. van Rijn et al.
The increasing use of deep learning across various domains highlights the importance of understanding the decision-making processes of these black-box models. Recent research on the decision boundaries of deep classifiers relies on synthetic instances generated in areas of low confidence, uncovering samples that challenge both models and humans. We propose a novel approach to enhance the interpretability of deep binary classifiers by selecting representative samples from the decision boundary (prototypes) and applying post-model explanation algorithms. We evaluate the effectiveness of our approach through 2D visualizations and GradientSHAP analysis. Our experiments demonstrate the potential of the proposed method, revealing distinct, compact clusters and diverse prototypes that capture the essential features leading to low-confidence decisions. By offering a more aggregated view of deep classifiers' decision boundaries, our work contributes to the responsible development and deployment of reliable machine learning systems.
LG · Jan 28
Exploring Transformer Placement in Variational Autoencoders for Tabular Data Generation
Aníbal Silva, Moisés Santos, André Restivo et al.
Tabular data remains a challenging domain for generative models. In particular, the standard Variational Autoencoder (VAE) architecture, typically composed of multilayer perceptrons, struggles to model relationships between features, especially when handling mixed data types. In contrast, Transformers, through their attention mechanism, are better suited for capturing complex feature interactions. In this paper, we empirically investigate the impact of integrating Transformers into different components of a VAE. We conduct experiments on 57 datasets from the OpenML CC18 suite and draw two main conclusions. First, results indicate that positioning Transformers to leverage latent and decoder representations leads to a trade-off between fidelity and diversity. Second, we observe a high similarity between consecutive blocks of a Transformer in all components. In particular, in the decoder, the relationship between the input and output of a Transformer is approximately linear.
LG · Apr 25, 2024
Online Data Augmentation for Forecasting with Deep Learning
Vitor Cerqueira, Moisés Santos, Luis Roque et al.
Deep learning approaches are increasingly used to tackle forecasting tasks involving datasets with multiple univariate time series. A key factor in the successful application of these methods is a large enough training sample size, which is not always available. Synthetic data generation techniques can be applied in these scenarios to augment the dataset. Data augmentation is typically applied offline before training a model. However, when training with mini-batches, some batches may contain a disproportionate number of synthetic samples that do not align well with the original data characteristics. This work introduces an online data augmentation framework that generates synthetic samples during the training of neural networks. By creating synthetic samples for each batch alongside their original counterparts, we maintain a balanced representation between real and synthetic data throughout the training process. This approach fits naturally with the iterative nature of neural network training and eliminates the need to store large augmented datasets. We validated the proposed framework using 3797 time series from 6 benchmark datasets, three neural architectures, and seven synthetic data generation techniques. The experiments suggest that online data augmentation leads to better forecasting performance compared to offline data augmentation or no augmentation approaches. The framework and experiments are publicly available.
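The per-batch idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' framework: the `jitter` generator, the function names, and the toy data are all hypothetical, and Gaussian jittering merely stands in for the seven generation techniques the abstract mentions without naming. The sketch only shows how pairing every real window with one synthetic counterpart keeps each mini-batch balanced.

```python
import numpy as np


def jitter(batch, rng, scale=0.05):
    """Hypothetical generator: add Gaussian noise scaled by each window's std."""
    std = batch.std(axis=1, keepdims=True)
    return batch + rng.normal(0.0, 1.0, size=batch.shape) * scale * std


def online_augmented_batches(windows, batch_size, rng, generator=jitter):
    """Yield mini-batches holding each real window plus one synthetic counterpart.

    Synthetic samples are created on the fly, so no augmented dataset is stored
    and every batch is half real, half synthetic.
    """
    order = rng.permutation(len(windows))
    for start in range(0, len(windows), batch_size):
        real = windows[order[start:start + batch_size]]
        synthetic = generator(real, rng)
        # Real rows come first, their synthetic counterparts second.
        yield np.concatenate([real, synthetic], axis=0)


# Toy data (assumption): 10 training windows of length 24.
rng = np.random.default_rng(0)
windows = rng.normal(size=(10, 24))
batches = list(online_augmented_batches(windows, batch_size=4, rng=rng))
```

In a real training loop, each yielded batch would be passed to the network's update step; the generator could be swapped for any other synthetic data technique with the same signature.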
LG · Dec 6, 2024
Tabular data generation with tensor contraction layers and transformers
Aníbal Silva, André Restivo, Moisés Santos et al.
Generative modeling for tabular data has recently gained significant attention in the Deep Learning domain. Its objective is to estimate the underlying distribution of the data. However, estimating the underlying distribution of tabular data poses unique challenges. Specifically, this data modality is composed of mixed types of features, making it a non-trivial task for a model to learn the relationships between them. One approach to handling mixed feature types is to embed each feature into a continuous matrix via tokenization, while the transformer architecture offers a way to capture relationships between variables. In this work, we empirically investigate the potential of using embedding representations for tabular data generation, utilizing tensor contraction layers and transformers to model the underlying distribution of tabular data within Variational Autoencoders. Specifically, we compare four architectural approaches: a baseline VAE model, two variants that focus on tensor contraction layers and transformers respectively, and a hybrid model that integrates both techniques. Our empirical study, conducted across multiple datasets from the OpenML CC18 suite, compares models on density estimation and Machine Learning efficiency metrics. The main takeaway from our results is that leveraging embedding representations with the help of tensor contraction layers improves density estimation metrics while maintaining competitive performance in terms of machine learning efficiency.
LG · Dec 3, 2023
Enhancing Algorithm Performance Understanding through tsMorph: Generating Semi-Synthetic Time Series for Robust Forecasting Evaluation
Moisés Santos, André de Carvalho, Carlos Soares
Time series forecasting is a subject of significant scientific and industrial importance. Despite the widespread use of forecasting methods, there is a dearth of research aimed at understanding the conditions under which these methods perform well or poorly. Empirical studies, although common, are hampered by the limited availability of time series datasets, which restricts the extraction of reliable insights. To address this limitation, we present tsMorph, a tool for generating semi-synthetic time series through dataset morphing. tsMorph works by creating a sequence of datasets from two original datasets. The characteristics of the generated datasets progressively depart from those of one of the datasets and converge toward the attributes of the other dataset. This method provides a valuable alternative for obtaining substantial datasets. In this paper, we show the benefits of tsMorph by assessing the predictive performance of the Long Short-Term Memory Network and DeepAR forecasting algorithms. The time series used for the experiments come from the NN5 Competition. The experimental results provide important insights. Notably, the performances of the two algorithms improve proportionally with the frequency of the time series. These experiments confirm that tsMorph can be an effective tool for better understanding the behavior of forecasting algorithms, offering a way to overcome the limitations of empirical studies and enabling more extensive and reliable experiments.
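The morphing idea in this abstract (a sequence of datasets that gradually moves from one source to another) can be sketched as follows. The abstract does not state which transformation tsMorph applies, so the convex combination of aligned series used here is an assumption, and `morph_datasets` is a hypothetical name rather than the tool's API.

```python
import numpy as np


def morph_datasets(source, target, n_steps):
    """Return n_steps + 1 datasets moving from `source` to `target`.

    Assumption: both datasets hold the same number of equally long series,
    and each intermediate dataset is a convex combination of the two.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    if source.shape != target.shape:
        raise ValueError("source and target must have the same shape")
    # alpha = 0 reproduces the source, alpha = 1 reproduces the target.
    alphas = np.linspace(0.0, 1.0, n_steps + 1)
    return [(1.0 - a) * source + a * target for a in alphas]


# Toy data (assumption): two datasets of 3 series with 50 observations each.
rng = np.random.default_rng(42)
dataset_a = rng.normal(loc=0.0, size=(3, 50))
dataset_b = rng.normal(loc=5.0, size=(3, 50))
sequence = morph_datasets(dataset_a, dataset_b, n_steps=4)
```

A forecasting model could then be evaluated on every dataset in `sequence` to see how its error changes as the data characteristics shift from one source to the other.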