João Medrado Gondim

8.7CVSep 28, 2024Code

FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models

Diego A. B. Moreira, Alef Iury Ferreira, Jhessica Silva et al.

Despite significant advancements and pervasive use of vision-language models, a paucity of studies has addressed their ethical implications. These models typically require extensive training data, often from hastily reviewed text and image datasets, leading to highly imbalanced datasets and ethical concerns. Additionally, models initially trained in English are frequently fine-tuned for other languages, such as the CLIP model, which can be expanded with more data to enhance capabilities but can add new biases. The CAPIVARA, a CLIP-based model adapted to Portuguese, has shown strong performance in zero-shot tasks. In this paper, we evaluate four different types of discriminatory practices within visual-language models and introduce FairPIVARA, a method to reduce them by removing the most affected dimensions of feature embeddings. The application of FairPIVARA has led to a significant reduction of up to 98% in observed biases while promoting a more balanced word distribution within the model. Our model and code are available at: https://github.com/hiaac-nlp/FairPIVARA.

6.6CRSep 23, 2021

A Novel Open Set Energy-based Flow Classifier for Network Intrusion Detection

Manuela M. C. Souza, Camila Pontes, Joao Gondim et al.

Several machine learning-based Network Intrusion Detection Systems (NIDS) have been proposed in recent years. Still, most of them were developed and evaluated under the assumption that the training context is similar to the test context. This assumption is false in real networks, given the emergence of new attacks and variants of known attacks. To deal with this reality, the open set recognition field, which is the most general task of recognizing classes not seen during training in any domain, began to gain importance in machine learning based NIDS research. Yet, existing solutions are often bound to high temporal complexities and performance bottlenecks. In this work, we propose an algorithm to be used in NIDS that performs open set recognition. Our proposal is an adaptation of the single-class Energy-based Flow Classifier (EFC), which proved to be an algorithm with strong generalization capability and low computational cost. The new version of EFC correctly classifies not only known attacks, but also unknown ones, and differs from other proposals from the literature by presenting a single layer with low temporal complexity. Our proposal was evaluated against well-established multi-class algorithms and as an open set classifier. It proved to be an accurate classifier in both evaluations, similar to the state of the art. As a conclusion of our work, we consider EFC a promising algorithm to be used in NIDS for its high performance and applicability in real networks.

João Medrado Gondim

2 Papers