Diogo Fernandes Costa Silva

CL
4papers
15citations
Novelty30%
AI Score38

4 Papers

CLNov 8, 2023
Deep Learning Brasil at ABSAPT 2022: Portuguese Transformer Ensemble Approaches

Juliana Resplande Santanna Gomes, Eduardo Augusto Santos Garcia, Adalberto Ferreira Barbosa Junior et al.

Aspect-based Sentiment Analysis (ABSA) is a task whose objective is to classify the individual sentiment polarity of all entities, called aspects, in a sentence. The task is composed of two subtasks: Aspect Term Extraction (ATE), identify all aspect terms in a sentence; and Sentiment Orientation Extraction (SOE), given a sentence and its aspect terms, the task is to determine the sentiment polarity of each aspect term (positive, negative or neutral). This article presents we present our participation in Aspect-Based Sentiment Analysis in Portuguese (ABSAPT) 2022 at IberLEF 2022. We submitted the best performing systems, achieving new state-of-the-art results on both subtasks.

33.5CLMar 16
Tagarela - A Portuguese speech dataset from podcasts

Frederico Santos de Oliveira, Lucas Rafael Stefanel Gris, Alef Iury Siqueira Ferreira et al.

Despite significant advances in speech processing, Portuguese remains under-resourced due to the scarcity of public, large-scale, and high-quality datasets. To address this gap, we present a new dataset, named TAGARELA, composed of over 8,972 hours of podcast audio, specifically curated for training automatic speech recognition (ASR) and text-to-speech (TTS) models. Notably, its scale rivals English's GigaSpeech (10kh), enabling state-of-the-art Portuguese models. To ensure data quality, the corpus was subjected to an audio pre-processing pipeline and subsequently transcribed using a mixed strategy: we applied ASR models that were previously trained on high-fidelity transcriptions generated by proprietary APIs, ensuring a high level of initial accuracy. Finally, to validate the effectiveness of this new resource, we present ASR and TTS models trained exclusively on our dataset and evaluate their performance, demonstrating its potential to drive the development of more robust and natural speech technologies for Portuguese. The dataset is released publicly, available at https://freds0.github.io/TAGARELA/, to foster the development of robust speech technologies.

HCJan 29
Attention Guidance through Video Script: A Case Study of Object Focusing on 360° VR Video Tours

Paulo Vitor Santana Silva, Arthur Ricardo Sousa Vitória, Diogo Fernandes Costa Silva et al.

Within the expansive domain of virtual reality (VR), 360° VR videos immerse viewers in a spherical environment, allowing them to explore and interact with the virtual world from all angles. While this video representation offers unparalleled levels of immersion, it often lacks effective methods to guide viewers' attention toward specific elements within the virtual environment. This paper combines the models Grounding Dino and Segment Anything (SAM) to guide attention by object focusing based on video scripts. As a case study, this work conducts the experiments on a 360° video tour on the University of Reading. The experiment results show that video scripts can improve the user experience in 360° VR Videos Tour by helping in the task of directing the user's attention.

CLJan 7
MiJaBench: Revealing Minority Biases in Large Language Models via Hate Speech Jailbreaking

Iago Alves Brito, Walcy Santos Rezende Rios, Julia Soares Dollis et al.

Current safety evaluations of large language models (LLMs) create a dangerous illusion of universality, aggregating "Identity Hate" into scalar scores that mask systemic vulnerabilities against specific populations. To expose this selective safety, we introduce MiJaBench, a bilingual (English and Portuguese) adversarial benchmark comprising 44,000 prompts across 16 minority groups. By generating 528,000 prompt-response pairs from 12 state-of-the-art LLMs, we curate MiJaBench-Align, revealing that safety alignment is not a generalized semantic capability but a demographic hierarchy: defense rates fluctuate by up to 33\% within the same model solely based on the target group. Crucially, we demonstrate that model scaling exacerbates these disparities, suggesting that current alignment techniques do not create principle of non-discrimination but reinforces memorized refusal boundaries only for specific groups, challenging the current scaling laws of security. We release all datasets and scripts to encourage research into granular demographic alignment at GitHub.