Sarajane Marques Peres

h-index: 16

7 papers

54 citations

Novelty: 18%

AI Score: 28

Ranked #149,497 of 194,257 authors (top 77%); #26,267 in CL (top 85%)

7 Papers

0.9 · CL · Sep 19, 2023

Benchmarks for Pirá 2.0, a Reading Comprehension Dataset about the Ocean, the Brazilian Coast, and Climate Change

Paulo Pirozelli, Marcos M. José, Igor Silveira et al.

Pirá is a reading comprehension dataset focused on the ocean, the Brazilian coast, and climate change, built from a collection of scientific abstracts and reports on these topics. This dataset represents a versatile language resource, particularly useful for testing the ability of current machine learning models to acquire expert scientific knowledge. Despite its potential, a detailed set of baselines has not yet been developed for Pirá. By creating these baselines, researchers can more easily utilize Pirá as a resource for testing machine learning models across a wide range of question answering tasks. In this paper, we define six benchmarks over the Pirá dataset, covering closed generative question answering, machine reading comprehension, information retrieval, open question answering, answer triggering, and multiple choice question answering. As part of this effort, we have also produced a curated version of the original dataset, where we fixed a number of grammar issues, repetitions, and other shortcomings. Furthermore, the dataset has been extended in several new directions, so as to face the aforementioned benchmarks: translation of supporting texts from English into Portuguese, classification labels for answerability, automatic paraphrases of questions and answers, and multiple choice candidates. The results described in this paper provide several points of reference for researchers interested in exploring the challenges provided by the Pirá dataset.

2.5 · AI · Sep 6, 2022

The BLue Amazon Brain (BLAB): A Modular Architecture of Services about the Brazilian Maritime Territory

Paulo Pirozelli, Ais B. R. Castro, Ana Luiza C. de Oliveira et al.

We describe the first steps in the development of an artificial agent focused on the Brazilian maritime territory, a large region within the South Atlantic also known as the Blue Amazon. The "BLue Amazon Brain" (BLAB) integrates a number of services aimed at disseminating information about this region and its importance, functioning as a tool for environmental awareness. The main service provided by BLAB is a conversational facility that deals with complex questions about the Blue Amazon, called BLAB-Chat; its central component is a controller that manages several task-oriented natural language processing modules (e.g., question answering and summarizer systems). These modules have access to an internal data lake as well as to third-party databases. A news reporter (BLAB-Reporter) and a purposely-developed wiki (BLAB-Wiki) are also part of the BLAB service architecture. In this paper, we describe our current version of BLAB's architecture (interface, backend, web services, NLP modules, and resources) and comment on the challenges we have faced so far, such as the lack of training data and the scattered state of domain information. Solving these issues presents a considerable challenge in the development of artificial intelligence for technical domains.

4.9 · CL · Jan 23, 2025

A RAG-Based Institutional Assistant

Gustavo Kuratomi, Paulo Pirozelli, Fabio G. Cozman et al.

Although large language models (LLMs) demonstrate strong text generation capabilities, they struggle in scenarios requiring access to structured knowledge bases or specific documents, limiting their effectiveness in knowledge-intensive tasks. To address this limitation, retrieval-augmented generation (RAG) models have been developed, enabling generative models to incorporate relevant document fragments into their inputs. In this paper, we design and evaluate a RAG-based virtual assistant specifically tailored for the University of São Paulo. Our system architecture comprises two key modules: a retriever and a generative model. We experiment with different types of models for both components, adjusting hyperparameters such as chunk size and the number of retrieved documents. Our optimal retriever model achieves a Top-5 accuracy of 30%, while our most effective generative model scores 22.04% against ground truth answers. Notably, when the correct document chunks are supplied to the LLMs, accuracy significantly improves to 54.02%, an increase of over 30 percentage points. Conversely, without contextual input, performance declines to 13.68%. These findings highlight the critical role of database access in enhancing LLM performance. They also reveal the limitations of current semantic search methods in accurately identifying relevant documents and underscore the ongoing challenges LLMs face in generating precise responses.
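The abstract reports a Top-5 retrieval accuracy without spelling out how it is computed. A minimal sketch, assuming the common definition (a query counts as a hit when its gold chunk appears among the first k retrieved chunks), might look like the following. The function name, the argument names, and the use of string chunk ids are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_ids: Sequence[Sequence[str]],
    gold_ids: Sequence[str],
    k: int = 5,
) -> float:
    """Fraction of queries whose gold chunk id is among the first k retrieved ids.

    ranked_ids[i] is the retriever's ranking for query i (best first);
    gold_ids[i] is the id of the chunk that actually answers query i.
    Both structures are hypothetical stand-ins for a real evaluation set.
    """
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        return 0.0
    hits = sum(
        1 for ranked, gold in zip(ranked_ids, gold_ids) if gold in ranked[:k]
    )
    return hits / len(gold_ids)
```

Under this definition, a low Top-5 score means the generator often receives no relevant chunk at all, which is consistent with the gap the abstract reports between retrieved context and gold context.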

3.6 · IR · Aug 18, 2025

Text-to-SQL Oriented to the Process Mining Domain: A PT-EN Dataset for Query Translation

Bruno Yui Yamate, Thais Rodrigues Neubauer, Marcelo Fantinato et al.

This paper introduces text-2-SQL-4-PM, a bilingual (Portuguese-English) benchmark dataset designed for the text-to-SQL task in the process mining domain. Text-to-SQL conversion facilitates natural language querying of databases, increasing accessibility for users without SQL expertise and productivity for those who are experts. The text-2-SQL-4-PM dataset is customized to address the unique challenges of process mining, including specialized vocabularies and single-table relational structures derived from event logs. The dataset comprises 1,655 natural language utterances, including human-generated paraphrases, 205 SQL statements, and ten qualifiers. Methods include manual curation by experts, professional translations, and a detailed annotation process to enable nuanced analyses of task complexity. Additionally, a baseline study using GPT-3.5 Turbo demonstrates the feasibility and utility of the dataset for text-to-SQL applications. The results show that text-2-SQL-4-PM supports evaluation of text-to-SQL implementations, offering broader applicability for semantic parsing and other natural language processing tasks.
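To make the idea of a single-table event log concrete, the sketch below pairs one natural language utterance with one SQL statement and runs it against an in-memory SQLite table. The table name, column names, sample rows, utterance, and query are all hypothetical; they are not drawn from the text-2-SQL-4-PM dataset and only illustrate the kind of pair such a benchmark contains.

```python
import sqlite3


def build_event_log() -> sqlite3.Connection:
    """Create a toy single-table event log (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event_log (case_id TEXT, activity TEXT, ts TEXT)")
    conn.executemany(
        "INSERT INTO event_log VALUES (?, ?, ?)",
        [
            ("c1", "register", "2024-01-01T09:00:00"),
            ("c1", "approve", "2024-01-01T10:00:00"),
            ("c2", "register", "2024-01-02T09:30:00"),
        ],
    )
    return conn


# One utterance/SQL pair, in the spirit of a text-to-SQL benchmark entry.
UTTERANCE = "How many distinct cases are in the log?"
SQL = "SELECT COUNT(DISTINCT case_id) FROM event_log"


def run_query(conn: sqlite3.Connection, sql: str) -> list:
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(UTTERANCE, "->", run_query(build_event_log(), SQL))
```

A text-to-SQL system would be evaluated on whether the statement it generates for the utterance returns the same result as the reference SQL.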

1.1 · CL · Feb 4, 2022 · Code

Pirá: A Bilingual Portuguese-English Dataset for Question-Answering about the Ocean

André F. A. Paschoal, Paulo Pirozelli, Valdinei Freire et al.

Current research in natural language processing is highly dependent on carefully produced corpora. Most existing resources focus on English; some resources focus on languages such as Chinese and French; few resources deal with more than one language. This paper presents the Pirá dataset, a large set of questions and answers about the ocean and the Brazilian coast both in Portuguese and English. Pirá is, to the best of our knowledge, the first QA dataset with supporting texts in Portuguese, and, perhaps more importantly, the first bilingual QA dataset that includes this language. The Pirá dataset consists of 2261 properly curated question/answer (QA) sets in both languages. The QA sets were manually created based on two corpora: abstracts related to the Brazilian coast and excerpts of United Nations reports about the ocean. The QA sets were validated in a peer-review process with the dataset contributors. We discuss some of the advantages as well as limitations of Pirá, as this new resource can support a set of tasks in NLP such as question-answering, information retrieval, and machine translation.

1.7 · IR · Jul 10, 2018

Limits to Surprise in Recommender Systems

Andre Paulino de Lima, Sarajane Marques Peres

In this study, we address the challenge of measuring the ability of a recommender system to make surprising recommendations. Although current evaluation methods make it possible to determine if two algorithms can make recommendations with a significant difference in their average surprise measure, it could be of interest to our community to know how competent an algorithm is at embedding surprise in its recommendations, without having to resort to making a direct comparison with another algorithm. We argue that a) surprise is a finite resource in a recommender system, b) there is a limit to how much surprise any algorithm can embed in a recommendation, and c) this limit can provide us with a scale against which the performance of any algorithm can be measured. By exploring these ideas, it is possible to define the concepts of maximum and minimum potential surprise and design a surprise metric called "normalised surprise" that employs these limits to potential surprise. Two experiments were conducted to test the proposed metric. The aim of the first was to validate the quality of the estimates of minimum and maximum potential surprise produced by a greedy algorithm. The purpose of the second experiment was to analyse the behaviour of the proposed metric using the MovieLens dataset. The results confirmed the behaviour that was expected, and showed that the proposed surprise metric is both effective and consistent for differing choices of recommendation algorithms, data representations and distance functions.
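The abstract does not give the formula for "normalised surprise"; it only says the metric uses the minimum and maximum potential surprise as limits. One plausible reading, shown here purely as an assumption, is a min-max rescaling of the observed surprise onto the interval defined by those two limits. The function and parameter names are illustrative, and the paper's actual definition may differ.

```python
def normalised_surprise(observed: float, s_min: float, s_max: float) -> float:
    """Rescale an observed surprise value onto [0, 1] using its potential limits.

    ASSUMPTION: min-max scaling, (observed - s_min) / (s_max - s_min).
    s_min and s_max stand for the minimum and maximum potential surprise,
    which the abstract says are estimated by a greedy algorithm.
    """
    if s_max < s_min:
        raise ValueError("s_max must not be smaller than s_min")
    if s_max == s_min:
        # No room for surprise at all; report the bottom of the scale.
        return 0.0
    return (observed - s_min) / (s_max - s_min)
```

Read this way, a score near 1 would mean an algorithm embeds almost as much surprise as is achievable for that user and catalogue, which allows an algorithm to be judged without comparing it directly against another one.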

5.5 · HC · Dec 17, 2013

A Review of Temporal Aspects of Hand Gesture Analysis Applied to Discourse Analysis and Natural Conversation

Renata Cristina Barros Madeo, Priscilla Koch Wagner, Sarajane Marques Peres

Lately, there has been an increasing interest in hand gesture analysis systems. Recent works have employed pattern recognition techniques and have focused on the development of systems with more natural user interfaces. These systems may use gestures to control interfaces or recognize sign language gestures, which can provide systems with multimodal interaction; or they may consist of multimodal tools that help psycholinguists understand new aspects of discourse analysis and automate laborious tasks. Gestures are characterized by several aspects, mainly by movements and sequences of postures. Since data referring to movements or sequences carry temporal information, this paper presents a literature review about temporal aspects of hand gesture analysis, focusing on applications related to natural conversation and psycholinguistic analysis, using the Systematic Literature Review methodology. In our results, we organize works according to type of analysis, methods (highlighting the use of Machine Learning techniques), and applications.