Vinícius Segura

h-index: 6

7 papers

10 citations

Novelty: 21%

AI Score: 26

Ranked #163,406 of 194,257 authors (top 84%); #9,934 in AI (top 79%)

7 Papers

2.0 · LG · Mar 9, 2023

Position Paper on Dataset Engineering to Accelerate Science

Emilio Vital Brazil, Eduardo Soares, Lucas Villa Real et al.

Data is a critical element in any discovery process. In the last decades, we observed exponential growth in the volume of available data and the technology to manipulate it. However, data is only practical when one can structure it for a well-defined task. For instance, we need a corpus of text broken into sentences to train a natural language machine-learning model. In this work, we will use the token "dataset" to designate a structured set of data built to perform a well-defined task. Moreover, the dataset will be used in most cases as a blueprint of an entity that at any moment can be stored as a table. Specifically, in science, each area has unique ways to organize, gather and handle its datasets. We believe that datasets must be a first-class entity in any knowledge-intensive process, and all workflows should pay exceptional attention to the datasets' lifecycle, from their gathering to their uses and evolution. We advocate that science and engineering discovery processes are extreme instances of the need for such organization of datasets, calling for new approaches and tooling. Furthermore, these requirements are more evident when the discovery workflow uses artificial intelligence methods to empower the subject-matter expert. In this work, we discuss an approach to treating datasets as a critical entity in the discovery process in science. We illustrate some concepts using material discovery as a use case. We chose this domain because it involves many significant problems that can be generalized to other science fields.

3.9 · AI · Mar 9, 2023

Knowledge-augmented Risk Assessment (KaRA): a hybrid-intelligence framework for supporting knowledge-intensive risk assessment of prospect candidates

Carlos Raoni Mendes, Emilio Vital Brazil, Vinicius Segura et al.

Evaluating the potential of a prospective candidate is a common task in multiple decision-making processes in different industries. We refer to a prospect as something or someone that could potentially produce positive results in a given context, e.g., an area where an oil company could find oil, a compound that, when synthesized, results in a material with required properties, and so on. In many contexts, assessing the Probability of Success (PoS) of prospects heavily depends on experts' knowledge, often leading to biased and inconsistent assessments. We have developed the framework named KaRA (Knowledge-augmented Risk Assessment) to address these issues. It combines multiple AI techniques that consider feedback from subject matter experts (SMEs) on top of a structured domain knowledge base to support risk assessment processes of prospect candidates in knowledge-intensive contexts.

1.8 · LG · Nov 5, 2022

Toward Human-AI Co-creation to Accelerate Material Discovery

Dmitry Zubarev, Carlos Raoni Mendes, Emilio Vital Brazil et al.

There is an increasing need in our society to achieve faster advances in Science to tackle urgent problems, such as climate change, environmental hazards, sustainable energy systems, and pandemics, among others. In certain domains like chemistry, scientific discovery carries the extra burden of assessing risks of the proposed novel solutions before moving to the experimental stage. Despite several recent advances in Machine Learning and AI to address some of these challenges, there is still a gap in technologies to support end-to-end discovery applications, integrating the myriad of available technologies into a coherent, orchestrated, yet flexible discovery process. Such applications need to handle complex knowledge management at scale, enabling knowledge consumption and production in a timely and efficient way for subject matter experts (SMEs). Furthermore, the discovery of novel functional materials strongly relies on the development of exploration strategies in the chemical space. For instance, generative models have gained attention within the scientific community due to their ability to generate enormous volumes of novel molecules across material domains. These models exhibit extreme creativity that often translates into low viability of the generated candidates. In this work, we propose a workbench framework that aims at enabling human-AI co-creation to reduce the time until the first discovery and the opportunity costs involved. This framework relies on a knowledge base with domain and process knowledge, and user-interaction components to acquire knowledge and advise the SMEs. Currently, the framework supports four main activities: generative modeling, dataset triage, molecule adjudication, and risk assessment.

2.1 · AI · Apr 11, 2023

Human-AI Co-Creation Approach to Find Forever Chemicals Replacements

Juliana Jansen Ferreira, Vinícius Segura, Joana G. R. Souza et al.

Generative models are a powerful tool in AI for material discovery. We are designing a software framework that supports a human-AI co-creation process to accelerate finding replacements for the "forever chemicals": chemicals that enable our modern lives but are harmful to the environment and human health. Our approach combines AI capabilities with the domain-specific tacit knowledge of subject matter experts to accelerate material discovery. Our co-creation process starts with the interaction between the subject matter experts and a generative model that can generate new molecule designs. In this position paper, we discuss our hypothesis that these subject matter experts can benefit from a more iterative interaction with the generative model, asking for smaller samples and "guiding" the exploration of the discovery space with their knowledge.

4.1 · HC · Sep 24, 2025

How People Manage Knowledge in their "Second Brains" - A Case Study with Industry Researchers Using Obsidian

Juliana Jansen Ferreira, Vinícius Segura, Joana Gabriela Souza et al.

People face overwhelming information during work activities, necessitating effective organization and management strategies. Even in personal lives, individuals must keep, annotate, organize, and retrieve knowledge from daily routines. The collection of records for future reference is known as a personal knowledge base. Note-taking applications are valuable tools for building and maintaining these bases, often called a "second brain". This paper presents a case study on how people build and explore personal knowledge bases for various purposes. We selected the note-taking tool Obsidian and researchers from a Brazilian lab for an in-depth investigation. Our investigation reveals interesting findings about how researchers build and explore their personal knowledge bases. A key finding is that participants' knowledge retrieval strategy influences how they build and maintain their content. We suggest potential features for an AI system to support this process.

2.7 · HC · Jan 31, 2024

Making Sense of Knowledge Intensive Processes: an Oil & Gas Industry Scenario

Juliana Jansen Ferreira, Vinícius Segura, Ana Fucs et al.

Sensemaking is a constant and ongoing process by which people associate meaning with experiences. It can be an individual process, known as abduction, or a group process by which people give meaning to collective experiences. The sensemaking of a group is influenced by each person's abduction process about the experience. Every collaborative process needs some level of sensemaking to show results. For a knowledge-intensive process, sensemaking is central and related to most of its tasks. We present findings from fieldwork conducted in a knowledge-intensive process in the Oil and Gas industry. Our findings indicate that different types of knowledge can be combined to compose the result of a sensemaking process (e.g., a decision, the need for more discussion, etc.). This paper presents an initial set of knowledge types that can be combined to compose the result of the sensemaking of a collaborative decision-making process. We also discuss ideas for using systems powered by Artificial Intelligence to support sensemaking processes.

3.6 · SE · Dec 22, 2021

DevOps and Microservices in Scientific System development

Maximillien de Bayser, Vinicius Segura, Leonardo Guerreiro Azevedo et al.

There is a gap in scientific information systems development concerning modern software engineering and scientific computing. Historically, computational scientists have perceived software engineering methodologies as unwanted accidental complexity in their scientific systems development. More recent trends, like the end of Moore's law and the subsequent diversification of hardware platforms, combined with the increasing multidisciplinarity of science itself, have exacerbated the problem because self-taught "end user developers" are not familiar with the disciplines needed to tackle this increased complexity. On a more positive note, agile programming methods have brought software development practices closer to the way scientific software is produced. In this work, we present the experience of a multi-year industry research project where agile methods, microservices and DevOps were applied. Our goal is to validate the hypothesis that the use of microservices would allow computational scientists to work in the more minimalistic, prototype-oriented way that they prefer while the software engineering team would handle the integration. Hence, scientific multidisciplinary systems would gain in a twofold way: (i) Subject Matter Experts (SMEs) use their preferred tools to develop the specific scientific part of the system; (ii) software engineers provide the high-quality software code for the system delivery.