Danilo Ribeiro

CL
h-index 14
5 papers
812 citations
Novelty 40%
AI Score 41

5 Papers

CL · May 18, 2022
Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner

Danilo Ribeiro, Shen Wang, Xiaofei Ma et al. · amazon-science

Large language models have achieved high performance on various question answering (QA) benchmarks, but the explainability of their output remains elusive. Structured explanations, called entailment trees, were recently suggested as a way to explain and inspect a QA system's answer. In order to better generate such entailment trees, we propose an architecture called Iterative Retrieval-Generation Reasoner (IRGR). Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises. The IRGR model iteratively searches for suitable premises, constructing a single entailment step at a time. Contrary to previous approaches, our method combines generation steps and retrieval of premises, allowing the model to leverage intermediate conclusions, and mitigating the input size limit of baseline encoder-decoder models. We conduct experiments using the EntailmentBank dataset, where we outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300% gain in overall correctness.

CL · Feb 13, 2023
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark

Danilo Ribeiro, Shen Wang, Xiaofei Ma et al. · amazon-science

We introduce STREET, a unified multi-task and multi-domain natural language reasoning and explanation benchmark. Unlike most existing question-answering (QA) datasets, we expect models to not only answer questions, but also produce step-by-step structured explanations describing how premises in the question are used to produce intermediate conclusions that can prove the correctness of a certain answer. We perform extensive evaluation with popular language models such as few-shot prompting GPT-3 and fine-tuned T5. We find that these models still lag behind human performance when producing such structured reasoning steps. We believe this work will provide a way for the community to better train and test systems on multi-step reasoning and explanations in natural language.

SE · May 18
One Developer Is All You Need: A Case Study of an AI-Augmented One-Person Squad in a Brownfield Enterprise

Marcelo Vilas Boas, Gustavo Pinto, Edward Roberto Monteiro et al.

AI tools are enabling engineers to absorb roles previously distributed across cross-functional squads, yet there is little structured evidence on how to design or evaluate such a one-person squad in a regulated enterprise setting. Without that evidence, organizations adopting this model lack guidance on which design decisions make it viable and which conditions cause it to break down. We report a case study in which a single staff engineer, supported by four AI agents under a Spec-Driven Development workflow, delivered a brownfield product initiative scoped for a four-person squad in half the planned time, with 90% acceptance of AI-generated code on first review, full integration test pass rates, and an above-85% reduction in direct staffing cost. The results indicate that AI does not replace team members; it multiplies the throughput of the experienced engineer who remains, making specification quality and institutional knowledge, not model capability, the binding constraints on one-person squad success.

SE · May 29, 2025
Toward Effective AI Governance: A Review of Principles

Danilo Ribeiro, Thayssa Rocha, Gustavo Pinto et al.

Artificial Intelligence (AI) governance is the practice of establishing frameworks, policies, and procedures to ensure the responsible, ethical, and safe development and deployment of AI systems. Although AI governance is a core pillar of Responsible AI, current literature still lacks synthesis across such governance frameworks and practices. Objective: To identify which frameworks, principles, mechanisms, and stakeholder roles are emphasized in secondary literature on AI governance. Method: We conducted a rapid tertiary review of nine peer-reviewed secondary studies from IEEE and ACM (2020–2024), using structured inclusion criteria and thematic semantic synthesis. Results: The most cited frameworks include the EU AI Act and NIST RMF; transparency and accountability are the most common principles. Few reviews detail actionable governance mechanisms or stakeholder strategies. Conclusion: The review consolidates key directions in AI governance and highlights gaps in empirical validation and inclusivity. Findings inform both academic inquiry and practical adoption in organizations.

CL · May 5, 2023
Towards Zero-Shot Frame Semantic Parsing with Task Agnostic Ontologies and Simple Labels

Danilo Ribeiro, Omid Abdar, Jack Goetz et al.

Frame semantic parsing is an important component of task-oriented dialogue systems. Current models rely on a significant amount of training data to successfully identify the intent and slots in the user's input utterance. This creates a significant barrier for adding new domains to virtual assistant capabilities, as creation of this data requires highly specialized NLP expertise. In this work we propose OpenFSP, a framework that allows for easy creation of new domains from a handful of simple labels that can be generated without specific NLP knowledge. Our approach relies on creating a small, but expressive, set of domain agnostic slot types that enables easy annotation of new domains. Given such annotation, a matching algorithm relying on sentence encoders predicts the intent and slots for domains defined by end-users. Extensive experiments on the TopV2 dataset show that our model outperforms strong baselines in this simple labels setting.