AIJun 25, 2023Code
A Multilingual Translator to SQL with Database Schema Pruning to Improve Self-AttentionMarcelo Archanjo Jose, Fabio Gagliardi Cozman
Long sequences of text are challenging in the context of transformers, due to quadratic memory increase in the self-attention mechanism. As this issue directly affects the translation from natural language to SQL queries (as techniques usually take as input a concatenated text with the question and the database schema), we present techniques that allow long text sequences to be handled by transformers with up to 512 input tokens. We propose a training process with database schema pruning (removal of tables and columns names that are useless for the query of interest). In addition, we used a multilingual approach with the mT5-large model fine-tuned with a data-augmented Spider dataset in four languages simultaneously: English, Portuguese, Spanish, and French. Our proposed technique used the Spider dataset and increased the exact set match accuracy results from 0.718 to 0.736 in a validation dataset (Dev). Source code, evaluations, and checkpoints are available at: \underline{https://github.com/C4AI/gap-text2sql}.
AIAug 5, 2023
dPASP: A Comprehensive Differentiable Probabilistic Answer Set Programming Environment For Neurosymbolic Learning and ReasoningRenato Lui Geh, Jonas Gonçalves, Igor Cataneo Silveira et al.
We present dPASP, a novel declarative probabilistic logic programming framework for differentiable neuro-symbolic reasoning. The framework allows for the specification of discrete probabilistic models with neural predicates, logic constraints and interval-valued probabilistic choices, thus supporting models that combine low-level perception (images, texts, etc), common-sense reasoning, and (vague) statistical knowledge. To support all such features, we discuss the several semantics for probabilistic logic programs that can express nondeterministic, contradictory, incomplete and/or statistical knowledge. We also discuss how gradient-based learning can be performed with neural predicates and probabilistic choices under selected semantics. We then describe an implemented package that supports inference and learning in the language, along with several example programs. The package requires minimal user knowledge of deep learning system's inner workings, while allowing end-to-end training of rather sophisticated models and loss functions.
FLU-DYNJan 18, 2023
Augmenting a Physics-Informed Neural Network for the 2D Burgers Equation by Addition of Solution Data PointsMarlon Sproesser Mathias, Wesley Pereira de Almeida, Marcel Rodrigues de Barros et al.
We implement a Physics-Informed Neural Network (PINN) for solving the two-dimensional Burgers equations. This type of model can be trained with no previous knowledge of the solution; instead, it relies on evaluating the governing equations of the system in points of the physical domain. It is also possible to use points with a known solution during training. In this paper, we compare PINNs trained with different amounts of governing equation evaluation points and known solution points. Comparing models that were trained purely with known solution points to those that have also used the governing equations, we observe an improvement in the overall observance of the underlying physics in the latter. We also investigate how changing the number of each type of point affects the resulting models differently. Finally, we argue that the addition of the governing equations during training may provide a way to improve the overall performance of the model without relying on additional data, which is especially important for situations where the number of known solution points is limited.
AIFeb 27, 2023
Markov Conditions and Factorization in Logical Credal NetworksFabio Gagliardi Cozman
We examine the recently proposed language of Logical Credal Networks, in particular investigating the consequences of various Markov conditions. We introduce the notion of structure for a Logical Credal Network and show that a structure without directed cycles leads to a well-known factorization result. For networks with directed cycles, we analyze the differences between Markov conditions, factorization results, and specification requirements.
CLOct 7, 2021Code
mRAT-SQL+GAP:A Portuguese Text-to-SQL TransformerMarcelo Archanjo José, Fabio Gagliardi Cozman
The translation of natural language questions to SQL queries has attracted growing attention, in particular in connection with transformers and similar language models. A large number of techniques are geared towards the English language; in this work, we thus investigated translation to SQL when input questions are given in the Portuguese language. To do so, we properly adapted state-of-the-art tools and resources. We changed the RAT-SQL+GAP system by relying on a multilingual BART model (we report tests with other language models), and we produced a translated version of the Spider dataset. Our experiments expose interesting phenomena that arise when non-English languages are targeted; in particular, it is better to train with original and translated training datasets together, even if a single target language is desired. This multilingual BART model fine-tuned with a double-size training dataset (English and Portuguese) achieved 83% of the baseline, making inferences for the Portuguese test dataset. This investigation can help other researchers to produce results in Machine Learning in a language different from English. Our multilingual ready version of RAT-SQL+GAP and the data are available, open-sourced as mRAT-SQL+GAP at: https://github.com/C4AI/gap-text2sql
MLSep 2, 2025
Probabilities of Causation and Root Cause Analysis with Quasi-Markovian ModelsEduardo Rocha Laurentino, Fabio Gagliardi Cozman, Denis Deratani Maua et al.
Probabilities of causation provide principled ways to assess causal relationships but face computational challenges due to partial identifiability and latent confounding. This paper introduces both algorithmic simplifications, significantly reducing the computational complexity of calculating tighter bounds for these probabilities, and a novel methodological framework for Root Cause Analysis that systematically employs these causal metrics to rank entire causal paths.
CLDec 8, 2024
Infusing Prompts with Syntax and SemanticsAnton Bulle Labate, Fabio Gagliardi Cozman
Despite impressive success, language models often generate outputs with flawed linguistic structure. We analyze the effect of directly infusing various kinds of syntactic and semantic information into large language models. To demonstrate the value of our proposals, we focus on the translation of natural language queries to SQL, in particular dealing with languages with less resources than English, to better investigate how much help we can get from low cost syntactic and semantic information. We show that linguistic analysis can significantly boost language models, to the point that we have surpassed previous best systems.
CLJun 21, 2024
Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated TasksVictor Hugo Nascimento Rocha, Igor Cataneo Silveira, Paulo Pirozelli et al.
The recent success of Large Language Models (LLMs) has sparked concerns about their potential to spread misinformation. As a result, there is a pressing need for tools to identify ``fake arguments'' generated by such models. To create these tools, examples of texts generated by LLMs are needed. This paper introduces a methodology to obtain good, bad and ugly arguments from argumentative essays produced by ChatGPT, OpenAI's LLM. We then describe a novel dataset containing a set of diverse arguments, ArGPT. We assess the effectiveness of our dataset and establish baselines for several argumentation-related tasks. Finally, we show that the artificially generated data relates well to human argumentation and thus is useful as a tool to train and test systems for the defined tasks.
CLMar 5, 2020
An Empirical Accuracy Law for Sequential Machine Translation: the Case of Google TranslateLucas Nunes Sequeira, Bruno Moreschi, Fabio Gagliardi Cozman et al.
In this research, we have established, through empirical testing, a law that relates the number of translating hops to translation accuracy in sequential machine translation in Google Translate. Both accuracy and size decrease with the number of hops; the former displays a decrease closely following a power law. Such a law allows one to predict the behavior of translation chains that may be built as society increasingly depends on automated devices.
CLOct 22, 2018
A Fully Attention-Based Information RetrieverAlvaro Henrique Chaim Correia, Jorge Luiz Moreira Silva, Thiago de Castro Martins et al.
Recurrent neural networks are now the state-of-the-art in natural language processing because they can build rich contextual representations and process texts of arbitrary length. However, recent developments on attention mechanisms have equipped feedforward networks with similar capabilities, hence enabling faster computations due to the increase in the number of operations that can be parallelized. We explore this new type of architecture in the domain of question-answering and propose a novel approach that we call Fully Attention Based Information Retriever (FABIR). We show that FABIR achieves competitive results in the Stanford Question Answering Dataset (SQuAD) while having fewer parameters and being faster at both learning and inference than rival methods.
AIJun 20, 2018
Interpreting Embedding Models of Knowledge Bases: A Pedagogical ApproachArthur Colombini Gusmão, Alvaro Henrique Chaim Correia, Glauber De Bona et al.
Knowledge bases are employed in a variety of applications from natural language processing to semantic web search; alas, in practice their usefulness is hurt by their incompleteness. Embedding models attain state-of-the-art accuracy in knowledge base completion, but their predictions are notoriously hard to interpret. In this paper, we adapt "pedagogical approaches" (from the literature on neural networks) so as to interpret embedding models by extracting weighted Horn rules from them. We show how pedagogical approaches have to be adapted to take upon the large-scale relational aspects of knowledge bases and show experimentally their strengths and weaknesses.
AIJan 31, 2017
On the Semantics and Complexity of Probabilistic Logic ProgramsFabio Gagliardi Cozman, Denis Deratani Mauá
We examine the meaning and the complexity of probabilistic logic programs that consist of a set of rules and a set of independent probabilistic facts (that is, programs based on Sato's distribution semantics). We focus on two semantics, respectively based on stable and on well-founded models. We show that the semantics based on stable models (referred to as the "credal semantics") produces sets of probability models that dominate infinitely monotone Choquet capacities, we describe several useful consequences of this result. We then examine the complexity of inference with probabilistic logic programs. We distinguish between the complexity of inference when a probabilistic program and a query are given (the inferential complexity), and the complexity of inference when the probabilistic program is fixed and the query is given (the query complexity, akin to data complexity as used in database theory). We obtain results on the inferential and query complexity for acyclic, stratified, and cyclic propositional and relational programs, complexity reaches various levels of the counting hierarchy and even exponential levels.
AIDec 4, 2016
The Complexity of Bayesian Networks Specified by Propositional and Relational LanguagesFabio Gagliardi Cozman, Denis Deratani Mauá
We examine the complexity of inference in Bayesian networks specified by logical languages. We consider representations that range from fragments of propositional logic to function-free first-order logic with equality; in doing so we cover a variety of plate models and of probabilistic relational models. We study the complexity of inferences when network, query and domain are the input (the inferential and the combined complexity), when the network is fixed and query and domain are the input (the query/data complexity), and when the network and query are fixed and the domain is the input (the domain complexity). We draw connections with probabilistic databases and liftability results, and obtain complexity classes that range from polynomial to exponential levels.
AISep 12, 2016
First-Order Bayesian Network Specifications Capture the Complexity Class PPFabio Gagliardi Cozman
The point of this note is to prove that a language is in the complexity class PP if and only if the strings of the language encode valid inferences in a Bayesian network defined using function-free first-order logic with equality.
AIFeb 13, 2013
Quasi-Bayesian Strategies for Efficient Plan Generation: Application to the Planning to Observe ProblemFabio Gagliardi Cozman, Eric Krotkov
Quasi-Bayesian theory uses convex sets of probability distributions and expected loss to represent preferences about plans. The theory focuses on decision robustness, i.e., the extent to which plans are affected by deviations in subjective assessments of probability. The present work presents solutions for plan generation when robustness of probability assessments must be included: plans contain information about the robustness of certain actions. The surprising result is that some problems can be solved faster in the Quasi-Bayesian framework than within usual Bayesian theory. We investigate this on the planning to observe problem, i.e., an agent must decide whether to take new observations or not. The fundamental question is: How, and how much, to search for a "best" plan, based on the robustness of probability assessments? Plan generation algorithms are derived in the context of material classification with an acoustic robotic probe. A package that constructs Quasi-Bayesian plans is available through anonymous ftp.
AIFeb 6, 2013
Robustness Analysis of Bayesian Networks with Local Convex Sets of DistributionsFabio Gagliardi Cozman
Robust Bayesian inference is the calculation of posterior probability bounds given perturbations in a probabilistic model. This paper focuses on perturbations that can be expressed locally in Bayesian networks through convex sets of distributions. Two approaches for combination of local models are considered. The first approach takes the largest set of joint distributions that is compatible with the local sets of distributions; we show how to reduce this type of robust inference to a linear programming problem. The second approach takes the convex hull of joint distributions generated from the local sets of distributions; we demonstrate how to apply interior-point optimization methods to generate posterior bounds and how to generate approximations that are guaranteed to converge to correct posterior bounds. We also discuss calculation of bounds for expected utilities and variances, and global perturbation models.
AIJan 30, 2013
Irrelevance and Independence Relations in Quasi-Bayesian NetworksFabio Gagliardi Cozman
This paper analyzes irrelevance and independence relations in graphical models associated with convex sets of probability distributions (called Quasi-Bayesian networks). The basic question in Quasi-Bayesian networks is, How can irrelevance/independence relations in Quasi-Bayesian networks be detected, enforced and exploited? This paper addresses these questions through Walley's definitions of irrelevance and independence. Novel algorithms and results are presented for inferences with the so-called natural extensions using fractional linear programming, and the properties of the so-called type-1 extensions are clarified through a new generalization of d-separation.
AIJan 16, 2013
Separation Properties of Sets of Probability MeasuresFabio Gagliardi Cozman
This paper analyzes independence concepts for sets of probability measures associated with directed acyclic graphs. The paper shows that epistemic independence and the standard Markov condition violate desirable separation properties. The adoption of a contraction condition leads to d-separation but still fails to guarantee a belief separation property. To overcome this unsatisfactory situation, a strong Markov condition is proposed, based on epistemic independence. The main result is that the strong Markov condition leads to strong independence and does enforce separation properties; this result implies that (1) separation properties of Bayesian networks do extend to epistemic independence and sets of probability measures, and (2) strong independence has a clear justification based on epistemic independence and the strong Markov condition.
AIOct 19, 2012
Inference in Polytrees with Sets of ProbabilitiesJose Carlos Ferreira da Rocha, Fabio Gagliardi Cozman, Cassio Polpo de Campos
Inferences in directed acyclic graphs associated with probability sets and probability intervals are NP-hard, even for polytrees. In this paper we focus on such inferences, and propose: 1) a substantial improvement on Tessems A / R algorithm FOR polytrees WITH probability intervals; 2) a new algorithm FOR direction - based local search(IN sets OF probability) that improves ON existing methods; 3) a collection OF branch - AND - bound algorithms that combine the previous techniques.The first two techniques lead TO approximate solutions, WHILE branch - AND - bound procedures can produce either exact OR approximate solutions.We report ON dramatic improvements ON existing techniques FOR inference WITH probability sets AND intervals, IN SOME cases reducing the computational effort BY many orders OF magnitude.
AIJul 11, 2012
Propositional and Relational Bayesian Networks Associated with Imprecise and Qualitative Probabilistic AssesmentsFabio Gagliardi Cozman, Cassio Polpo de Campos, Jaime Ide et al.
This paper investigates a representation language with flexibility inspired by probabilistic logic and compactness inspired by relational Bayesian networks. The goal is to handle propositional and first-order constructs together with precise, imprecise, indeterminate and qualitative probabilistic assessments. The paper shows how this can be achieved through the theory of credal networks. New exact and approximate inference algorithms based on multilinear programming and iterated/loopy propagation of interval probabilities are presented; their superior performance, compared to existing ones, is shown empirically.
AIJul 4, 2012
Belief Updating and Learning in Semi-Qualitative Probabilistic NetworksCassio Polpo de Campos, Fabio Gagliardi Cozman
This paper explores semi-qualitative probabilistic networks (SQPNs) that combine numeric and qualitative information. We first show that exact inferences with SQPNs are NPPP-Complete. We then show that existing qualitative relations in SQPNs (plus probabilistic logic and imprecise assessments) can be dealt effectively through multilinear programming. We then discuss learning: we consider a maximum likelihood method that generates point estimates given a SQPN and empirical data, and we describe a Bayesian-minded method that employs the Imprecise Dirichlet Model to generate set-valued estimates.
AIMay 9, 2012
Complexity Analysis and Variational Inference for Interpretation-based Probabilistic Description LogicFabio Gagliardi Cozman, Rodrigo Bellizia Polastro
This paper presents complexity analysis and variational methods for inference in probabilistic description logics featuring Boolean operators, quantification, qualified number restrictions, nominals, inverse roles and role hierarchies. Inference is shown to be PEXP-complete, and variational methods are designed so as to exploit logical inference whenever possible.