Luciano Serafini

h-index50

45papers

1,338citations

Novelty44%

AI Score55

Ranked #23,467 of 201,326 authors (top 12%)#895 in AI (top 6%)

45 Papers

LGJun 3

FLAGG: Flexible Autoregressive Graph Generation

Samuel Cognolato, Alessandro Sperduti, Luciano Serafini

The Deep Graph Generation's panorama spans two extremes: one-shot and sequential models. The former generates nodes and edges jointly, while the latter samples them autoregressively. Each method performs better in different graph domains depending on size and topology, but neither is applicable to all graph categories. For instance, one-shot methods struggle with generating large graphs, while sequential methods underperform on smaller graphs. A possible way to overcome these limitations is to flexibly combine the two methods in a unique system. In this work, we propose the FLAGG (Flexible Autoregressive Graph Generation) framework, which sequentially generates portions of graphs with one-shot models. FLAGG can apply any one-shot model to make it autoregressive, allowing flexibility in choosing the sequential policy. This policy is specified through a stochastic node removal process, which an Insertion Model learns to reverse. We evaluate FLAGG with the DiGress one-shot model on several data sets of different graph sizes and domains. We show that the approach outperforms both one-shot and autoregressive baselines in terms of sampling quality.

AIJun 10, 2022

Refining neural network predictions using background knowledge

Alessandro Daniele, Emile van Krieken, Luciano Serafini et al.

Recent work has shown logical background knowledge can be used in learning systems to compensate for a lack of labeled training data. Many methods work by creating a loss function that encodes this knowledge. However, often the logic is discarded after training, even if it is still useful at test time. Instead, we ensure neural network predictions satisfy the knowledge by refining the predictions with an extra computation step. We introduce differentiable refinement functions that find a corrected prediction close to the original prediction. We study how to effectively and efficiently compute these refinement functions. Using a new algorithm called Iterative Local Refinement (ILR), we combine refinement functions to find refined predictions for logical formulas of any complexity. ILR finds refinements on complex SAT formulas in significantly fewer iterations and frequently finds solutions where gradient descent can not. Finally, ILR produces competitive results in the MNIST addition task.

CVMar 4, 2022

Online Learning of Reusable Abstract Models for Object Goal Navigation

Tommaso Campari, Leonardo Lamanna, Paolo Traverso et al.

In this paper, we present a novel approach to incrementally learn an Abstract Model of an unknown environment, and show how an agent can reuse the learned model for tackling the Object Goal Navigation task. The Abstract Model is a finite state machine in which each state is an abstraction of a state of the environment, as perceived by the agent in a certain position and orientation. The perceptions are high-dimensional sensory data (e.g., RGB-D images), and the abstraction is reached by exploiting image segmentation and the Taskonomy model bank. The learning of the Abstract Model is accomplished by executing actions, observing the reached state, and updating the Abstract Model with the acquired information. The learned models are memorized by the agent, and they are reused whenever it recognizes to be in an environment that corresponds to the stored model. We investigate the effectiveness of the proposed approach for the Object Goal Navigation task, relying on public benchmarks. Our results show that the reuse of learned Abstract Models can boost performance on Object Goal Navigation.

LGAug 24, 2022

Deep Symbolic Learning: Discovering Symbols and Rules from Perceptions

Alessandro Daniele, Tommaso Campari, Sagar Malhotra et al.

Neuro-Symbolic (NeSy) integration combines symbolic reasoning with Neural Networks (NNs) for tasks requiring perception and reasoning. Most NeSy systems rely on continuous relaxation of logical knowledge, and no discrete decisions are made within the model pipeline. Furthermore, these methods assume that the symbolic rules are given. In this paper, we propose Deep Symbolic Learning (DSL), a NeSy system that learns NeSy-functions, i.e., the composition of a (set of) perception functions which map continuous data to discrete symbols, and a symbolic function over the set of symbols. DSL learns simultaneously the perception and symbolic functions while being trained only on their composition (NeSy-function). The key novelty of DSL is that it can create internal (interpretable) symbolic representations and map them to perception inputs within a differentiable NN learning pipeline. The created symbols are automatically selected to generate symbolic functions that best explain the data. We provide experimental analysis to substantiate the efficacy of DSL in simultaneously learning perception and symbolic functions.

CVDec 1, 2022

Exploiting Proximity-Aware Tasks for Embodied Social Navigation

Enrico Cancelli, Tommaso Campari, Luciano Serafini et al.

Learning how to navigate among humans in an occluded and spatially constrained indoor environment, is a key ability required to embodied agent to be integrated into our society. In this paper, we propose an end-to-end architecture that exploits Proximity-Aware Tasks (referred as to Risk and Proximity Compass) to inject into a reinforcement learning navigation policy the ability to infer common-sense social behaviors. To this end, our tasks exploit the notion of immediate and future dangers of collision. Furthermore, we propose an evaluation protocol specifically designed for the Social Navigation Task in simulated environments. This is done to capture fine-grained features and characteristics of the policy by analyzing the minimal unit of human-robot spatial interaction, called Encounter. We validate our approach on Gibson4+ and Habitat-Matterport3D datasets.

AIJan 15, 2023

Planning for Learning Object Properties

Leonardo Lamanna, Luciano Serafini, Mohamadreza Faridghasemnia et al.

Autonomous agents embedded in a physical environment need the ability to recognize objects and their properties from sensory data. Such a perceptual ability is often implemented by supervised machine learning models, which are pre-trained using a set of labelled data. In real-world, open-ended deployments, however, it is unrealistic to assume to have a pre-trained model for all possible environments. Therefore, agents need to dynamically learn/adapt/extend their perceptual abilities online, in an autonomous way, by exploring and interacting with the environment where they operate. This paper describes a way to do so, by exploiting symbolic planning. Specifically, we formalize the problem of automatically training a neural network to recognize object properties as a symbolic planning problem (using PDDL). We use planning techniques to produce a strategy for automating the training dataset creation and the learning process. Finally, we provide an experimental evaluation in both a simulated and a real environment, which shows that the proposed approach is able to successfully learn how to recognize new object properties.

LGMay 31, 2022

Knowledge Enhanced Neural Networks for relational domains

Alessandro Daniele, Luciano Serafini

In the recent past, there has been a growing interest in Neural-Symbolic Integration frameworks, i.e., hybrid systems that integrate connectionist and symbolic approaches to obtain the best of both worlds. In this work we focus on a specific method, KENN (Knowledge Enhanced Neural Networks), a Neural-Symbolic architecture that injects prior logical knowledge into a neural network by adding on its top a residual layer that modifies the initial predictions accordingly to the knowledge. Among the advantages of this strategy, there is the inclusion of clause weights, learnable parameters that represent the strength of the clauses, meaning that the model can learn the impact of each rule on the final predictions. As a special case, if the training data contradicts a constraint, KENN learns to ignore it, making the system robust to the presence of wrong knowledge. In this paper, we propose an extension of KENN for relational data. One of the main advantages of KENN resides in its scalability, thanks to a flexible treatment of dependencies between the rules obtained by stacking multiple logical layers. We show experimentally the efficacy of this strategy. The results show that KENN is capable of increasing the performances of the underlying neural network, obtaining better or comparable accuracies in respect to other two related methods that combine learning with logic, requiring significantly less time for learning.

AIApr 8, 2022

On Projectivity in Markov Logic Networks

Sagar Malhotra, Luciano Serafini

Markov Logic Networks (MLNs) define a probability distribution on relational structures over varying domain sizes. Many works have noticed that MLNs, like many other relational models, do not admit consistent marginal inference over varying domain sizes. Furthermore, MLNs learnt on a certain domain do not generalize to new domains of varied sizes. In recent works, connections have emerged between domain size dependence, lifted inference and learning from sub-sampled domains. The central idea to these works is the notion of projectivity. The probability distributions ascribed by projective models render the marginal probabilities of sub-structures independent of the domain cardinality. Hence, projective models admit efficient marginal inference, removing any dependence on the domain size. Furthermore, projective models potentially allow efficient and consistent parameter learning from sub-sampled domains. In this paper, we characterize the necessary and sufficient conditions for a two-variable MLN to be projective. We then isolate a special model in this class of MLNs, namely Relational Block Model (RBM). We show that, in terms of data likelihood maximization, RBM is the best possible projective MLN in the two-variable fragment. Finally, we show that RBMs also admit consistent parameter learning over sub-sampled domains.

AIAug 22, 2023

Lifted Inference beyond First-Order Logic

Sagar Malhotra, Davide Bizzaro, Luciano Serafini

Weighted First Order Model Counting (WFOMC) is fundamental to probabilistic inference in statistical relational learning models. As WFOMC is known to be intractable in general ($\#$P-complete), logical fragments that admit polynomial time WFOMC are of significant interest. Such fragments are called domain liftable. Recent works have shown that the two-variable fragment of first order logic extended with counting quantifiers ($\mathrm{C^2}$) is domain-liftable. However, many properties of real-world data, like acyclicity in citation networks and connectivity in social networks, cannot be modeled in $\mathrm{C^2}$, or first order logic in general. In this work, we expand the domain liftability of $\mathrm{C^2}$ with multiple such properties. We show that any $\mathrm{C^2}$ sentence remains domain liftable when one of its relations is restricted to represent a directed acyclic graph, a connected graph, a tree (resp. a directed tree) or a forest (resp. a directed forest). All our results rely on a novel and general methodology of "counting by splitting". Besides their application to probabilistic inference, our results provide a general framework for counting combinatorial structures. We expand a vast array of previous results in discrete mathematics literature on directed acyclic graphs, phylogenetic networks, etc.

AIFeb 20, 2023

Weighted First Order Model Counting with Directed Acyclic Graph Axioms

Sagar Malhotra, Luciano Serafini

Statistical Relational Learning (SRL) integrates First-Order Logic (FOL) and probability theory for learning and inference over relational data. Probabilistic inference and learning in many SRL models can be reduced to Weighted First Order Model Counting (WFOMC). However, WFOMC is known to be intractable ($\mathrm{\#P_1-}$ complete). Hence, logical fragments that admit polynomial time WFOMC are of significant interest. Such fragments are called domain liftable. Recent line of works have shown the two-variable fragment of FOL, extended with counting quantifiers ($\mathrm{C^2}$) to be domain-liftable. However, many properties of real-world data can not be modelled in $\mathrm{C^2}$. In fact many ubiquitous properties of real-world data are inexressible in FOL. Acyclicity is one such property, found in citation networks, genealogy data, temporal data e.t.c. In this paper we aim to address this problem by investigating the domain liftability of directed acyclicity constraints. We show that the fragment $\mathrm{C^2}$ with a Directed Acyclic Graph (DAG) axiom, i.e., a predicate in the language is axiomatized to represent a DAG, is domain-liftable. We present a method based on principle of inclusion-exclusion for WFOMC of $\mathrm{C^2}$ formulas extended with DAG axioms.

AIJun 26, 2023

logLTN: Differentiable Fuzzy Logic in the Logarithm Space

Samy Badreddine, Luciano Serafini, Michael Spranger

The AI community is increasingly focused on merging logic with deep learning to create Neuro-Symbolic (NeSy) paradigms and assist neural approaches with symbolic knowledge. A significant trend in the literature involves integrating axioms and facts in loss functions by grounding logical symbols with neural networks and operators with fuzzy semantics. Logic Tensor Networks (LTN) is one of the main representatives in this category, known for its simplicity, efficiency, and versatility. However, it has been previously shown that not all fuzzy operators perform equally when applied in a differentiable setting. Researchers have proposed several configurations of operators, trading off between effectiveness, numerical stability, and generalization to different formulas. This paper presents a configuration of fuzzy operators for grounding formulas end-to-end in the logarithm space. Our goal is to develop a configuration that is more effective than previous proposals, able to handle any formula, and numerically stable. To achieve this, we propose semantics that are best suited for the logarithm space and introduce novel simplifications and improvements that are crucial for optimization via gradient-descent. We use LTN as the framework for our experiments, but the conclusions of our work apply to any similar NeSy framework. Our findings, both formal and empirical, show that the proposed configuration outperforms the state-of-the-art and that each of our modifications is essential in achieving these results.

LGAug 23, 2024

IFH: a Diffusion Framework for Flexible Design of Graph Generative Models

Samuel Cognolato, Alessandro Sperduti, Luciano Serafini

Graph generative models can be classified into two prominent families: one-shot models, which generate a graph in one go, and sequential models, which generate a graph by successive additions of nodes and edges. Ideally, between these two extreme models lies a continuous range of models that adopt different levels of sequentiality. This paper proposes a graph generative model, called Insert-Fill-Halt (IFH), that supports the specification of a sequentiality degree. IFH is based upon the theory of Denoising Diffusion Probabilistic Models (DDPM), designing a node removal process that gradually destroys a graph. An insertion process learns to reverse this removal process by inserting arcs and nodes according to the specified sequentiality degree. We evaluate the performance of IFH in terms of quality, run time, and memory, depending on different sequentiality degrees. We also show that using DiGress, a diffusion-based one-shot model, as a generative step in IFH leads to improvement to the model itself, and is competitive with the current state-of-the-art.

AIMar 31, 2023

Interval Logic Tensor Networks

Samy Badreddine, Gianluca Apriceno, Andrea Passerini et al.

In this paper, we introduce Interval Real Logic (IRL), a two-sorted logic that interprets knowledge such as sequential properties (traces) and event properties using sequences of real-featured data. We interpret connectives using fuzzy logic, event durations using trapezoidal fuzzy intervals, and fuzzy temporal relations using relationships between the intervals' areas. We propose Interval Logic Tensor Networks (ILTN), a neuro-symbolic system that learns by propagating gradients through IRL. In order to support effective learning, ILTN defines smoothened versions of the fuzzy intervals and temporal relations of IRL using softplus activations. We show that ILTN can successfully leverage knowledge expressed in IRL in synthetic tasks that require reasoning about events to predict their fuzzy durations. Our results show that the system is capable of making events compliant with background temporal knowledge.

AISep 24, 2024

LTNtorch: PyTorch Implementation of Logic Tensor Networks

Tommaso Carraro, Luciano Serafini, Fabio Aiolli

Logic Tensor Networks (LTN) is a Neuro-Symbolic framework that effectively incorporates deep learning and logical reasoning. In particular, LTN allows defining a logical knowledge base and using it as the objective of a neural model. This makes learning by logical reasoning possible as the parameters of the model are optimized by minimizing a loss function composed of a set of logical formulas expressing facts about the learning task. The framework learns via gradient-descent optimization. Fuzzy logic, a relaxation of classical logic permitting continuous truth values in the interval [0,1], makes this learning possible. Specifically, the training of an LTN consists of three steps. Firstly, (1) the training data is used to ground the formulas. Then, (2) the formulas are evaluated, and the loss function is computed. Lastly, (3) the gradients are back-propagated through the logical computational graph, and the weights of the neural model are changed so the knowledge base is maximally satisfied. LTNtorch is the fully documented and tested PyTorch implementation of Logic Tensor Networks. This paper presents the formalization of LTN and how LTNtorch implements it. Moreover, it provides a basic binary classification example.

AIMay 9

Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?

Filippo Ziliotto, Ciro Beneduce, Bruno Lepri et al.

In the animal kingdom, mirror self-recognition is a canonical probe of higher-order cognition, emerging only in some species. We ask whether an analogous functional capability emerges in embodied vision-language model (VLM) agents: can they recognize themselves in a mirror? We introduce a controlled 3D benchmark where a first-person VLM agent must infer a hidden body attribute from its reflection and select the matching target, while avoiding self-other misattribution. To separate mirror-grounded self-identification from shortcuts, we test mirror removal, misleading cues, and occluded reflections. We also evaluate the decision process through mirror seeking, temporal ordering, self-attribution, and reasoning-action consistency. Our experiments show that mirror-based self-identification emerges mainly in stronger VLMs. These models can use reflected evidence for action, whereas weaker models often inspect the mirror but fail to extract self-relevant information or misattribute their reflection. Language-vision conflict further shows that self-referential language alone is not evidence of grounded self-identification. Overall, mirror-based evaluation provides a diagnostic for whether embodied self-grounding is causally rooted in perception and action rather than priors, prompt compliance, or confabulation.

AIDec 5, 2024

TANGO: Training-free Embodied AI Agents for Open-world Tasks

Filippo Ziliotto, Tommaso Campari, Luciano Serafini et al.

Large Language Models (LLMs) have demonstrated excellent capabilities in composing various modules together to create programs that can perform complex reasoning tasks on images. In this paper, we propose TANGO, an approach that extends the program composition via LLMs already observed for images, aiming to integrate those capabilities into embodied agents capable of observing and acting in the world. Specifically, by employing a simple PointGoal Navigation model combined with a memory-based exploration policy as a foundational primitive for guiding an agent through the world, we show how a single model can address diverse tasks without additional training. We task an LLM with composing the provided primitives to solve a specific task, using only a few in-context examples in the prompt. We evaluate our approach on three key Embodied AI tasks: Open-Set ObjectGoal Navigation, Multi-Modal Lifelong Navigation, and Open Embodied Question Answering, achieving state-of-the-art results without any specific fine-tuning in challenging zero-shot scenarios.

LGMar 2

Discrete World Models via Regularization

Davide Bizzaro, Luciano Serafini

World models aim to capture the states and dynamics of an environment in a compact latent space. Moreover, using Boolean state representations is particularly useful for search heuristics and symbolic reasoning and planning. Existing approaches keep latents informative via decoder-based reconstruction, or instead via contrastive or reward signals. In this work, we introduce Discrete World Models via Regularization (DWMR): a reconstruction-free and contrastive-free method for unsupervised Boolean world-model learning. In particular, we introduce a novel world-modeling loss that couples latent prediction with specialized regularizers. Such regularizers maximize the entropy and independence of the representation bits through variance, correlation, and coskewness penalties, while simultaneously enforcing a locality prior for sparse action changes. To enable effective optimization, we also introduce a novel training scheme improving robustness to discrete roll-outs. Experiments on two benchmarks with underlying combinatorial structure show that DWMR learns more accurate representations and transitions than reconstruction-based alternatives. Finally, DWMR can also be paired with an auxiliary reconstruction decoder, and this combination yields additional gains.

LGFeb 21, 2024

Simple and Effective Transfer Learning for Neuro-Symbolic Integration

Alessandro Daniele, Tommaso Campari, Sagar Malhotra et al.

Deep Learning (DL) techniques have achieved remarkable successes in recent years. However, their ability to generalize and execute reasoning tasks remains a challenge. A potential solution to this issue is Neuro-Symbolic Integration (NeSy), where neural approaches are combined with symbolic reasoning. Most of these methods exploit a neural network to map perceptions to symbols and a logical reasoner to predict the output of the downstream task. These methods exhibit superior generalization capacity compared to fully neural architectures. However, they suffer from several issues, including slow convergence, learning difficulties with complex perception tasks, and convergence to local minima. This paper proposes a simple yet effective method to ameliorate these problems. The key idea involves pretraining a neural model on the downstream task. Then, a NeSy model is trained on the same task via transfer learning, where the weights of the perceptual part are injected from the pretrained network. The key observation of our work is that the neural network fails to generalize only at the level of the symbolic part while being perfectly capable of learning the mapping from perceptions to symbols. We have tested our training strategy on various SOTA NeSy methods and datasets, demonstrating consistent improvements in the aforementioned problems.

AINov 23, 2024

Aligning Generalisation Between Humans and Machines

Filip Ilievski, Barbara Hammer, Frank van Harmelen et al.

Recent advances in AI -- including generative approaches -- have resulted in technology that can support humans in scientific discovery and forming decisions, but may also disrupt democracies and target individuals. The responsible use of AI and its participation in human-AI teams increasingly shows the need for AI alignment, that is, to make AI systems act according to our preferences. A crucial yet often overlooked aspect of these interactions is the different ways in which humans and machines generalise. In cognitive science, human generalisation commonly involves abstraction and concept learning. In contrast, AI generalisation encompasses out-of-domain generalisation in machine learning, rule-based reasoning in symbolic AI, and abstraction in neurosymbolic AI. In this perspective paper, we combine insights from AI and cognitive science to identify key commonalities and differences across three dimensions: notions of, methods for, and evaluation of generalisation. We map the different conceptualisations of generalisation in AI and cognitive science along these three dimensions and consider their role for alignment in human-AI teaming. This results in interdisciplinary challenges across AI and cognitive science that must be tackled to provide a foundation for effective and cognitively supported alignment in human-AI teaming scenarios.

CVSep 24, 2025

PersONAL: Towards a Comprehensive Benchmark for Personalized Embodied Agents

Filippo Ziliotto, Jelin Raphael Akkara, Alessandro Daniele et al.

Recent advances in Embodied AI have enabled agents to perform increasingly complex tasks and adapt to diverse environments. However, deploying such agents in realistic human-centered scenarios, such as domestic households, remains challenging, particularly due to the difficulty of modeling individual human preferences and behaviors. In this work, we introduce PersONAL (PERSonalized Object Navigation And Localization, a comprehensive benchmark designed to study personalization in Embodied AI. Agents must identify, retrieve, and navigate to objects associated with specific users, responding to natural-language queries such as "find Lily's backpack". PersONAL comprises over 2,000 high-quality episodes across 30+ photorealistic homes from the HM3D dataset. Each episode includes a natural-language scene description with explicit associations between objects and their owners, requiring agents to reason over user-specific semantics. The benchmark supports two evaluation modes: (1) active navigation in unseen environments, and (2) object grounding in previously mapped scenes. Experiments with state-of-the-art baselines reveal a substantial gap to human performance, highlighting the need for embodied agents capable of perceiving, reasoning, and memorizing over personalized information; paving the way towards real-world assistive robot.

AIJun 27, 2025

Breaking Rank Bottlenecks in Knowledge Graph Embeddings

Samy Badreddine, Emile van Krieken, Luciano Serafini

Many knowledge graph embedding (KGE) models for link prediction use powerful encoders. However, they often rely on a simple hidden vector-matrix multiplication to score subject-relation queries against candidate object entities. When the number of entities is larger than the model's embedding dimension, which is often the case in practice by several orders of magnitude, we have a linear output layer with a rank bottleneck. Such bottlenecked layers limit model expressivity. We investigate both theoretically and empirically how rank bottlenecks affect KGEs. We find that, by limiting the set of feasible predictions, rank bottlenecks hurt the ranking accuracy and distribution fidelity of scores. Inspired by the language modelling literature, we propose KGE-MoS, a mixture-based output layer to break rank bottlenecks in many KGEs. Our experiments show that KGE-MoS improves ranking performance of KGE models on large-scale datasets at a low parameter cost.

CVMay 18, 2023

Weakly-Supervised Visual-Textual Grounding with Semantic Prior Refinement

Davide Rigoni, Luca Parolari, Luciano Serafini et al.

Using only image-sentence pairs, weakly-supervised visual-textual grounding aims to learn region-phrase correspondences of the respective entity mentions. Compared to the supervised approach, learning is more difficult since bounding boxes and textual phrases correspondences are unavailable. In light of this, we propose the Semantic Prior Refinement Model (SPRM), whose predictions are obtained by combining the output of two main modules. The first untrained module aims to return a rough alignment between textual phrases and bounding boxes. The second trained module is composed of two sub-components that refine the rough alignment to improve the accuracy of the final phrase-bounding box alignments. The model is trained to maximize the multimodal similarity between an image and a sentence, while minimizing the multimodal similarity of the same sentence and a new unrelated image, carefully selected to help the most during training. Our approach shows state-of-the-art results on two popular datasets, Flickr30k Entities and ReferIt, shining especially on ReferIt with a 9.6% absolute improvement. Moreover, thanks to the untrained component, it reaches competitive performances just using a small fraction of training examples.

AIDec 29, 2021

On some Foundational Aspects of Human-Centered Artificial Intelligence

Luciano Serafini, Raul Barbosa, Jasmin Grosinger et al.

The burgeoning of AI has prompted recommendations that AI techniques should be "human-centered". However, there is no clear definition of what is meant by Human Centered Artificial Intelligence, or for short, HCAI. This paper aims to improve this situation by addressing some foundational aspects of HCAI. To do so, we introduce the term HCAI agent to refer to any physical or software computational agent equipped with AI components and that interacts and/or collaborates with humans. This article identifies five main conceptual components that participate in an HCAI agent: Observations, Requirements, Actions, Explanations and Models. We see the notion of HCAI agent, together with its components and functions, as a way to bridge the technical and non-technical discussions on human-centered AI. In this paper, we focus our analysis on scenarios consisting of a single agent operating in dynamic environments in presence of humans.

AIDec 18, 2021

Online Grounding of Symbolic Planning Domains in Unknown Environments

Leonardo Lamanna, Luciano Serafini, Alessandro Saetti et al.

If a robotic agent wants to exploit symbolic planning techniques to achieve some goal, it must be able to properly ground an abstract planning domain in the environment in which it operates. However, if the environment is initially unknown by the agent, the agent needs to explore it and discover the salient aspects of the environment needed to reach its goals. Namely, the agent has to discover: (i) the objects present in the environment, (ii) the properties of these objects and their relations, and finally (iii) how abstract actions can be successfully executed. The paper proposes a framework that aims to accomplish the aforementioned perspective for an agent that perceives the environment partially and subjectively, through real value sensors (e.g., GPS, and on-board camera) and can operate in the environment through low level actuators (e.g., move forward of 20 cm). We evaluate the proposed architecture in photo-realistic simulated environments, where the sensors are RGB-D on-board camera, GPS and compass, and low level actions include movements, grasping/releasing objects, and manipulating objects. The agent is placed in an unknown environment and asked to find objects of a certain type, place an object on top of another, close or open an object of a certain type. We compare our approach with the state of the art methods on object goal navigation based on reinforcement learning, showing better performances.

LOOct 12, 2021

Weighted Model Counting in FO2 with Cardinality Constraints and Counting Quantifiers: A Closed Form Formula

Sagar Malhotra, Luciano Serafini

Weighted First-Order Model Counting (WFOMC) computes the weighted sum of the models of a first-order logic theory on a given finite domain. First-Order Logic theories that admit polynomial-time WFOMC w.r.t domain cardinality are called domain liftable. We introduce the concept of lifted interpretations as a tool for formulating closed-forms for WFOMC. Using lifted interpretations, we reconstruct the closed-form formula for polynomial-time FOMC in the universally quantified fragment of FO2, earlier proposed by Beame et al. We then expand this closed-form to incorporate cardinality constraints, existential quantifiers, and counting quantifiers (a.k.a C2) without losing domain-liftability. Finally, we show that the obtained closed-form motivates a natural definition of a family of weight functions strictly larger than symmetric weight functions.

AISep 22, 2021

A formalisation of BPMN in Description Logics

Chiara Ghidini, Marco Rospocher, Luciano Serafini

In this paper we present a textual description, in terms of Description Logics, of the BPMN Ontology, which provides a clear semantic formalisation of the structural components of the Business Process Modelling Notation (BPMN), based on the latest stable BPMN specifications from OMG [BPMN Version 1.1 -- January 2008]. The development of the ontology was guided by the description of the complete set of BPMN Element Attributes and Types contained in Annex B of the BPMN specifications.

CVAug 11, 2021

A Better Loss for Visual-Textual Grounding

Davide Rigoni, Luciano Serafini, Alessandro Sperduti

Given a textual phrase and an image, the visual grounding problem is the task of locating the content of the image referenced by the sentence. It is a challenging task that has several real-world applications in human-computer interaction, image-text reference resolution, and video-text reference resolution. In the last years, several works have addressed this problem by proposing more and more large and complex models that try to capture visual-textual dependencies better than before. These models are typically constituted by two main components that focus on how to learn useful multi-modal features for grounding and how to improve the predicted bounding box of the visual mention, respectively. Finding the right learning balance between these two sub-tasks is not easy, and the current models are not necessarily optimal with respect to this issue. In this work, we propose a loss function based on bounding boxes classes probabilities that: (i) improves the bounding boxes selection; (ii) improves the bounding boxes coordinates prediction. Our model, although using a simple multi-modal feature fusion component, is able to achieve a higher accuracy than state-of-the-art models on two widely adopted datasets, reaching a better learning balance between the two sub-tasks mentioned above.

AIJun 28, 2021

Reasoning on $\textit{DL-Lite}_{\cal R}$ with Defeasibility in ASP

Loris Bozzato, Thomas Eiter, Luciano Serafini

Reasoning on defeasible knowledge is a topic of interest in the area of description logics, as it is related to the need of representing exceptional instances in knowledge bases. In this direction, in our previous works we presented a framework for representing (contextualized) OWL RL knowledge bases with a notion of justified exceptions on defeasible axioms: reasoning in such framework is realized by a translation into ASP programs. The resulting reasoning process for OWL RL, however, introduces a complex encoding in order to capture reasoning on the negative information needed for reasoning on exceptions. In this paper, we apply the justified exception approach to knowledge bases in $\textit{DL-Lite}_{\cal R}$, i.e., the language underlying OWL QL. We provide a definition for $\textit{DL-Lite}_{\cal R}$ knowledge bases with defeasible axioms and study their semantic and computational properties. In particular, we study the effects of exceptions over unnamed individuals. The limited form of $\textit{DL-Lite}_{\cal R}$ axioms allows us to formulate a simpler ASP encoding, where reasoning on negative information is managed by direct rules. The resulting materialization method gives rise to a complete reasoning procedure for instance checking in $\textit{DL-Lite}_{\cal R}$ with defeasible axioms. Under consideration in Theory and Practice of Logic Programming (TPLP).

AIDec 25, 2020

Logic Tensor Networks

Samy Badreddine, Artur d'Avila Garcez, Luciano Serafini et al.

Artificial Intelligence agents are required to learn from their surroundings and to reason about the knowledge that has been learned in order to make decisions. While state-of-the-art learning from data typically uses sub-symbolic distributed representations, reasoning is normally useful at a higher level of abstraction with the use of a first-order logic language for knowledge representation. As a result, attempts at combining symbolic AI and neural computation into neural-symbolic systems have been on the increase. In this paper, we present Logic Tensor Networks (LTN), a neurosymbolic formalism and computational model that supports learning and reasoning through the introduction of a many-valued, end-to-end differentiable first-order logic called Real Logic as a representation language for deep learning. We show that LTN provides a uniform language for the specification and the computation of several AI tasks such as data clustering, multi-label classification, relational learning, query answering, semi-supervised learning, regression and embedding learning. We implement and illustrate each of the above tasks with a number of simple explanatory examples using TensorFlow 2. Keywords: Neurosymbolic AI, Deep Learning and Reasoning, Many-valued Logic.

AISep 25, 2020

Weighted Model Counting in the two variable fragment with Cardinality Constraints: A Closed Form Formula

Sagar Malhotra, Luciano Serafini

Weighted First-Order Model Counting (WFOMC) computes the weighted sum of the models of a first-order theory on a given finite domain. WFOMC has emerged as a fundamental tool for probabilistic inference. Algorithms for WFOMC that run in polynomial time w.r.t. the domain size are called lifted inference algorithms. Such algorithms have been developed for multiple extensions of FO2(the fragment of first-order logic with two variables) for the special case of symmetric weight functions. We introduce the concept of lifted interpretations as a tool for formulating polynomials for WFOMC. Using lifted interpretations, we reconstruct the closed-form formula for polynomial-time FOMC in the universal fragment of FO2, earlier proposed by Beame et al. We then expand this closed-form to incorporate existential quantifiers and cardinality constraints without losing domain-liftability. Finally, we show that the obtained closed-form motivates a natural definition of a family of weight functions strictly larger than symmetric weight functions.

LGSep 13, 2020

Neural Networks Enhancement with Logical Knowledge

Alessandro Daniele, Luciano Serafini

In the recent past, there has been a growing interest in Neural-Symbolic Integration frameworks, i.e., hybrid systems that integrate connectionist and symbolic approaches to obtain the best of both worlds. In a previous work, we proposed KENN (Knowledge Enhanced Neural Networks), a Neural-Symbolic architecture that injects prior logical knowledge into a neural network by adding a new final layer which modifies the initial predictions accordingly to the knowledge. Among the advantages of this strategy, there is the inclusion of clause weights, learnable parameters that represent the strength of the clauses, meaning that the model can learn the impact of each clause on the final predictions. As a special case, if the training data contradicts a constraint, KENN learns to ignore it, making the system robust to the presence of wrong knowledge. In this paper, we propose an extension of KENN for relational data. To evaluate this new extension, we tested it with different learning configurations on Citeseer, a standard dataset for Collective Classification. The results show that KENN is capable of increasing the performances of the underlying neural network even in the presence relational data, outperforming other two notable methods that combine learning with logic.

CVAug 21, 2020

Exploiting Scene-specific Features for Object Goal Navigation

Tommaso Campari, Paolo Eccher, Luciano Serafini et al.

Can the intrinsic relation between an object and the room in which it is usually located help agents in the Visual Navigation Task? We study this question in the context of Object Navigation, a problem in which an agent has to reach an object of a specific class while moving in a complex domestic environment. In this paper, we introduce a new reduced dataset that speeds up the training of navigation models, a notoriously complex task. Our proposed dataset permits the training of models that do not exploit online-built maps in reasonable times even without the use of huge computational resources. Therefore, this reduced dataset guarantees a significant benchmark and it can be used to identify promising models that could be then tried on bigger and more challenging datasets. Subsequently, we propose the SMTSC model, an attention-based model capable of exploiting the correlation between scenes and objects contained in them, highlighting quantitatively how the idea is correct.

LGOct 1, 2019

Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation

Ivan Donadello, Luciano Serafini

Semantic Image Interpretation is the task of extracting a structured semantic description from images. This requires the detection of visual relationships: triples (subject,relation,object) describing a semantic relation between a subject and an object. A pure supervised approach to visual relationship detection requires a complete and balanced training set for all the possible combinations of (subject, relation, object). However, such training sets are not available and would require a prohibitive human effort. This implies the ability of predicting triples which do not appear in the training set. This problem is called zero-shot learning. State-of-the-art approaches to zero-shot learning exploit similarities among relationships in the training set or external linguistic knowledge. In this paper, we perform zero-shot learning by using Logic Tensor Networks, a novel Statistical Relational Learning framework that exploits both the similarities with other seen relationships and background knowledge, expressed with logical constraints between subjects, relations and objects. The experiments on the Visual Relationship Dataset show that the use of logical constraints outperforms the current methods. This implies that background knowledge can be used to alleviate the incompleteness of training sets.

LOMay 22, 2019

A Note on Reasoning on $\textit{DL-Lite}_{\cal R}$ with Defeasibility

Loris Bozzato, Thomas Eiter, Luciano Serafini

Representation of defeasible information is of interest in description logics, as it is related to the need of accommodating exceptional instances in knowledge bases. In this direction, in our previous works we presented a datalog translation for reasoning on (contextualized) OWL RL knowledge bases with a notion of justified exceptions on defeasible axioms. While it covers a relevant fragment of OWL, the resulting reasoning process needs a complex encoding in order to capture reasoning on negative information. In this paper, we consider the case of knowledge bases in $\textit{DL-Lite}_{\cal R}$, i.e. the language underlying OWL QL. We provide a definition for $\textit{DL-Lite}_{\cal R}$ knowledge bases with defeasible axioms and study their properties. The limited form of $\textit{DL-Lite}_{\cal R}$ axioms allows us to formulate a simpler encoding into datalog (under answer set semantics) with direct rules for reasoning on negative information. The resulting materialization method gives rise to a complete reasoning procedure for instance checking in $\textit{DL-Lite}_{\cal R}$ with defeasible axioms.

AIMay 15, 2019

Neural-Symbolic Computing: An Effective Methodology for Principled Integration of Machine Learning and Reasoning

Artur d'Avila Garcez, Marco Gori, Luis C. Lamb et al.

Current advances in Artificial Intelligence and machine learning in general, and deep learning in particular have reached unprecedented impact not only across research communities, but also over popular media channels. However, concerns about interpretability and accountability of AI have been raised by influential thinkers. In spite of the recent impact of AI, several works have identified the need for principled knowledge representation and reasoning mechanisms integrated with deep learning-based systems to provide sound and explainable models for such systems. Neural-symbolic computing aims at integrating, as foreseen by Valiant, two most fundamental cognitive abilities: the ability to learn from the environment, and the ability to reason from what has been learned. Neural-symbolic computing has been an active topic of research for many years, reconciling the advantages of robust learning in neural networks and reasoning and interpretability of symbolic representation. In this paper, we survey recent accomplishments of neural-symbolic computing as a principled methodology for integrated machine learning and reasoning. We illustrate the effectiveness of the approach by outlining the main characteristics of the methodology: principled integration of neural learning with symbolic knowledge representation and reasoning allowing for the construction of explainable AI systems. The insights provided by neural-symbolic computing shed new light on the increasingly prominent need for interpretable and accountable AI systems.

AIMar 14, 2019

Incremental Learning of Discrete Planning Domains from Continuous Perceptions

Luciano Serafini, Paolo Traverso

We propose a framework for learning discrete deterministic planning domains. In this framework, an agent learns the domain by observing the action effects through continuous features that describe the state of the environment after the execution of each action. Besides, the agent learns its perception function, i.e., a probabilistic mapping between state variables and sensor data represented as a vector of continuous random variables called perception variables. We define an algorithm that updates the planning domain and the perception function by (i) introducing new states, either by extending the possible values of state variables, or by weakening their constraints; (ii) adapts the perception function to fit the observed data (iii) adapts the transition function on the basis of the executed actions and the effects observed via the perception function. The framework is able to deal with exogenous events that happen in the environment.

AIOct 16, 2018

Incremental learning abstract discrete planning domains and mappings to continuous perceptions

Luciano Serafini, Paolo Traverso

Most of the works on planning and learning, e.g., planning by (model based) reinforcement learning, are based on two main assumptions: (i) the set of states of the planning domain is fixed; (ii) the mapping between the observations from the real word and the states is implicitly assumed or learned offline, and it is not part of the planning domain. Consequently, the focus is on learning the transitions between states. In this paper, we drop such assumptions. We provide a formal framework in which (i) the agent can learn dynamically new states of the planning domain; (ii) the mapping between abstract states and the perception from the real world, represented by continuous variables, is part of the planning domain; (iii) such mapping is learned and updated along the "life" of the agent. We define an algorithm that interleaves planning, acting, and learning, and allows the agent to update the planning domain depending on how much it trusts the model w.r.t. the new experiences learned by executing actions. We define a measure of coherence between the planning domain and the real world as perceived by the agent. We test our approach showing that the agent learns increasingly coherent models, and that the system can scale to deal with models with an order of $10^6$ states.

AIAug 6, 2018

Reasoning with Justifiable Exceptions in Contextual Hierarchies (Appendix)

Loris Bozzato, Luciano Serafini, Thomas Eiter

This paper is an appendix to the paper "Reasoning with Justifiable Exceptions in Contextual Hierarchies" by Bozzato, Serafini and Eiter, 2018. It provides further details on the language, the complexity results and the datalog translation introduced in the main paper.

AIMay 24, 2017

Logic Tensor Networks for Semantic Image Interpretation

Ivan Donadello, Luciano Serafini, Artur d'Avila Garcez

Semantic Image Interpretation (SII) is the task of extracting structured semantic descriptions from images. It is widely agreed that the combined use of visual data and background knowledge is of great importance for SII. Recently, Statistical Relational Learning (SRL) approaches have been developed for reasoning under uncertainty and learning in the presence of data and rich knowledge. Logic Tensor Networks (LTNs) are an SRL framework which integrates neural networks with first-order fuzzy logic to allow (i) efficient learning from noisy data in the presence of logical constraints, and (ii) reasoning with logical formulas describing general properties of the data. In this paper, we develop and apply LTNs to two of the main tasks of SII, namely, the classification of an image's bounding boxes and the detection of the relevant part-of relations between objects. To the best of our knowledge, this is the first successful application of SRL to such SII tasks. The proposed approach is evaluated on a standard image processing benchmark. Experiments show that the use of background knowledge in the form of logical constraints can improve the performance of purely data-driven approaches, including the state-of-the-art Fast Region-based Convolutional Neural Networks (Fast R-CNN). Moreover, we show that the use of logical background knowledge adds robustness to the learning system when errors are present in the labels of the training data.

AIDec 1, 2016

On Coreferring Text-extracted Event Descriptions with the aid of Ontological Reasoning

Stefano Borgo, Loris Bozzato, Alessio Palmero Aprosio et al.

Systems for automatic extraction of semantic information about events from large textual resources are now available: these tools are capable to generate RDF datasets about text extracted events and this knowledge can be used to reason over the recognized events. On the other hand, text based tasks for event recognition, as for example event coreference (i.e. recognizing whether two textual descriptions refer to the same event), do not take into account ontological information of the extracted events in their process. In this paper, we propose a method to derive event coreference on text extracted event data using semantic based rule reasoning. We demonstrate our method considering a limited (yet representative) set of event types: we introduce a formal analysis on their ontological properties and, on the base of this, we define a set of coreference criteria. We then implement these criteria as RDF-based reasoning rules to be applied on text extracted event data. We evaluate the effectiveness of our approach over a standard coreference benchmark dataset.

AIJun 14, 2016

Logic Tensor Networks: Deep Learning and Logical Reasoning from Data and Knowledge

Luciano Serafini, Artur d'Avila Garcez

We propose Logic Tensor Networks: a uniform framework for integrating automatic learning and reasoning. A logic formalism called Real Logic is defined on a first-order language whereby formulas have truth-value in the interval [0,1] and semantics defined concretely on the domain of real numbers. Logical constants are interpreted as feature vectors of real numbers. Real Logic promotes a well-founded integration of deductive reasoning on a knowledge-base and efficient data-driven relational machine learning. We show how Real Logic can be implemented in deep Tensor Neural Networks with the use of Google's tensorflow primitives. The paper concludes with experiments applying Logic Tensor Networks on a simple but representative example of knowledge completion.

DBDec 12, 2015

Query Answering over Contextualized RDF/OWL Knowledge with Forall-Existential Bridge Rules: Decidable Finite Extension Classes (Post Print)

Mathew Joseph, Gabriel Kuper, Till Mossakowski et al.

The proliferation of contextualized knowledge in the Semantic Web (SW) has led to the popularity of knowledge formats such as \emph{quads} in the SW community. A quad is an extension of an RDF triple with contextual information of the triple. In this paper, we study the problem of query answering over quads augmented with forall-existential bridge rules that enable interoperability of reasoning between triples in various contexts. We call a set of quads together with such expressive bridge rules, a quad-system. Query answering over quad-systems is undecidable, in general. We derive decidable classes of quad-systems, for which query answering can be done using forward chaining. Sound, complete and terminating procedures, which are adaptations of the well known chase algorithm, are provided for these classes for deciding query entailment. Safe, msafe, and csafe class of quad-systems restrict the structure of blank nodes generated during the chase computation process to be directed acyclic graphs (DAGs) of bounded depth. RR and restricted RR classes do not allow the generation of blank nodes during the chase computation process. Both data and combined complexity of query entailment has been established for the classes derived. We further show that quad-systems are equivalent to forall-existential rules whose predicates are restricted to ternary arity, modulo polynomial time translations. We subsequently show that the technique of safety, strictly subsumes in expressivity, some of the well known and expressive techniques, such as joint acyclicity and model faithful acyclicity, used for decidability guarantees in the realm of forall-existential rules.

AIApr 22, 2015

Semantic Enrichment of Mobile Phone Data Records Using Background Knowledge

Zolzaya Dashdorj, Stanislav Sobolevsky, Luciano Serafini et al.

Every day, billions of mobile network events (i.e. CDRs) are generated by cellular phone operator companies. Latent in this data are inspiring insights about human actions and behaviors, the discovery of which is important because context-aware applications and services hold the key to user-driven, intelligent services, which can enhance our everyday lives such as social and economic development, urban planning, and health prevention. The major challenge in this area is that interpreting such a big stream of data requires a deep understanding of mobile network events' context through available background knowledge. This article addresses the issues in context awareness given heterogeneous and uncertain data of mobile network events missing reliable information on the context of this activity. The contribution of this research is a model from a combination of logical and statistical reasoning standpoints for enabling human activity inference in qualitative terms from open geographical data that aimed at improving the quality of human behaviors recognition tasks from CDRs. We use open geographical data, Openstreetmap (OSM), as a proxy for predicting the content of human activity in the area. The user study performed in Trento shows that predicted human activities (top level) match the survey data with around 93% overall accuracy. The extensive validation for predicting a more specific economic type of human activity performed in Barcelona, by employing credit card transaction data. The analysis identifies that appropriately normalized data on points of interest (POI) is a good proxy for predicting human economical activities, with 84% accuracy on average. So the model is proven to be efficient for predicting the context of human activity, when its total level could be efficiently observed from cell phone data records, missing contextual information however.

AIDec 26, 2014

Knowledge Propagation in Contextualized Knowledge Repositories: an Experimental Evaluation

Loris Bozzato, Luciano Serafini

As the interest in the representation of context dependent knowledge in the Semantic Web has been recognized, a number of logic based solutions have been proposed in this regard. In our recent works, in response to this need, we presented the description logic-based Contextualized Knowledge Repository (CKR) framework. CKR is not only a theoretical framework, but it has been effectively implemented over state-of-the-art tools for the management of Semantic Web data: inference inside and across contexts has been realized in the form of forward SPARQL-based rules over different RDF named graphs. In this paper we present the first evaluation results for such CKR implementation. In particular, in first experiment we study its scalability with respect to different reasoning regimes. In a second experiment we analyze the effects of knowledge propagation on the computation of inferences.

LOJun 3, 2014

Query Answering over Contextualized RDF/OWL Knowledge with Forall-Existential Bridge Rules: Attaining Decidability using Acyclicity (full version)

Mathew Joseph, Gabriel Kuper, Luciano Serafini

The recent outburst of context-dependent knowledge on the Semantic Web (SW) has led to the realization of the importance of the quads in the SW community. Quads, which extend a standard RDF triple, by adding a new parameter of the `context' of an RDF triple, thus informs a reasoner to distinguish between the knowledge in various contexts. Although this distinction separates the triples in an RDF graph into various contexts, and allows the reasoning to be decoupled across various contexts, bridge rules need to be provided for inter-operating the knowledge across these contexts. We call a set of quads together with the bridge rules, a quad-system. In this paper, we discuss the problem of query answering over quad-systems with expressive forall-existential bridge rules. It turns out the query answering over quad-systems is undecidable, in general. We derive a decidable class of quad-systems, namely context-acyclic quad-systems, for which query answering can be done using forward chaining. Tight bounds for data and combined complexity of query entailment has been established for the derived class.