Erman Acar

h-index9

26papers

446citations

Novelty49%

AI Score49

Ranked #25,495 of 194,257 authors (top 13%)#1,332 in AI (top 11%)

26 Papers

28.7CVMar 11, 2022Code

Hyperbolic Image Segmentation

Mina GhadimiAtigh, Julian Schoep, Erman Acar et al.

For image segmentation, the current standard is to perform pixel-level optimization and inference in Euclidean output embedding spaces through linear hyperplanes. In this work, we show that hyperbolic manifolds provide a valuable alternative for image segmentation and propose a tractable formulation of hierarchical pixel-level classification in hyperbolic space. Hyperbolic Image Segmentation opens up new possibilities and practical benefits for segmentation, such as uncertainty estimation and boundary information for free, zero-label generalization, and increased performance in low-dimensional output embeddings.

11.8LGJul 18, 2022Code

A Meta-Reinforcement Learning Algorithm for Causal Discovery

Andreas Sauter, Erman Acar, Vincent François-Lavet

Causal discovery is a major task with the utmost importance for machine learning since causal structures can enable models to go beyond pure correlation-based inference and significantly boost their performance. However, finding causal structures from data poses a significant challenge both in computational effort and accuracy, let alone its impossibility without interventions in general. In this paper, we develop a meta-reinforcement learning algorithm that performs causal discovery by learning to perform interventions such that it can construct an explicit causal graph. Apart from being useful for possible downstream applications, the estimated causal graph also provides an explanation for the data-generating process. In this article, we show that our algorithm estimates a good graph compared to the SOTA approaches, even in environments whose underlying causal structure is previously unseen. Further, we make an ablation study that shows how learning interventions contribute to the overall performance of our approach. We conclude that interventions indeed help boost the performance, efficiently yielding an accurate estimate of the causal structure of a possibly unseen environment.

1.2CYJul 26, 2023

AI4GCC-Team -- Below Sea Level: Score and Real World Relevance

Phillip Wozny, Bram Renting, Robert Loftin et al.

As our submission for track three of the AI for Global Climate Cooperation (AI4GCC) competition, we propose a negotiation protocol for use in the RICE-N climate-economic simulation. Our proposal seeks to address the challenges of carbon leakage through methods inspired by the Carbon Border Adjustment Mechanism (CBAM) and Climate Clubs (CC). We demonstrate the effectiveness of our approach by comparing simulated outcomes to representative concentration pathways (RCP) and shared socioeconomic pathways (SSP). Our protocol results in a temperature rise comparable to RCP 3.4/4.5 and SSP 2. Furthermore, we provide an analysis of our protocol's World Trade Organization compliance, administrative and political feasibility, and ethical concerns. We recognize that our proposal risks hurting the least developing countries, and we suggest specific corrective measures to avoid exacerbating existing inequalities, such as technology sharing and wealth redistribution. Future research should improve the RICE-N tariff mechanism and implement actions allowing for the aforementioned corrective measures.

10.4LGJan 30, 2024Code

CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning

Andreas W. M. Sauter, Nicolò Botteghi, Erman Acar et al.

Causal discovery is the challenging task of inferring causal structure from data. Motivated by Pearl's Causal Hierarchy (PCH), which tells us that passive observations alone are not enough to distinguish correlation from causation, there has been a recent push to incorporate interventions into machine learning research. Reinforcement learning provides a convenient framework for such an active approach to learning. This paper presents CORE, a deep reinforcement learning-based approach for causal discovery and intervention planning. CORE learns to sequentially reconstruct causal graphs from data while learning to perform informative interventions. Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures. Furthermore, CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency. All relevant code and supplementary material can be found at https://github.com/sa-and/CORE

5.4AIJan 4, 2023

A Meta-Learning Algorithm for Interrogative Agendas

Erman Acar, Andrea De Domenico, Krishna Manoorkar et al.

Explainability is a key challenge and a major research theme in AI research for developing intelligent systems that are capable of working with humans more effectively. An obvious choice in developing explainable intelligent systems relies on employing knowledge representation formalisms which are inherently tailored towards expressing human knowledge e.g., interrogative agendas. In the scope of this work, we focus on formal concept analysis (FCA), a standard knowledge representation formalism, to express interrogative agendas, and in particular to categorize objects w.r.t. a given set of features. Several FCA-based algorithms have already been in use for standard machine learning tasks such as classification and outlier detection. These algorithms use a single concept lattice for such a task, meaning that the set of features used for the categorization is fixed. Different sets of features may have different importance in that categorization, we call a set of features an agenda. In many applications a correct or good agenda for categorization is not known beforehand. In this paper, we propose a meta-learning algorithm to construct a good interrogative agenda explaining the data. Such algorithm is meant to call existing FCA-based classification and outlier detection algorithms iteratively, to increase their accuracy and reduce their sample complexity. The proposed method assigns a measure of importance to different set of features used in the categorization, hence making the results more explainable.

1.2CYJul 26, 2023

AI4GCC - Team: Below Sea Level: Critiques and Improvements

Bram Renting, Phillip Wozny, Robert Loftin et al.

We present a critical analysis of the simulation framework RICE-N, an integrated assessment model (IAM) for evaluating the impacts of climate change on the economy. We identify key issues with RICE-N, including action masking and irrelevant actions, and suggest improvements such as utilizing tariff revenue and penalizing overproduction. We also critically engage with features of IAMs in general, namely overly optimistic damage functions and unrealistic abatement cost functions. Our findings contribute to the ongoing efforts to further develop the RICE-N framework in an effort to improve the simulation, making it more useful as an inspiration for policymakers.

5.8AIMay 21, 2024Code

CausalPlayground: Addressing Data-Generation Requirements in Cutting-Edge Causality Research

Andreas W M Sauter, Erman Acar, Aske Plaat

Research on causal effects often relies on synthetic data due to the scarcity of real-world datasets with ground-truth effects. Since current data-generating tools do not always meet all requirements for state-of-the-art research, ad-hoc methods are often employed. This leads to heterogeneity among datasets and delays research progress. We address the shortcomings of current data-generating libraries by introducing CausalPlayground, a Python library that provides a standardized platform for generating, sampling, and sharing structural causal models (SCMs). CausalPlayground offers fine-grained control over SCMs, interventions, and the generation of datasets of SCMs for learning and quantitative research. Furthermore, by integrating with Gymnasium, the standard framework for reinforcement learning (RL) environments, we enable online interaction with the SCMs. Overall, by introducing CausalPlayground we aim to foster more efficient and comparable research in the field. All code and API documentation is available at https://github.com/sa-and/CausalPlayground.

5.2CVSep 2, 2024

DAVIDE: Depth-Aware Video Deblurring

German F. Torres, Jussi Kalliola, Soumya Tripathy et al.

Video deblurring aims at recovering sharp details from a sequence of blurry frames. Despite the proliferation of depth sensors in mobile phones and the potential of depth information to guide deblurring, depth-aware deblurring has received only limited attention. In this work, we introduce the 'Depth-Aware VIdeo DEblurring' (DAVIDE) dataset to study the impact of depth information in video deblurring. The dataset comprises synchronized blurred, sharp, and depth videos. We investigate how the depth information should be injected into the existing deep RGB video deblurring models, and propose a strong baseline for depth-aware video deblurring. Our findings reveal the significance of depth information in video deblurring and provide insights into the use cases where depth cues are beneficial. In addition, our results demonstrate that while the depth improves deblurring performance, this effect diminishes when models are provided with a longer temporal context. Project page: https://germanftv.github.io/DAVIDE.github.io/ .

1.4LGJan 29

ECSEL: Explainable Classification via Signomial Equation Learning

Adia Lumadjeng, Ilker Birbil, Erman Acar

We introduce ECSEL, an explainable classification method that learns formal expressions in the form of signomial equations, motivated by the observation that many symbolic regression benchmarks admit compact signomial structure. ECSEL directly constructs a structural, closed-form expression that serves as both a classifier and an explanation. On standard symbolic regression benchmarks, our method recovers a larger fraction of target equations than competing state-of-the-art approaches while requiring substantially less computation. Leveraging this efficiency, ECSEL achieves classification accuracy competitive with established machine learning models without sacrificing interpretability. Further, we show that ECSEL satisfies some desirable properties regarding global feature behavior, decision-boundary analysis, and local feature attributions. Experiments on benchmark datasets and two real-world case studies i.e., e-commerce and fraud detection, demonstrate that the learned equations expose dataset biases, support counterfactual reasoning, and yield actionable insights.

1.2MAAug 1, 2024

Learning in Multi-Objective Public Goods Games with Non-Linear Utilities

Nicole Orzan, Erman Acar, Davide Grossi et al.

Addressing the question of how to achieve optimal decision-making under risk and uncertainty is crucial for enhancing the capabilities of artificial agents that collaborate with or support humans. In this work, we address this question in the context of Public Goods Games. We study learning in a novel multi-objective version of the Public Goods Game where agents have different risk preferences, by means of multi-objective reinforcement learning. We introduce a parametric non-linear utility function to model risk preferences at the level of individual agents, over the collective and individual reward components of the game. We study the interplay between such preference modelling and environmental uncertainty on the incentive alignment level in the game. We demonstrate how different combinations of individual preferences and environmental uncertainties sustain the emergence of cooperative patterns in non-cooperative environments (i.e., where competitive strategies are dominant), while others sustain competitive patterns in cooperative environments (i.e., where cooperative strategies are dominant).

12.8CVMar 13, 2024

PFStorer: Personalized Face Restoration and Super-Resolution

Tuomas Varanka, Tapani Toivonen, Soumya Tripathy et al.

Recent developments in face restoration have achieved remarkable results in producing high-quality and lifelike outputs. The stunning results however often fail to be faithful with respect to the identity of the person as the models lack necessary context. In this paper, we explore the potential of personalized face restoration with diffusion models. In our approach a restoration model is personalized using a few images of the identity, leading to tailored restoration with respect to the identity while retaining fine-grained details. By using independent trainable blocks for personalization, the rich prior of a base restoration model can be exploited to its fullest. To avoid the model relying on parts of identity left in the conditioning low-quality images, a generative regularizer is employed. With a learnable parameter, the model learns to balance between the details generated based on the input image and the degree of personalization. Moreover, we improve the training pipeline of face restoration models to enable an alignment-free approach. We showcase the robust capabilities of our approach in several real-world scenarios with multiple identities, demonstrating our method's ability to generate fine-grained details with faithful restoration. In the user study we evaluate the perceptual quality and faithfulness of the genereated details, with our method being voted best 61% of the time compared to the second best with 25% of the votes.

5.1MAJan 23, 2024

Emergent Cooperation under Uncertain Incentive Alignment

Nicole Orzan, Erman Acar, Davide Grossi et al.

Understanding the emergence of cooperation in systems of computational agents is crucial for the development of effective cooperative AI. Interaction among individuals in real-world settings are often sparse and occur within a broad spectrum of incentives, which often are only partially known. In this work, we explore how cooperation can arise among reinforcement learning agents in scenarios characterised by infrequent encounters, and where agents face uncertainty about the alignment of their incentives with those of others. To do so, we train the agents under a wide spectrum of environments ranging from fully competitive, to fully cooperative, to mixed-motives. Under this type of uncertainty we study the effects of mechanisms, such as reputation and intrinsic rewards, that have been proposed in the literature to foster cooperation in mixed-motives environments. Our findings show that uncertainty substantially lowers the agents' ability to engage in cooperative behaviour, when that would be the best course of action. In this scenario, the use of effective reputation mechanisms and intrinsic rewards boosts the agents' capability to act nearly-optimally in cooperative environments, while greatly enhancing cooperation in mixed-motive environments as well.

4.2AIApr 17, 2024

CAGE: Causality-Aware Shapley Value for Global Explanations

Nils Ole Breuer, Andreas Sauter, Majid Mohammadi et al.

As Artificial Intelligence (AI) is having more influence on our everyday lives, it becomes important that AI-based decisions are transparent and explainable. As a consequence, the field of eXplainable AI (or XAI) has become popular in recent years. One way to explain AI models is to elucidate the predictive importance of the input features for the AI model in general, also referred to as global explanations. Inspired by cooperative game theory, Shapley values offer a convenient way for quantifying the feature importance as explanations. However many methods based on Shapley values are built on the assumption of feature independence and often overlook causal relations of the features which could impact their importance for the ML model. Inspired by studies of explanations at the local level, we propose CAGE (Causally-Aware Shapley Values for Global Explanations). In particular, we introduce a novel sampling procedure for out-coalition features that respects the causal relations of the input features. We derive a practical approach that incorporates causal knowledge into global explanation and offers the possibility to interpret the predictive feature importance considering their causal relation. We evaluate our method on synthetic data and real-world data. The explanations from our approach suggest that they are not only more intuitive but also more faithful compared to previous global explanation methods.

7.1LGMar 3, 2025

ACTIVA: Amortized Causal Effect Estimation via Transformer-based Variational Autoencoder

Andreas Sauter, Saber Salehkaleybar, Aske Plaat et al.

Predicting the distribution of outcomes under hypothetical interventions is crucial across healthcare, economics, and policy-making. However, existing methods often require restrictive assumptions, and are typically limited by the lack of amortization across problem instances. We propose ACTIVA, a transformer-based conditional variational autoencoder (VAE) architecture for amortized causal inference, which estimates interventional distributions directly from observational data without. ACTIVA learns a latent representation conditioned on observational inputs and intervention queries, enabling zero-shot inference by amortizing causal knowledge from diverse training scenarios. We provide theoretical insights showing that ACTIVA predicts interventional distributions as mixtures over observationally equivalent causal models. Empirical evaluations on synthetic and semi-synthetic datasets confirm the effectiveness of our amortized approach and highlight promising directions for future real-world applications.

2.3RMFeb 14, 2024

On the Potential of Network-Based Features for Fraud Detection

Catayoun Azarm, Erman Acar, Mickey van Zeelt

Online transaction fraud presents substantial challenges to businesses and consumers, risking significant financial losses. Conventional rule-based systems struggle to keep pace with evolving fraud tactics, leading to high false positive rates and missed detections. Machine learning techniques offer a promising solution by leveraging historical data to identify fraudulent patterns. This article explores using the personalised PageRank (PPR) algorithm to capture the social dynamics of fraud by analysing relationships between financial accounts. The primary objective is to compare the performance of traditional features with the addition of PPR in fraud detection models. Results indicate that integrating PPR enhances the model's predictive power, surpassing the baseline model. Additionally, the PPR feature provides unique and valuable information, evidenced by its high feature importance score. Feature stability analysis confirms consistent feature distributions across training and test datasets.

1.4LGFeb 3

Explaining the Explainer: Understanding the Inner Workings of Transformer-based Symbolic Regression Models

Arco van Breda, Erman Acar

Following their success across many domains, transformers have also proven effective for symbolic regression (SR); however, the internal mechanisms underlying their generation of mathematical operators remain largely unexplored. Although mechanistic interpretability has successfully identified circuits in language and vision models, it has not yet been applied to SR. In this article, we introduce PATCHES, an evolutionary circuit discovery algorithm that identifies compact and correct circuits for SR. Using PATCHES, we isolate 28 circuits, providing the first circuit-level characterisation of an SR transformer. We validate these findings through a robust causal evaluation framework based on key notions such as faithfulness, completeness, and minimality. Our analysis shows that mean patching with performance-based evaluation most reliably isolates functionally correct circuits. In contrast, we demonstrate that direct logit attribution and probing classifiers primarily capture correlational features rather than causal ones, limiting their utility for circuit discovery. Overall, these results establish SR as a high-potential application domain for mechanistic interpretability and propose a principled methodology for circuit discovery.

3.3AISep 29, 2025

Successful Misunderstandings: Learning to Coordinate Without Being Understood

Nikolaos Kondylidis, Anil Yaman, Frank van Harmelen et al.

The main approach to evaluating communication is by assessing how well it facilitates coordination. If two or more individuals can coordinate through communication, it is generally assumed that they understand one another. We investigate this assumption in a signaling game where individuals develop a new vocabulary of signals to coordinate successfully. In our game, the individuals do not have common observations besides the communication signal and outcome of the interaction, i.e. received reward. This setting is used as a proxy to study communication emergence in populations of agents that perceive their environment very differently, e.g. hybrid populations that include humans and artificial agents. Agents develop signals, use them, and refine interpretations while not observing how other agents are using them. While populations always converge to optimal levels of coordination, in some cases, interacting agents interpret and use signals differently, converging to what we call successful misunderstandings. However, agents of population that coordinate using misaligned interpretations, are unable to establish successful coordination with new interaction partners. Not leading to coordination failure immediately, successful misunderstandings are difficult to spot and repair. Having at least three agents that all interact with each other are the two minimum conditions to ensure the emergence of shared interpretations. Under these conditions, the agent population exhibits this emergent property of compensating for the lack of shared observations of signal use, ensuring the emergence of shared interpretations.

4.1LGMay 22, 2025

EMERGENT: Efficient and Manipulation-resistant Matching using GFlowNets

Mayesha Tasnim, Erman Acar, Sennay Ghebreab

The design of fair and efficient algorithms for allocating public resources, such as school admissions, housing, or medical residency, has a profound social impact. In one-sided matching problems, where individuals are assigned to items based on ranked preferences, a fundamental trade-off exists between efficiency and strategyproofness. Existing algorithms like Random Serial Dictatorship (RSD), Probabilistic Serial (PS), and Rank Minimization (RM) capture only one side of this trade-off: RSD is strategyproof but inefficient, while PS and RM are efficient but incentivize manipulation. We propose EMERGENT, a novel application of Generative Flow Networks (GFlowNets) to one-sided matching, leveraging its ability to sample diverse, high-reward solutions. In our approach, efficient and manipulation-resistant matches emerge naturally: high-reward solutions yield efficient matches, while the stochasticity of GFlowNets-based outputs reduces incentives for manipulation. Experiments show that EMERGENT outperforms RSD in rank efficiency while significantly reducing strategic vulnerability compared to matches produced by RM and PS. Our work highlights the potential of GFlowNets for applications involving social choice mechanisms, where it is crucial to balance efficiency and manipulability.

2.3AINov 7, 2024Code

Think Smart, Act SMARL! Analyzing Probabilistic Logic Shields for Multi-Agent Reinforcement Learning

Satchit Chatterji, Erman Acar

Safe reinforcement learning (RL) is crucial for real-world applications, and multi-agent interactions introduce additional safety challenges. While Probabilistic Logic Shields (PLS) has been a powerful proposal to enforce safety in single-agent RL, their generalizability to multi-agent settings remains unexplored. In this paper, we address this gap by conducting extensive analyses of PLS within decentralized, multi-agent environments, and in doing so, propose $\textbf{Shielded Multi-Agent Reinforcement Learning (SMARL)}$ as a general framework for steering MARL towards norm-compliant outcomes. Our key contributions are: (1) a novel Probabilistic Logic Temporal Difference (PLTD) update for shielded, independent Q-learning, which incorporates probabilistic constraints directly into the value update process; (2) a probabilistic logic policy gradient method for shielded PPO with formal safety guarantees for MARL; and (3) comprehensive evaluation across symmetric and asymmetrically shielded $n$-player game-theoretic benchmarks, demonstrating fewer constraint violations and significantly better cooperation under normative constraints. These results position SMARL as an effective mechanism for equilibrium selection, paving the way toward safer, socially aligned multi-agent systems.

2.3AINov 1, 2024

Integrating Fuzzy Logic into Deep Symbolic Regression

Wout Gerdes, Erman Acar

Credit card fraud detection is a critical concern for financial institutions, intensified by the rise of contactless payment technologies. While deep learning models offer high accuracy, their lack of explainability poses significant challenges in financial settings. This paper explores the integration of fuzzy logic into Deep Symbolic Regression (DSR) to enhance both performance and explainability in fraud detection. We investigate the effectiveness of different fuzzy logic implications, specifically Łukasiewicz, Gödel, and Product, in handling the complexity and uncertainty of fraud detection datasets. Our analysis suggest that the Łukasiewicz implication achieves the highest F1-score and overall accuracy, while the Product implication offers a favorable balance between performance and explainability. Despite having a performance lower than state-of-the-art (SOTA) models due to information loss in data transformation, our approach provides novelty and insights into into integrating fuzzy logic into DSR for fraud detection, providing a comprehensive comparison between different implications and methods.

1.2RMOct 29, 2024

Differentiable Inductive Logic Programming for Fraud Detection

Boris Wolfson, Erman Acar

Current trends in Machine Learning prefer explainability even when it comes at the cost of performance. Therefore, explainable AI methods are particularly important in the field of Fraud Detection. This work investigates the applicability of Differentiable Inductive Logic Programming (DILP) as an explainable AI approach to Fraud Detection. Although the scalability of DILP is a well-known issue, we show that with some data curation such as cleaning and adjusting the tabular and numerical data to the expected format of background facts statements, it becomes much more applicable. While in processing it does not provide any significant advantage on rather more traditional methods such as Decision Trees, or more recent ones like Deep Symbolic Classification, it still gives comparable results. We showcase its limitations and points to improve, as well as potential use cases where it can be much more useful compared to traditional methods, such as recursive rule learning.

2.3LOJul 1, 2020

Reasoning with Contextual Knowledge and Influence Diagrams

Erman Acar, Rafael Peñaloza

Influence diagrams (IDs) are well-known formalisms extending Bayesian networks to model decision situations under uncertainty. Although they are convenient as a decision theoretic tool, their knowledge representation ability is limited in capturing other crucial notions such as logical consistency. We complement IDs with the light-weight description logic (DL) EL to overcome such limitations. We consider a setup where DL axioms hold in some contexts, yet the actual context is uncertain. The framework benefits from the convenience of using DL as a domain knowledge representation language and the modelling strength of IDs to deal with decisions over contexts in the presence of contextual uncertainty. We define related reasoning problems and study their computational complexity.

17.5AIJun 4, 2020Code

Analyzing Differentiable Fuzzy Implications

Emile van Krieken, Erman Acar, Frank van Harmelen

Combining symbolic and neural approaches has gained considerable attention in the AI community, as it is often argued that the strengths and weaknesses of these approaches are complementary. One such trend in the literature are weakly supervised learning techniques that employ operators from fuzzy logics. In particular, they use prior background knowledge described in such logics to help the training of a neural network from unlabeled and noisy data. By interpreting logical symbols using neural networks (or grounding them), this background knowledge can be added to regular loss functions, hence making reasoning a part of learning. In this paper, we investigate how implications from the fuzzy logic literature behave in a differentiable setting. In such a setting, we analyze the differences between the formal properties of these fuzzy implications. It turns out that various fuzzy implications, including some of the most well-known, are highly unsuitable for use in a differentiable learning setting. A further finding shows a strong imbalance between gradients driven by the antecedent and the consequent of the implication. Furthermore, we introduce a new family of fuzzy implications (called sigmoidal implications) to tackle this phenomenon. Finally, we empirically show that it is possible to use Differentiable Fuzzy Logics for semi-supervised learning, and show that sigmoidal implications outperform other choices of fuzzy implications.

5.7AIMar 13, 2020

On Sufficient and Necessary Conditions in Bounded CTL: A Forgetting Approach

Renyan Feng, Erman Acar, Stefan Schlobach et al.

Computation Tree Logic (CTL) is one of the central formalisms in formal verification. As a specification language, it is used to express a property that the system at hand is expected to satisfy. From both the verification and the system design points of view, some information content of such property might become irrelevant for the system due to various reasons, e.g., it might become obsolete by time, or perhaps infeasible due to practical difficulties. Then, the problem arises on how to subtract such piece of information without altering the relevant system behaviour or violating the existing specifications over a given signature. Moreover, in such a scenario, two crucial notions are informative: the strongest necessary condition (SNC) and the weakest sufficient condition (WSC) of a given property. To address such a scenario in a principled way, we introduce a forgetting-based approach in CTL and show that it can be used to compute SNC and WSC of a property under a given model and over a given signature. We study its theoretical properties and also show that our notion of forgetting satisfies existing essential postulates of knowledge forgetting. Furthermore, we analyse the computational complexity of some basic reasoning tasks for the fragment CTL_AF in particular.

33.3AIFeb 14, 2020Code

Analyzing Differentiable Fuzzy Logic Operators

Emile van Krieken, Erman Acar, Frank van Harmelen

The AI community is increasingly putting its attention towards combining symbolic and neural approaches, as it is often argued that the strengths and weaknesses of these approaches are complementary. One recent trend in the literature are weakly supervised learning techniques that employ operators from fuzzy logics. In particular, these use prior background knowledge described in such logics to help the training of a neural network from unlabeled and noisy data. By interpreting logical symbols using neural networks, this background knowledge can be added to regular loss functions, hence making reasoning a part of learning. We study, both formally and empirically, how a large collection of logical operators from the fuzzy logic literature behave in a differentiable learning setting. We find that many of these operators, including some of the most well-known, are highly unsuitable in this setting. A further finding concerns the treatment of implication in these fuzzy logics, and shows a strong imbalance between gradients driven by the antecedent and the consequent of the implication. Furthermore, we introduce a new family of fuzzy implications (called sigmoidal implications) to tackle this phenomenon. Finally, we empirically show that it is possible to use Differentiable Fuzzy Logics for semi-supervised learning, and compare how different operators behave in practice. We find that, to achieve the largest performance improvement over a supervised baseline, we have to resort to non-standard combinations of logical operators which perform well in learning, but no longer satisfy the usual logical laws.

18.0AIAug 13, 2019

Semi-Supervised Learning using Differentiable Reasoning

Emile van Krieken, Erman Acar, Frank van Harmelen

We introduce Differentiable Reasoning (DR), a novel semi-supervised learning technique which uses relational background knowledge to benefit from unlabeled data. We apply it to the Semantic Image Interpretation (SII) task and show that background knowledge provides significant improvement. We find that there is a strong but interesting imbalance between the contributions of updates from Modus Ponens (MP) and its logical equivalent Modus Tollens (MT) to the learning process, suggesting that our approach is very sensitive to a phenomenon called the Raven Paradox. We propose a solution to overcome this situation.