Marcel Gehrke

h-index9

19papers

56citations

Novelty49%

AI Score43

Ranked #82,085 of 201,326 authors (top 41%)#5,136 in AI (top 36%)

19 Papers

AIMay 26

On the Detection of Commutative Factors in Factor Graphs: Necessary and Sufficient Conditions

Malte Luttermann, Ralf Möller, Marcel Gehrke

Exploiting the indistinguishability of objects in a probabilistic graphical model such as a factor graph is key to lifted probabilistic inference algorithms and allows for tractable probabilistic inference problems with respect to domain sizes. A central building block for the exploitation of indistinguishable objects in factor graphs is the identification of commutative factors, i.e., factors whose output values are invariant under permutations of input values assigned to a subset of their arguments. In this paper, we revisit the theoretical foundations underlying the state-of-the-art algorithm to detect commutative factors. Specifically, we show that in its current form, the state-of-the-art algorithm relies on a central theorem that is mistakenly regarded as a sufficient condition to identify commutative factors, while it actually only implies necessary condition. Consequently, the state of the art might, as we show in this paper, deliver incorrect results. To fix the flaws currently present in the state of the art, we prove a slightly modified version of the aforementioned theorem, which serves as a necessary condition to identify commutative factors. Moreover, we present a corrected version of the state-of-the-art algorithm, which keeps its efficiency while ensuring correctness and introduce a complementary algorithm with tighter worst-case bounds.

AISep 20, 2023

Colour Passing Revisited: Lifted Model Construction with Commutative Factors

Malte Luttermann, Tanya Braun, Ralf Möller et al.

Lifted probabilistic inference exploits symmetries in a probabilistic model to allow for tractable probabilistic inference with respect to domain sizes. To apply lifted inference, a lifted representation has to be obtained, and to do so, the so-called colour passing algorithm is the state of the art. The colour passing algorithm, however, is bound to a specific inference algorithm and we found that it ignores commutativity of factors while constructing a lifted representation. We contribute a modified version of the colour passing algorithm that uses logical variables to construct a lifted representation independent of a specific inference algorithm while at the same time exploiting commutativity of factors during an offline-step. Our proposed algorithm efficiently detects more symmetries than the state of the art and thereby drastically increases compression, yielding significantly faster online query times for probabilistic inference when the resulting model is applied.

CRJul 6, 2023

DPM: Clustering Sensitive Data through Separation

Johannes Liebenow, Yara Schütt, Tanya Braun et al.

Clustering is an important tool for data exploration where the goal is to subdivide a data set into disjoint clusters that fit well into the underlying data structure. When dealing with sensitive data, privacy-preserving algorithms aim to approximate the non-private baseline while minimising the leakage of sensitive information. State-of-the-art privacy-preserving clustering algorithms tend to output clusters that are good in terms of the standard metrics, inertia, silhouette score, and clustering accuracy, however, the clustering result strongly deviates from the non-private KMeans baseline. In this work, we present a privacy-preserving clustering algorithm called DPM that recursively separates a data set into clusters based on a geometrical clustering approach. In addition, DPM estimates most of the data-dependent hyper-parameters in a privacy-preserving way. We prove that DPM preserves Differential Privacy and analyse the utility guarantees of DPM. Finally, we conduct an extensive empirical evaluation for synthetic and real-life data sets. We show that DPM achieves state-of-the-art utility on the standard clustering metrics and yields a clustering result much closer to that of the popular non-private KMeans algorithm without requiring the number of classes.

AIJul 23, 2024

Efficient Detection of Commutative Factors in Factor Graphs

Malte Luttermann, Johann Machemer, Marcel Gehrke

Lifted probabilistic inference exploits symmetries in probabilistic graphical models to allow for tractable probabilistic inference with respect to domain sizes. To exploit symmetries in, e.g., factor graphs, it is crucial to identify commutative factors, i.e., factors having symmetries within themselves due to their arguments being exchangeable. The current state of the art to check whether a factor is commutative with respect to a subset of its arguments iterates over all possible subsets of the factor's arguments, i.e., $O(2^n)$ iterations for a factor with $n$ arguments in the worst case. In this paper, we efficiently solve the problem of detecting commutative factors in a factor graph. In particular, we introduce the detection of commutative factors (DECOR) algorithm, which allows us to drastically reduce the computational effort for checking whether a factor is commutative in practice. We prove that DECOR efficiently identifies restrictions to drastically reduce the number of required iterations and validate the efficiency of DECOR in our empirical evaluation.

AIMar 15, 2024

Efficient Detection of Exchangeable Factors in Factor Graphs

Malte Luttermann, Johann Machemer, Marcel Gehrke

To allow for tractable probabilistic inference with respect to domain sizes, lifted probabilistic inference exploits symmetries in probabilistic graphical models. However, checking whether two factors encode equivalent semantics and hence are exchangeable is computationally expensive. In this paper, we efficiently solve the problem of detecting exchangeable factors in a factor graph. In particular, we introduce the detection of exchangeable factors (DEFT) algorithm, which allows us to drastically reduce the computational effort for checking whether two factors are exchangeable in practice. While previous approaches iterate all $O(n!)$ permutations of a factor's argument list in the worst case (where $n$ is the number of arguments of the factor), we prove that DEFT efficiently identifies restrictions to drastically reduce the number of permutations and validate the efficiency of DEFT in our empirical evaluation.

AIMar 15, 2024

Lifted Causal Inference in Relational Domains

Malte Luttermann, Mattis Hartwig, Tanya Braun et al.

Lifted inference exploits symmetries in probabilistic graphical models by using a representative for indistinguishable objects, thereby speeding up query answering while maintaining exact answers. Even though lifting is a well-established technique for the task of probabilistic inference in relational domains, it has not yet been applied to the task of causal inference. In this paper, we show how lifting can be applied to efficiently compute causal effects in relational domains. More specifically, we introduce parametric causal factor graphs as an extension of parametric factor graphs incorporating causal knowledge and give a formal semantics of interventions therein. We further present the lifted causal inference algorithm to compute causal effects on a lifted level, thereby drastically speeding up causal inference compared to propositional inference, e.g., in causal Bayesian networks. In our empirical evaluation, we demonstrate the effectiveness of our approach.

AIApr 5, 2025

Lifting Factor Graphs with Some Unknown Factors for New Individuals

Malte Luttermann, Ralf Möller, Marcel Gehrke

Lifting exploits symmetries in probabilistic graphical models by using a representative for indistinguishable objects, allowing to carry out query answering more efficiently while maintaining exact answers. In this paper, we investigate how lifting enables us to perform probabilistic inference for factor graphs containing unknown factors, i.e., factors whose underlying function of potential mappings is unknown. We present the Lifting Factor Graphs with Some Unknown Factors (LIFAGU) algorithm to identify indistinguishable subgraphs in a factor graph containing unknown factors, thereby enabling the transfer of known potentials to unknown potentials to ensure a well-defined semantics of the model and allow for (lifted) probabilistic inference. We further extend LIFAGU to incorporate additional background knowledge about groups of factors belonging to the same individual object. By incorporating such background knowledge, LIFAGU is able to further reduce the ambiguity of possible transfers of known potentials to unknown potentials.

AINov 18, 2024

Lifted Model Construction without Normalisation: A Vectorised Approach to Exploit Symmetries in Factor Graphs

Malte Luttermann, Ralf Möller, Marcel Gehrke

Lifted probabilistic inference exploits symmetries in a probabilistic model to allow for tractable probabilistic inference with respect to domain sizes of logical variables. We found that the current state-of-the-art algorithm to construct a lifted representation in form of a parametric factor graph misses symmetries between factors that are exchangeable but scaled differently, thereby leading to a less compact representation. In this paper, we propose a generalisation of the advanced colour passing (ACP) algorithm, which is the state of the art to construct a parametric factor graph. Our proposed algorithm allows for potentials of factors to be scaled arbitrarily and efficiently detects more symmetries than the original ACP algorithm. By detecting strictly more symmetries than ACP, our algorithm significantly reduces online query times for probabilistic inference when the resulting model is applied, which we also confirm in our experiments.

AINov 11, 2024

Estimating Causal Effects in Partially Directed Parametric Causal Factor Graphs

Malte Luttermann, Tanya Braun, Ralf Möller et al.

Lifting uses a representative of indistinguishable individuals to exploit symmetries in probabilistic relational models, denoted as parametric factor graphs, to speed up inference while maintaining exact answers. In this paper, we show how lifting can be applied to causal inference in partially directed graphs, i.e., graphs that contain both directed and undirected edges to represent causal relationships between random variables. We present partially directed parametric causal factor graphs (PPCFGs) as a generalisation of previously introduced parametric causal factor graphs, which require a fully directed graph. We further show how causal inference can be performed on a lifted level in PPCFGs, thereby extending the applicability of lifted causal inference to a broader range of models requiring less prior knowledge about causal relationships.

IRApr 30, 2024

Enhancement of Subjective Content Descriptions by using Human Feedback

Magnus Bender, Tanya Braun, Ralf Möller et al.

An agent providing an information retrieval service may work with a corpus of text documents. The documents in the corpus may contain annotations such as Subjective Content Descriptions (SCD) -- additional data associated with different sentences of the documents. Each SCD is associated with multiple sentences of the corpus and has relations among each other. The agent uses the SCDs to create its answers in response to queries supplied by users. However, the SCD the agent uses might reflect the subjective perspective of another user. Hence, answers may be considered faulty by an agent's user, because the SCDs may not exactly match the perceptions of an agent's user. A naive and very costly approach would be to ask each user to completely create all the SCD themselves. To use existing knowledge, this paper presents ReFrESH, an approach for Relation-preserving Feedback-reliant Enhancement of SCDs by Humans. An agent's user can give feedback about faulty answers to the agent. This feedback is then used by ReFrESH to update the SCDs incrementally. However, human feedback is not always unambiguous. Therefore, this paper additionally presents an approach to decide how to incorporate the feedback and when to update the SCDs. Altogether, SCDs can be updated with human feedback, allowing users to create even more specific SCDs for their needs.

AIMay 28, 2025

Compression versus Accuracy: A Hierarchy of Lifted Models

Jan Speller, Malte Luttermann, Marcel Gehrke et al.

Probabilistic graphical models that encode indistinguishable objects and relations among them use first-order logic constructs to compress a propositional factorised model for more efficient (lifted) inference. To obtain a lifted representation, the state-of-the-art algorithm Advanced Colour Passing (ACP) groups factors that represent matching distributions. In an approximate version using $\varepsilon$ as a hyperparameter, factors are grouped that differ by a factor of at most $(1\pm \varepsilon)$. However, finding a suitable $\varepsilon$ is not obvious and may need a lot of exploration, possibly requiring many ACP runs with different $\varepsilon$ values. Additionally, varying $\varepsilon$ can yield wildly different models, leading to decreased interpretability. Therefore, this paper presents a hierarchical approach to lifted model construction that is hyperparameter-free. It efficiently computes a hierarchy of $\varepsilon$ values that ensures a hierarchy of models, meaning that once factors are grouped together given some $\varepsilon$, these factors will be grouped together for larger $\varepsilon$ as well. The hierarchy of $\varepsilon$ values also leads to a hierarchy of error bounds. This allows for explicitly weighing compression versus accuracy when choosing specific $\varepsilon$ values to run ACP with and enables interpretability between the different models.

AIApr 29, 2025

Approximate Lifted Model Construction

Malte Luttermann, Jan Speller, Marcel Gehrke et al.

Probabilistic relational models such as parametric factor graphs enable efficient (lifted) inference by exploiting the indistinguishability of objects. In lifted inference, a representative of indistinguishable objects is used for computations. To obtain a relational (i.e., lifted) representation, the Advanced Colour Passing (ACP) algorithm is the state of the art. The ACP algorithm, however, requires underlying distributions, encoded as potential-based factorisations, to exactly match to identify and exploit indistinguishabilities. Hence, ACP is unsuitable for practical applications where potentials learned from data inevitably deviate even if associated objects are indistinguishable. To mitigate this problem, we introduce the $\varepsilon$-Advanced Colour Passing ($\varepsilon$-ACP) algorithm, which allows for a deviation of potentials depending on a hyperparameter $\varepsilon$. $\varepsilon$-ACP efficiently uncovers and exploits indistinguishabilities that are not exact. We prove that the approximation error induced by $\varepsilon$-ACP is strictly bounded and our experiments show that the approximation error is close to zero in practice.

LGJun 9, 2025

Denoising the Future: Top-p Distributions for Moving Through Time

Florian Andreas Marwitz, Ralf Möller, Magnus Bender et al.

Inference in dynamic probabilistic models is a complex task involving expensive operations. In particular, for Hidden Markov Models, the whole state space has to be enumerated for advancing in time. Even states with negligible probabilities are considered, resulting in computational inefficiency and increased noise due to the propagation of unlikely probability mass. We propose to denoise the future and speed up inference by using only the top-p states, i.e., the most probable states with accumulated probability p. We show that the error introduced by using only the top-p states is bound by p and the so-called minimal mixing rate of the underlying model. Moreover, in our empirical evaluation, we show that we can expect speedups of at least an order of magnitude, while the error in terms of total variation distance is below 0.09.

AIMay 28, 2025

Lifted Forward Planning in Relational Factored Markov Decision Processes with Concurrent Actions

Florian Andreas Marwitz, Tanya Braun, Ralf Möller et al.

Decision making is a central problem in AI that can be formalized using a Markov Decision Process. A problem is that, with increasing numbers of (indistinguishable) objects, the state space grows exponentially. To compute policies, the state space has to be enumerated. Even more possibilities have to be enumerated if the size of the action space depends on the size of the state space, especially if we allow concurrent actions. To tackle the exponential blow-up in the action and state space, we present a first-order representation to store the spaces in polynomial instead of exponential size in the number of objects and introduce Foreplan, a relational forward planner, which uses this representation to efficiently compute policies for numerous indistinguishable objects and actions. Additionally, we introduce an even faster approximate version of Foreplan. Moreover, Foreplan identifies how many objects an agent should act on to achieve a certain task given restrictions. Further, we provide a theoretical analysis and an empirical evaluation of Foreplan, demonstrating a speedup of at least four orders of magnitude.

AIJun 3, 2024

Lifting Factor Graphs with Some Unknown Factors

Malte Luttermann, Ralf Möller, Marcel Gehrke

AIOct 18, 2021

On the Completeness and Complexity of the Lifted Dynamic Junction Tree Algorithm

Marcel Gehrke

For static lifted inference algorithms, completeness, i.e., domain liftability, is extensively studied. However, so far no domain liftability results for temporal lifted inference algorithms exist. In this paper, we close this gap. More precisely, we contribute the first completeness and complexity analysis for a temporal lifted algorithm, the socalled lifted dynamic junction tree algorithm (LDJT), which is the only exact lifted temporal inference algorithm out there. To handle temporal aspects efficiently, LDJT uses conditional independences to proceed in time, leading to restrictions w.r.t. elimination orders. We show that these restrictions influence the domain liftability results and show that one particular case while proceeding in time, has to be excluded from FO12 . Additionally, for the complexity of LDJT, we prove that the lifted width is in even more cases smaller than the corresponding treewidth in comparison to static inference.

AINov 16, 2019

Taming Reasoning in Temporal Probabilistic Relational Models

Marcel Gehrke, Ralf Möller, Tanya Braun

Evidence often grounds temporal probabilistic relational models over time, which makes reasoning infeasible. To counteract groundings over time and to keep reasoning polynomial by restoring a lifted representation, we present temporal approximate merging (TAMe), which incorporates (i) clustering for grouping submodels as well as (ii) statistical significance checks to test the fitness of the clustering outcome. In exchange for faster runtimes, TAMe introduces a bounded error that becomes negligible over time. Empirical results show that TAMe significantly improves the runtime performance of inference, while keeping errors small.

AIJul 2, 2018

Answering Hindsight Queries with Lifted Dynamic Junction Trees

Marcel Gehrke, Tanya Braun, Ralf Möller

The lifted dynamic junction tree algorithm (LDJT) efficiently answers filtering and prediction queries for probabilistic relational temporal models by building and then reusing a first-order cluster representation of a knowledge base for multiple queries and time steps. We extend LDJT to (i) solve the smoothing inference problem to answer hindsight queries by introducing an efficient backward pass and (ii) discuss different options to instantiate a first-order cluster representation during a backward pass. Further, our relational forward backward algorithm makes hindsight queries to the very beginning feasible. LDJT answers multiple temporal queries faster than the static lifted junction tree algorithm on an unrolled model, which performs smoothing during message passing.

AIJul 2, 2018

Preventing Unnecessary Groundings in the Lifted Dynamic Junction Tree Algorithm

Marcel Gehrke, Tanya Braun, Ralf Möller

The lifted dynamic junction tree algorithm (LDJT) efficiently answers filtering and prediction queries for probabilistic relational temporal models by building and then reusing a first-order cluster representation of a knowledge base for multiple queries and time steps. Unfortunately, a non-ideal elimination order can lead to groundings even though a lifted run is possible for a model. We extend LDJT (i) to identify unnecessary groundings while proceeding in time and (ii) to prevent groundings by delaying eliminations through changes in a temporal first-order cluster representation. The extended version of LDJT answers multiple temporal queries orders of magnitude faster than the original version.