Thomas Zeume

4papers

1citation

Novelty54%

AI Score46

Ranked #61,322 of 201,326 authors (top 30%)#12 in FL (top 17%)

4 Papers

LOApr 24

Dynamic Planar Graph Isomorphism is in DynFO

Samir Datta, Asif Khan, Felix Tschirbs et al.

Consider two planar graphs which are subject to edge insertions and deletions. We show that whether the two graphs are isomorphic can be maintained with first-order logic formulas and auxiliary data of polynomial size. This places the dynamic planar graph isomorphism problem into the dynamic descriptive complexity class DynFO. As a consequence, there is a dynamic constant-time parallel algorithm with polynomial-size auxiliary data which maintains whether two dynamic planar graphs are isomorphic.

FLApr 8

Detecting and Explaining (In-)equivalence of Context-Free Grammars

Marko Schmellenkamp, Thomas Zeume, Sven Argo et al.

We propose a scalable framework for deciding, proving, and explaining (in-)equivalence of context-free grammars. We present an implementation of the framework and evaluate it on large data sets collected within educational support systems. Even though the equivalence problem for context-free languages is undecidable in general, the framework is able to handle a large portion of these datasets. It introduces and combines techniques from several areas, such as an abstract grammar transformation language to identify equivalent grammars as well as sufficiently similar inequivalent grammars, theory-based comparison algorithms for a large class of context-free languages, and a graph-theory-inspired grammar canonization that allows to efficiently identify isomorphic grammars.

FLFeb 23

NILE: Formalizing Natural-Language Descriptions of Formal Languages

Tristan Kneisel, Marko Schmellenkamp, Fabian Vehlken et al.

This paper explores how natural-language descriptions of formal languages can be compared to their formal representations and how semantic differences can be explained. This is motivated from educational scenarios where learners describe a formal language (presented, e.g., by a finite state automaton, regular expression, pushdown automaton, context-free grammar or in set notation) in natural language, and an educational support system has to (1) judge whether the natural-language description accurately describes the formal language, and to (2) provide explanations why descriptions are not accurate. To address this question, we introduce a representation language for formal languages, Nile, which is designed so that Nile expressions can mirror the syntactic structure of natural-language descriptions of formal languages. Nile is sufficiently expressive to cover a broad variety of formal languages, including all regular languages and fragments of context-free languages typically used in educational contexts. Generating Nile expressions that are syntactically close to natural-language descriptions then allows to provide explanations for inaccuracies in the descriptions algorithmically. In experiments on an educational data set, we show that LLMs can translate natural-language descriptions into equivalent, syntactically close Nile expressions with high accuracy - allowing to algorithmically provide explanations for incorrect natural-language descriptions. Our experiments also show that while natural-language descriptions can also be translated into regular expressions (but not context-free grammars), the expressions are often not syntactically close and thus not suitable for providing explanations.

CYApr 14

Cross-Course Generalizability of SRL-Aligned Predictive Models Using Digital Learning Traces

Jakob Schwerter, Loreen Sabel, Judith Bose et al.

STEM dropout rates remain high at universities, particularly in computer science programs with theory-intensive courses. Digital learning environments now capture rich behavioral data that could help identify struggling students early, yet the generalizability of data-driven prediction models across courses and institutions remains uncertain. Guided by self-regulated learning (SRL) theory, this study analyzed multimodal digital-trace data from three undergraduate theoretical computer science courses (N1 = 137, N2 = 104, N3 = 148) at two universities. Weekly SRL-aligned digital-trace indicators were modeled using Elastic Net, Random Forest, and XGBoost to evaluate predictive performance over time and across settings, and model calibration both within and across courses. Early prediction of at-risk students was feasible, with SRL-related behaviors such as time management, effort regulation, and sustained engagement emerging as key predictors. While Random Forest achieved the highest in-sample accuracy, Elastic Net generalized more robustly across contexts. Out-of-sample accuracy and calibration declined between institutions with different base rates, underscoring the contextual nature of predictive analytics in higher education. These findings suggest that digital learning traces enable early identification of at-risk students within courses, but generalizing predictive models beyond their original context requires caution, particularly if the at-risk rates differ between contexts.