Michael Hoffmann

CL
h-index: 3
5 papers
14 citations
Novelty: 36%
AI Score: 40

5 Papers

CO · Mar 30
A Gray code for arborescences of tournaments

Marthe Bonamy, Michael Hoffmann, Clément Legrand-Duchesne et al.

We consider the following question of Knuth: given a directed graph $G$ and a root $r$, can the arborescences of $G$ rooted at $r$ be listed such that any two consecutive arborescences differ by only one arc? Such an ordering is called a pivot Gray code and can be formulated as a Hamiltonian path in the reconfiguration graph of the arborescences of $G$ under arc flips, also called the flip graph of $G$. We give a positive answer for tournaments and explore several conditions showing that the flip graph of a directed graph may contain no Hamiltonian cycles.
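
The flip-graph formulation can be illustrated on a tiny instance. The following is a minimal brute-force sketch, not the construction from the paper: it enumerates all arborescences of a small tournament rooted at a given vertex and searches for a Hamiltonian path in the flip graph by backtracking. The function names (`arborescences`, `differ_by_one_arc`, `pivot_gray_code`) and the choice of the transitive tournament on 4 vertices are assumptions made for illustration only.

```python
from itertools import product


def _reaches_root(v, parent, root):
    """Follow parent pointers from v; return False if a cycle is hit first."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = parent[v]
    return True


def arborescences(arcs, n, root):
    """Yield every arborescence rooted at `root` as a frozenset of arcs (u, v).

    Each non-root vertex picks exactly one incoming arc; a choice is kept
    only if every vertex reaches the root through its parent pointers.
    """
    others = [v for v in range(n) if v != root]
    in_nbrs = {v: [u for (u, w) in arcs if w == v] for v in others}
    for choice in product(*(in_nbrs[v] for v in others)):
        parent = dict(zip(others, choice))
        if all(_reaches_root(v, parent, root) for v in others):
            yield frozenset((parent[v], v) for v in others)


def differ_by_one_arc(a, b):
    """Two arborescences on the same vertex set have equally many arcs,
    so one arc missing from b means exactly one arc was exchanged."""
    return len(a - b) == 1


def pivot_gray_code(trees):
    """Backtracking search for a Hamiltonian path in the flip graph.

    Returns a list of arborescences in Gray-code order, or None if no
    such ordering exists. Exponential time; only for tiny inputs.
    """
    trees = list(trees)

    def extend(path, used):
        if len(path) == len(trees):
            return list(path)
        for i, t in enumerate(trees):
            if i in used:
                continue
            if path and not differ_by_one_arc(path[-1], t):
                continue
            used.add(i)
            path.append(t)
            result = extend(path, used)
            if result is not None:
                return result
            path.pop()
            used.remove(i)
        return None

    return extend([], set())


# Illustrative input (an assumption): the transitive tournament on 4 vertices,
# with an arc i -> j whenever i < j, rooted at vertex 0.
n = 4
arcs = [(i, j) for i in range(n) for j in range(i + 1, n)]
trees = list(arborescences(arcs, n, root=0))
order = pivot_gray_code(trees)
```

Here `order` lists the arborescences so that consecutive entries differ by a single arc. The exhaustive search is practical only for very small graphs, since the number of arborescences grows quickly with the number of vertices.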

CY · Jul 18, 2024
Report on the Conference on Ethical and Responsible Design in the National AI Institutes: A Summary of Challenges

Sherri Lynn Conklin, Sue Bae, Gaurav Sett et al.

In May 2023, the Georgia Tech Ethics, Technology, and Human Interaction Center organized the Conference on Ethical and Responsible Design in the National AI Institutes. Representatives from the National AI Research Institutes that had been established as of January 2023 were invited to attend; researchers representing 14 Institutes attended and participated. The conference focused on three questions: What are the main challenges that the National AI Institutes are facing with regard to the responsible design of AI systems? What are promising lines of inquiry to address these challenges? What are possible points of collaboration? Over the course of the conference, a revised version of the first question became a focal point: What are the challenges that the Institutes face in identifying ethical and responsible design practices and in implementing them in the AI development process? This document summarizes the challenges that representatives from the Institutes in attendance highlighted.

HC · Nov 10, 2025
Designing and Evaluating Malinowski's Lens: An AI-Native Educational Game for Ethnographic Learning

Michael Hoffmann, Jophin John, Jan Fillies et al.

This study introduces 'Malinowski's Lens', the first AI-native educational game for anthropology that transforms Bronislaw Malinowski's 'Argonauts of the Western Pacific' (1922) into an interactive learning experience. The system combines Retrieval-Augmented Generation with DALL-E 3 text-to-image generation, creating consistent VGA-style visuals as players embody Malinowski during his Trobriand Islands fieldwork (1915-1918). To address ethical concerns, indigenous peoples appear as silhouettes while Malinowski is detailed, prompting reflection on anthropological representation. Two validation studies confirmed effectiveness: Study 1 with 10 non-specialists showed strong learning outcomes (average quiz score 7.5/10) and excellent usability (SUS: 83/100). Study 2 with 4 expert anthropologists confirmed pedagogical value, with one senior researcher discovering "new aspects" of Malinowski's work through gameplay. The findings demonstrate that AI-driven educational games can effectively convey complex anthropological concepts while sparking disciplinary curiosity. This study advances AI-native educational game design and provides a replicable model for transforming academic texts into engaging interactive experiences.

CL · Sep 6, 2025
Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian

Michael Hoffmann, Jophin John, Stefan Schweter et al.

We present Llama-GENBA-10B, a trilingual foundation model addressing English-centric bias in large language models. Built on Llama 3.1-8B and scaled to 10B parameters, Llama-GENBA-10B is continuously pretrained on 164B tokens (82B English, 82B German, and 80M Bavarian), balancing resources while preventing English dominance. Targeted at the German NLP community, the model also promotes Bavarian as a low-resource language. Development tackled four challenges: (1) curating a multilingual corpus despite Bavarian scarcity, (2) creating a unified tokenizer for English, German, and Bavarian, (3) optimizing architecture and language-ratio hyperparameters for cross-lingual transfer, and (4) establishing the first standardized trilingual evaluation suite by translating German benchmarks into Bavarian. Evaluations show that Llama-GENBA-10B achieves strong cross-lingual performance, with the fine-tuned variant surpassing Apertus-8B-2509 and gemma-2-9b in Bavarian and establishing itself as the best model in its class for this language, while also outperforming EuroLLM in English and matching its results in German. Training on the Cerebras CS-2 demonstrated efficient large-scale multilingual pretraining with documented energy use, offering a blueprint for inclusive foundation models that integrate low-resource languages.
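
As a quick check of the token mix stated in the abstract, the three components can be summed and the Bavarian share computed. This uses only the figures quoted above (82B, 82B, 80M); the rounding is ours.

```latex
\[
82\,\mathrm{B} + 82\,\mathrm{B} + 0.08\,\mathrm{B} = 164.08\,\mathrm{B} \approx 164\,\mathrm{B},
\qquad
\frac{0.08}{164.08} \approx 0.049\%.
\]
```

English and German each contribute just under half of the corpus, while Bavarian makes up roughly one twentieth of one percent of the tokens.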

IR · Oct 24, 2017
Using Multi-Label Classification for Improved Question Answering

Ricardo Usbeck, Michael Hoffmann, Michael Röder et al.

A plethora of diverse approaches for question answering over RDF data have been developed in recent years. While the accuracy of these systems has increased significantly over time, most systems still focus on particular types of questions or particular challenges in question answering. What is a curse for single systems is a blessing for the combination of these systems. We show in this paper how machine learning techniques can be applied to create a more accurate question answering metasystem by reusing existing systems. In particular, we develop a multi-label classification-based metasystem for question answering over 6 existing systems using an innovative set of 14 question features. The metasystem outperforms the best single system by 14% F-measure on the recent QALD-6 benchmark. Furthermore, we analyzed the influence and correlation of the underlying features on the metasystem quality.