LGJan 10, 2023
schlably: A Python Framework for Deep Reinforcement Learning Based Scheduling ExperimentsConstantin Waubert de Puiseau, Jannik Peters, Christian Dörpelkus et al.
Research on deep reinforcement learning (DRL) based production scheduling (PS) has gained a lot of attention in recent years, primarily due to the high demand for optimizing scheduling problems in diverse industry settings. Numerous studies are carried out and published as stand-alone experiments that often vary only slightly with respect to problem setups and solution approaches. The programmatic core of these experiments is typically very similar. Despite this fact, no standardized and resilient framework for experimentation on PS problems with DRL algorithms could be established so far. In this paper, we introduce schlably, a Python-based framework that provides researchers a comprehensive toolset to facilitate the development of PS solution strategies based on DRL. schlably eliminates the redundant overhead work that the creation of a sturdy and flexible backbone requires and increases the comparability and reusability of conducted research work.
MASep 4, 2024
Emergent Language: A Survey and TaxonomyJannik Peters, Constantin Waubert de Puiseau, Hasan Tercan et al.
The field of emergent language represents a novel area of research within the domain of artificial intelligence, particularly within the context of multi-agent reinforcement learning. Although the concept of studying language emergence is not new, early approaches were primarily concerned with explaining human language formation, with little consideration given to its potential utility for artificial agents. In contrast, studies based on reinforcement learning aim to develop communicative capabilities in agents that are comparable to or even superior to human language. Thus, they extend beyond the learned statistical representations that are common in natural language processing research. This gives rise to a number of fundamental questions, from the prerequisites for language emergence to the criteria for measuring its success. This paper addresses these questions by providing a comprehensive review of 181 scientific publications on emergent language in artificial intelligence. Its objective is to serve as a reference for researchers interested in or proficient in the field. Consequently, the main contributions are the definition and overview of the prevailing terminology, the analysis of existing evaluation methods and metrics, and the description of the identified research gaps.
32.1GTMay 6
An Axiomatic Analysis of Proportionality Notions in Approval-Based Multiwinner VotingChris Dong, Jannik Peters
Even though proportional representation is a fundamental goal in multiwinner voting and a plethora of proportionality notions has been introduced, the normative justifications for choosing one notion over another remain poorly understood. We address this by introducing the axiomatic study of proportionality notions in the approval-based multiwinner voting setting. That is, we define axioms (or desirable properties) that ``good'' proportionality notions should possess. Using these axioms, we then provide axiomatic characterizations of two prominent recently introduced notions: PJR+ and EJR+ [Brill and Peters 2023]. Our characterization proceeds in two parts. Firstly, we provide a characterization of refinements of PJR+ and EJR+. That is, we define axioms such that any notion satisfying these axioms must imply PJR+ (or EJR+, respectively). In particular, the fundamental axiom distinguishing PJR+ and EJR+ from their predecessors PJR and EJR is the classical axiom of monotonicity. Secondly, we introduce our framework of witness-based proportionality notions, that is, proportionality notions that certify ``misrepresentation'' via a witness set of misrepresented voters. In this class, we provide characterizations of PJR+ and EJR+ as the strongest (assuming certain axioms). Thus, by putting both directions together we obtain exact characterizations of both notions. Among our results, it may be worth highlighting that any notion satisfying mild conditions (monotonicity, independence of losers, robustness to fully satisfied voters, and lower quota) refines PJR+. In this sense, PJR+ turns out to be the canonical minimal requirement that one may impose on proportionality.
LGOct 27, 2023
Proportional Fairness in Clustering: A Social Choice PerspectiveLeon Kellerhals, Jannik Peters
We study the proportional clustering problem of Chen et al. [ICML'19] and relate it to the area of multiwinner voting in computational social choice. We show that any clustering satisfying a weak proportionality notion of Brill and Peters [EC'23] simultaneously obtains the best known approximations to the proportional fairness notion of Chen et al. [ICML'19], but also to individual fairness [Jung et al., FORC'20] and the "core" [Li et al. ICML'21]. In fact, we show that any approximation to proportional fairness is also an approximation to individual fairness and vice versa. Finally, we also study stronger notions of proportional representation, in which deviations do not only happen to single, but multiple candidate centers, and show that stronger proportionality notions of Brill and Peters [EC'23] imply approximations to these stronger guarantees.
AISep 4, 2024
Decision Transformer for Enhancing Neural Local Search on the Job Shop Scheduling ProblemConstantin Waubert de Puiseau, Fabian Wolz, Merlin Montag et al.
The job shop scheduling problem (JSSP) and its solution algorithms have been of enduring interest in both academia and industry for decades. In recent years, machine learning (ML) is playing an increasingly important role in advancing existing and building new heuristic solutions for the JSSP, aiming to find better solutions in shorter computation times. In this paper we build on top of a state-of-the-art deep reinforcement learning (DRL) agent, called Neural Local Search (NLS), which can efficiently and effectively control a large local neighborhood search on the JSSP. In particular, we develop a method for training the decision transformer (DT) algorithm on search trajectories taken by a trained NLS agent to further improve upon the learned decision-making sequences. Our experiments show that the DT successfully learns local search strategies that are different and, in many cases, more effective than those of the NLS agent itself. In terms of the tradeoff between solution quality and acceptable computational time needed for the search, the DT is particularly superior in application scenarios where longer computational times are acceptable. In this case, it makes up for the longer inference times required per search step, which are caused by the larger neural network architecture, through better quality decisions per step. Thereby, the DT achieves state-of-the-art results for solving the JSSP with ML-enhanced search.
57.2GTApr 27
Explanation Systems for Approval-Based Multiwinner VotingNiclas Boehmer, Luca Kreisel, Jannik Peters
In approval-based multiwinner voting, voters express approval preferences over a set of candidates, and the goal is to return a winning committee. This model captures a broad range of subset selection problems under preferences. Prior work has focused on the study of binary proportionality axioms that certify whether a given committee is proportionally representative or not. We take a more fine-grained perspective and initiate the study of explanation systems that quantify how a committee represents the electorate, i.e., how much influence each voter exerts, how this influence is allocated across selected candidates, how each candidate is backed by the voters, and why certain candidates were not chosen. Building on the notion of priceability, we propose price systems as a framework for such explanations. A price system assigns each voter an individual budget, which they can spend on selected candidates they approve, and each candidate needs to be purchased at a unit price. Since many price systems can exist for a given outcome, selecting among them requires care. We initiate an axiomatic study of price systems and propose several axioms capturing structural coherence, faithful attribution of influence, and alignment with proportionality. On the algorithmic side, we introduce a polynomial-time computable rule in which voters continuously gain and exercise influence and show that it satisfies all jointly satisfiable axioms. Experiments on synthetic and real-world instances indicate that our explanations correlate with established proportionality notions and can recover unequal influence when it is present.
GTMay 30, 2025
Online Fair Division with Additional InformationTzeh Yuan Neoh, Jannik Peters, Nicholas Teh
We study the problem of fairly allocating indivisible goods to agents in an online setting, where goods arrive sequentially and must be allocated irrevocably to agents. Focusing on the popular fairness notions of envy-freeness, proportionality, and maximin share fairness (and their approximate variants), we ask how the availability of information on future goods influences the existence and approximability of fair allocations. In the absence of any such information, we establish strong impossibility results, demonstrating the inherent difficulty of achieving even approximate fairness guarantees. In contrast, we demonstrate that knowledge of additional information -- such as aggregate of each agent's total valuations (equivalently, normalized valuations) or the multiset of future goods values (frequency predictions) -- would enable the design of fairer online algorithms. Given normalization information, we propose an algorithm that achieves stronger fairness guarantees than previously known results. Given frequency predictions, we introduce a meta-algorithm that leverages frequency predictions to match the best-known offline guarantees for a broad class of ''share-based'' fairness notions. Our complementary impossibility results in each setting underscore both the limitations imposed by uncertainty about future goods and the potential of leveraging structured information to achieve fairer outcomes in online fair division.
LGNov 24, 2025
The Core in Max-Loss Non-Centroid Clustering Can Be EmptyRobert Bredereck, Eva Deltl, Leon Kellerhals et al.
We study core stability in non-centroid clustering under the max-loss objective, where each agent's loss is the maximum distance to other members of their cluster. We prove that for all $k\geq 3$ there exist metric instances with $n\ge 9$ agents, with $n$ divisible by $k$, for which no clustering lies in the $α$-core for any $α<2^{\frac{1}{5}}\sim 1.148$. The bound is tight for our construction. Using a computer-aided proof, we also identify a two-dimensional Euclidean point set whose associated lower bound is slightly smaller than that of our general construction. This is, to our knowledge, the first impossibility result showing that the core can be empty in non-centroid clustering under the max-loss objective.
GTFeb 9, 2025
Verifying Proportionality in Temporal VotingEdith Elkind, Svetlana Obraztsova, Jannik Peters et al.
We study a model of temporal voting where there is a fixed time horizon, and at each round the voters report their preferences over the available candidates and a single candidate is selected. Prior work has adapted popular notions of justified representation as well as voting rules that provide strong representation guarantees from the multiwinner election setting to this model. In our work, we focus on the complexity of verifying whether a given outcome offers proportional representation. We show that in the temporal setting verification is strictly harder than in multiwinner voting, but identify natural special cases that enable efficient algorithms.
AIJun 11, 2024
Beyond Training: Optimizing Reinforcement Learning Based Job Shop Scheduling Through Adaptive Action SamplingConstantin Waubert de Puiseau, Christian Dörpelkus, Jannik Peters et al.
Learned construction heuristics for scheduling problems have become increasingly competitive with established solvers and heuristics in recent years. In particular, significant improvements have been observed in solution approaches using deep reinforcement learning (DRL). While much attention has been paid to the design of network architectures and training algorithms to achieve state-of-the-art results, little research has investigated the optimal use of trained DRL agents during inference. Our work is based on the hypothesis that, similar to search algorithms, the utilization of trained DRL agents should be dependent on the acceptable computational budget. We propose a simple yet effective parameterization, called $δ$-sampling that manipulates the trained action vector to bias agent behavior towards exploration or exploitation during solution construction. By following this approach, we can achieve a more comprehensive coverage of the search space while still generating an acceptable number of solutions. In addition, we propose an algorithm for obtaining the optimal parameterization for such a given number of solutions and any given trained agent. Experiments extending existing training protocols for job shop scheduling problems with our inference method validate our hypothesis and result in the expected improvements of the generated solutions.
LGOct 15, 2020
Maps for Learning Indexable ClassesJulian Berger, Maximilian Böther, Vanja Doskoč et al.
We study learning of indexed families from positive data where a learner can freely choose a hypothesis space (with uniformly decidable membership) comprising at least the languages to be learned. This abstracts a very universal learning task which can be found in many areas, for example learning of (subsets of) regular languages or learning of natural languages. We are interested in various restrictions on learning, such as consistency, conservativeness or set-drivenness, exemplifying various natural learning restrictions. Building on previous results from the literature, we provide several maps (depictions of all pairwise relations) of various groups of learning criteria, including a map for monotonicity restrictions and similar criteria and a map for restrictions on data presentation. Furthermore, we consider, for various learning criteria, whether learners can be assumed consistent.
LOOct 15, 2020
Learning Languages with Decidable HypothesesJulian Berger, Maximilian Böther, Vanja Doskoč et al.
In language learning in the limit, the most common type of hypothesis is to give an enumerator for a language. This so-called $W$-index allows for naming arbitrary computably enumerable languages, with the drawback that even the membership problem is undecidable. In this paper we use a different system which allows for naming arbitrary decidable languages, namely programs for characteristic functions (called $C$-indices). These indices have the drawback that it is now not decidable whether a given hypothesis is even a legal $C$-index. In this first analysis of learning with $C$-indices, we give a structured account of the learning power of various restrictions employing $C$-indices, also when compared with $W$-indices. We establish a hierarchy of learning power depending on whether $C$-indices are required (a) on all outputs; (b) only on outputs relevant for the class to be learned and (c) only in the limit as final, correct hypotheses. Furthermore, all these settings are weaker than learning with $W$-indices (even when restricted to classes of computable languages). We analyze all these questions also in relation to the mode of data presentation. Finally, we also ask about the relation of semantic versus syntactic convergence and derive the map of pairwise relations for these two kinds of convergence coupled with various forms of data presentation.