Philipp Schneider

CL
h-index3
3papers
7citations
Novelty60%
AI Score38

3 Papers

CLFeb 4, 2025
NER4all or Context is All You Need: Using LLMs for low-effort, high-performance NER on historical texts. A humanities informed approach

Torsten Hiltmann, Martin Dröge, Nicole Dresselhaus et al.

Named entity recognition (NER) is a core task for historical research in automatically establishing all references to people, places, events and the like. Yet, do to the high linguistic and genre diversity of sources, only limited canonisation of spellings, the level of required historical domain knowledge, and the scarcity of annotated training data, established approaches to natural language processing (NLP) have been both extremely expensive and yielded only unsatisfactory results in terms of recall and precision. Our paper introduces a new approach. We demonstrate how readily-available, state-of-the-art LLMs significantly outperform two leading NLP frameworks, spaCy and flair, for NER in historical documents by seven to twentytwo percent higher F1-Scores. Our ablation study shows how providing historical context to the task and a bit of persona modelling that turns focus away from a purely linguistic approach are core to a successful prompting strategy. We also demonstrate that, contrary to our expectations, providing increasing numbers of examples in few-shot approaches does not improve recall or precision below a threshold of 16-shot. In consequence, our approach democratises access to NER for all historians by removing the barrier of scripting languages and computational skills required for established NLP tools and instead leveraging natural language prompts and consumer-grade tools and frontends.

MLAug 26, 2025
Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits

Mengmeng Li, Philipp Schneider, Jelisaveta Aleksić et al.

We introduce the first best-of-both-worlds algorithm for contextual combinatorial semi-bandits that simultaneously guarantees $\widetilde{\mathcal{O}}(\sqrt{T})$ regret in the adversarial regime and $\widetilde{\mathcal{O}}(\ln T)$ regret in the corrupted stochastic regime. Our approach builds on the Follow-the-Regularized-Leader (FTRL) framework equipped with a Shannon entropy regularizer, yielding a flexible method that admits efficient implementations. Beyond regret bounds, we tackle the practical bottleneck in FTRL (or, equivalently, Online Stochastic Mirror Descent) arising from the high-dimensional projection step encountered in each round of interaction. By leveraging the Karush-Kuhn-Tucker conditions, we transform the $K$-dimensional convex projection problem into a single-variable root-finding problem, dramatically accelerating each round. Empirical evaluations demonstrate that this combined strategy not only attains the attractive regret bounds of best-of-both-worlds algorithms but also delivers substantial per-round speed-ups, making it well-suited for large-scale, real-time applications.

CVAug 5, 2025
Advancing Precision in Multi-Point Cloud Fusion Environments

Ulugbek Alibekov, Vanessa Staderini, Philipp Schneider et al.

This research focuses on visual industrial inspection by evaluating point clouds and multi-point cloud matching methods. We also introduce a synthetic dataset for quantitative evaluation of registration method and various distance metrics for point cloud comparison. Additionally, we present a novel CloudCompare plugin for merging multiple point clouds and visualizing surface defects, enhancing the accuracy and efficiency of automated inspection systems.