Michael Foster

LG
h-index: 45
4 papers
691 citations
Novelty: 36%
AI Score: 46

4 Papers

FL · Apr 23
Active Inference of Extended Finite State Machine Models with Registers and Guards

Roland Groz, German Eduardo Vega Baez, Adenilso Simao et al.

Extended finite state machines (EFSMs) model stateful systems with internal data variables and have numerous applications in software engineering. A major advantage of this type of model lies in its ability to model both the data flow and the data-dependent control behaviour. In the absence of such models, it is desirable to reverse-engineer them by observing the system's behaviour. However, existing approaches generally require the ability to reset the system during inference, or can only handle situations where the control flow depends exclusively on the input parameters, and not on the values of the stored data. In this work, we present a black-box active learning algorithm that infers EFSMs with guards and registers, and which significantly relaxes the assumptions that have to be made about the system in comparison to previous attempts.
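
To make the terms "register" and "guard" concrete, here is a minimal Python sketch of an EFSM. It is not the inference algorithm from the paper, only an illustration of the kind of model being inferred. The class names (`Transition`, `EFSM`), the vending-machine example, and the price of 100 are all hypothetical. Note that the `vend` guards read the stored `credit` register rather than the input parameter, which is the data-dependent control flow the abstract refers to.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Registers = Dict[str, int]


@dataclass
class Transition:
    source: str
    target: str
    label: str                                      # input symbol, e.g. "coin"
    guard: Callable[[int, Registers], bool]         # reads input parameter and registers
    update: Callable[[int, Registers], Registers]   # returns the new register values
    output: Callable[[int, Registers], int]         # computed from the updated registers


@dataclass
class EFSM:
    state: str
    registers: Registers
    transitions: List[Transition] = field(default_factory=list)

    def step(self, label: str, param: int) -> int:
        """Fire the first enabled transition for this input and return its output."""
        for t in self.transitions:
            if (t.source == self.state and t.label == label
                    and t.guard(param, self.registers)):
                self.registers = t.update(param, self.registers)
                self.state = t.target
                return t.output(param, self.registers)
        raise ValueError(f"no enabled transition for {label}({param}) in {self.state}")


def make_vending_machine() -> EFSM:
    """Hypothetical machine: coin(n) adds credit; vend succeeds once credit >= 100."""
    return EFSM(
        state="idle",
        registers={"credit": 0},
        transitions=[
            # coin(n): always enabled, stores n in the register, outputs total credit
            Transition("idle", "idle", "coin",
                       lambda p, r: True,
                       lambda p, r: {**r, "credit": r["credit"] + p},
                       lambda p, r: r["credit"]),
            # vend with insufficient credit: guard depends on stored data only
            Transition("idle", "idle", "vend",
                       lambda p, r: r["credit"] < 100,
                       lambda p, r: r,
                       lambda p, r: 0),
            # vend with sufficient credit: deduct the price and move to "served"
            Transition("idle", "served", "vend",
                       lambda p, r: r["credit"] >= 100,
                       lambda p, r: {**r, "credit": r["credit"] - 100},
                       lambda p, r: 1),
        ],
    )
```

In this sketch, the same input `vend` produces different outputs and target states depending on earlier `coin` inputs, so an observer who only looks at the current input parameters cannot predict the behaviour without tracking the register.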

CL · Mar 3
A Browser-based Open Source Assistant for Multimodal Content Verification

Rosanna Milner, Michael Foster, Olesya Razuvayevskaya et al.

Disinformation and false content produced by generative AI pose a significant challenge for journalists and fact-checkers who must rapidly verify digital media information. While there is an abundance of NLP models for detecting credibility signals such as persuasion techniques, subjectivity, or machine-generated text, such methods often remain inaccessible to non-expert users and are not integrated into their daily workflows as a unified framework. This paper demonstrates the VERIFICATION ASSISTANT, a browser-based tool designed to bridge this gap. The VERIFICATION ASSISTANT, a core component of the widely adopted VERIFICATION PLUGIN (140,000+ users), allows users to submit URLs or media files to a unified interface. It automatically extracts content and routes it to a suite of backend NLP classifiers, delivering actionable credibility signals, estimating AI-generated content, and providing other verification guidance in a clear, easy-to-digest format. This paper showcases the tool architecture, its integration of multiple NLP services, and its real-world application to detecting disinformation.

LG · Jan 24, 2025
Humanity's Last Exam

Long Phan, Alice Gatti, Ziwen Han et al. · amazon-science, apple-ml

Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 2,500 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.
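
The abstract reports both accuracy and calibration without saying how calibration is measured. As a rough illustration only, the Python sketch below computes accuracy and a binned RMS calibration error from a model's stated confidences. The choice of metric, the function names, and the ten-bin default are assumptions for this example and are not taken from the HLE abstract.

```python
from typing import List, Sequence, Tuple


def accuracy(correct: Sequence[int]) -> float:
    """Fraction of questions answered correctly (entries are 0 or 1)."""
    return sum(correct) / len(correct)


def rms_calibration_error(confidences: Sequence[float],
                          correct: Sequence[int],
                          n_bins: int = 10) -> float:
    """Binned RMS gap between stated confidence and observed accuracy.

    Assumed metric for illustration: group answers into equal-width
    confidence bins, compare mean confidence to accuracy in each bin,
    and take the size-weighted root mean square of the gaps.
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    weighted_sq_gap = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        bin_acc = sum(ok for _, ok in members) / len(members)
        weighted_sq_gap += (len(members) / total) * (mean_conf - bin_acc) ** 2
    return weighted_sq_gap ** 0.5
```

Under this metric, a model that answers with 90% confidence but is right only a quarter of the time scores poorly on both measures, which is the pattern of low accuracy paired with overconfidence that the abstract describes.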

LG · Aug 18, 2017
Deep Convolutional Neural Networks for Raman Spectrum Recognition: A Unified Solution

Jinchao Liu, Margarita Osadchy, Lorna Ashton et al.

Machine learning methods have found many applications in Raman spectroscopy, especially for the identification of chemical species. However, almost all of these methods require non-trivial preprocessing such as baseline correction and/or PCA as an essential step. Here we describe our unified solution for the identification of chemical species in which a convolutional neural network is trained to automatically identify substances according to their Raman spectrum without the need for ad hoc preprocessing steps. We evaluated our approach using the RRUFF spectral database, comprising mineral sample data. Superior classification performance is demonstrated compared with other frequently used machine learning algorithms including the popular support vector machine.