Kostas Stathis

h-index 19

12 papers

106 citations

Novelty 37%

AI Score 35

Ranked #103,821 of 194,257 authors (top 53%); #6,350 in AI (top 51%)

12 Papers

3.6 CL Jun 29, 2023

A negation detection assessment of GPTs: analysis with the xNot360 dataset

Ha Thanh Nguyen, Randy Goebel, Francesca Toni et al.

Negation is a fundamental aspect of natural language, playing a critical role in communication and comprehension. Our study assesses the negation detection performance of Generative Pre-trained Transformer (GPT) models, specifically GPT-2, GPT-3, GPT-3.5, and GPT-4. We focus on the identification of negation in natural language using a zero-shot prediction approach applied to our custom xNot360 dataset. Our approach examines sentence pairs labeled to indicate whether the second sentence negates the first. Our findings expose a considerable performance disparity among the GPT models, with GPT-4 surpassing its counterparts and GPT-3.5 displaying a marked performance reduction. The overall proficiency of the GPT models in negation detection remains relatively modest, indicating that this task pushes the boundaries of their natural language understanding capabilities. We not only highlight the constraints of GPT models in handling negation but also emphasize the importance of logical reliability in high-stakes domains such as healthcare, science, and law.

2.1 CL Sep 11, 2023

Black-Box Analysis: GPTs Across Time in Legal Textual Entailment Task

Ha-Thanh Nguyen, Randy Goebel, Francesca Toni et al.

The evolution of Generative Pre-trained Transformer (GPT) models has led to significant advancements in various natural language processing applications, particularly in legal textual entailment. We present an analysis of the performance of GPT-3.5 (ChatGPT) and GPT-4 on the COLIEE Task 4 dataset, a prominent benchmark in this domain. The study encompasses data from Heisei 18 (2006) to Reiwa 3 (2021), exploring the models' abilities to discern entailment relationships within Japanese statute law across different periods. Our preliminary experimental results unveil intriguing insights into the models' strengths and weaknesses in handling legal textual entailment tasks, as well as the patterns observed in model performance. In the context of proprietary models with undisclosed architectures and weights, black-box analysis becomes crucial for evaluating their capabilities. We discuss the influence of training data distribution and the implications on the models' generalizability. This analysis serves as a foundation for future research, aiming to optimize GPT-based models and enable their successful adoption in legal information extraction and entailment applications.

7.4 SE Jun 21, 2022

World of Bugs: A Platform for Automated Bug Detection in 3D Video Games

Benedict Wilkins, Kostas Stathis

We present World of Bugs (WOB), an open platform that aims to support Automated Bug Detection (ABD) research in video games. We discuss some open problems in ABD and how they relate to the platform's design, arguing that learning-based solutions are required if further progress is to be made. The platform's key feature is a growing collection of common video game bugs that may be used for training and evaluating ABD approaches.

3.9 AI Nov 23, 2023

Towards Explainable Strategy Templates using NLP Transformers

Pallavi Bagga, Kostas Stathis

This paper bridges the gap between mathematical heuristic strategies learned from Deep Reinforcement Learning (DRL) in automated agent negotiation, and comprehensible, natural language explanations. Our aim is to make these strategies more accessible to non-experts. By leveraging traditional Natural Language Processing (NLP) techniques and Large Language Models (LLMs) equipped with Transformers, we outline how parts of DRL strategies composed of parts within strategy templates can be transformed into user-friendly, human-like English narratives. To achieve this, we present a top-level algorithm that involves parsing mathematical expressions of strategy templates, semantically interpreting variables and structures, generating rule-based primary explanations, and utilizing a Generative Pre-trained Transformer (GPT) model to refine and contextualize these explanations. Subsequent customization for varied audiences and meticulous validation processes in an example illustrate the applicability and potential of this approach.

3.3 AI Nov 12, 2025

Proceedings of the Second International Workshop on Next-Generation Language Models for Knowledge Representation and Reasoning (NeLaMKRR 2025)

Ha-Thanh Nguyen, Ken Satoh, Francesca Toni et al.

Reasoning is an essential component of human intelligence in that it plays a fundamental role in our ability to think critically, support responsible decisions, and solve challenging problems. Traditionally, AI has addressed reasoning in the context of logic-based representations of knowledge. However, the recent leap forward in natural language processing, with the emergence of language models based on transformers, is hinting at the possibility that these models exhibit reasoning abilities, particularly as they grow in size and are trained on more and more data. Despite ongoing discussions about what reasoning is in language models, it is still not easy to articulate to what extent these models are actually capable of reasoning. The goal of this workshop is to create a platform for researchers from different disciplines and/or AI perspectives to explore approaches and techniques with the aim to reconcile reasoning between language models using transformers and logic-based representations. The specific objectives include analysing the reasoning abilities of language models measured alongside KR methods, injecting KR-style reasoning abilities into language models (including by neuro-symbolic means), and formalising the kind of reasoning language models carry out. This exploration aims to uncover how language models can effectively integrate and leverage knowledge and reasoning with it, thus improving their application and utility in areas where precision and reliability are key requirements.

7.4 SE Feb 25, 2022 Code

Learning to Identify Perceptual Bugs in 3D Video Games

Benedict Wilkins, Kostas Stathis

Automated Bug Detection (ABD) in video games is composed of two distinct but complementary problems: automated game exploration and bug identification. Automated game exploration has received much recent attention, spurred on by developments in fields such as reinforcement learning. The complementary problem of identifying the bugs present in a player's experience has for the most part relied on the manual specification of rules. Although it is widely recognised that many bugs of interest cannot be identified with such methods, little progress has been made in this direction. In this work we show that it is possible to identify a range of perceptual bugs using learning-based methods by making use of only the rendered game screen as seen by the player. To support our work, we have developed World of Bugs (WOB), an open platform for testing ABD methods in 3D game environments.

3.3 MA Sep 17, 2020

Learnable Strategies for Bilateral Agent Negotiation over Multiple Issues

Pallavi Bagga, Nicola Paoletti, Kostas Stathis

We present a novel bilateral negotiation model that allows a self-interested agent to learn how to negotiate over multiple issues in the presence of user preference uncertainty. The model relies upon interpretable strategy templates representing the tactics the agent should employ during the negotiation and learns template parameters to maximize the average utility received over multiple negotiations, thus resulting in optimal bid acceptance and generation. Our model also uses deep reinforcement learning to evaluate threshold utility values, for those tactics that require them, thereby deriving optimal utilities for every environment state. To handle user preference uncertainty, the model relies on a stochastic search to find the user model that best agrees with a given partial preference profile. Multi-objective optimization and multi-criteria decision-making methods are applied at negotiation time to generate Pareto-optimal outcomes, thereby increasing the number of successful (win-win) negotiations. Rigorous experimental evaluations show that the agent employing our model outperforms the winning agents of the 10th Automated Negotiating Agents Competition (ANAC'19) in terms of individual as well as social-welfare utilities.

7.2 LG May 20, 2020 Code

A Metric Learning Approach to Anomaly Detection in Video Games

Benedict Wilkins, Chris Watkins, Kostas Stathis

With the aim of designing automated tools that assist in the video game quality assurance process, we frame the problem of identifying bugs in video games as an anomaly detection (AD) problem. We develop State-State Siamese Networks (S3N) as an efficient deep metric learning approach to AD in this context and explore how it may be used as part of an automated testing tool. Finally, we show by empirical evaluation on a series of Atari games, that S3N is able to learn a meaningful embedding, and consequently is able to identify various common types of video game bugs.

5.1 MA Jan 31, 2020

A Deep Reinforcement Learning Approach to Concurrent Bilateral Negotiation

Pallavi Bagga, Nicola Paoletti, Bedour Alrayes et al.

We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploration time required for learning during negotiation. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed. Our experimental evaluation shows that our deep reinforcement learning-based agents outperform two existing well-known negotiation strategies in one-to-many concurrent bilateral negotiations for a range of e-market settings.

4.7 LG Dec 17, 2015

Probabilistic Programming with Gaussian Process Memoization

Ulrich Schaechtle, Ben Zinberg, Alexey Radul et al.

Gaussian Processes (GPs) are widely used tools in statistics, machine learning, robotics, computer vision, and scientific computation. However, despite their popularity, they can be difficult to apply; all but the simplest classification or regression applications require specification and inference over complex covariance functions that do not admit simple analytical posteriors. This paper shows how to embed Gaussian processes in any higher-order probabilistic programming language, using an idiom based on memoization, and demonstrates its utility by implementing and extending classic and state-of-the-art GP applications. The interface to Gaussian processes, called gpmem, takes an arbitrary real-valued computational process as input and returns a statistical emulator that automatically improves as the original process is invoked and its input-output behavior is recorded. The flexibility of gpmem is illustrated via three applications: (i) robust GP regression with hierarchical hyper-parameter learning, (ii) discovering symbolic expressions from time-series data by fully Bayesian structure learning over kernels generated by a stochastic grammar, and (iii) a bandit formulation of Bayesian optimization with automatic inference and action selection. All applications share a single 50-line Python library and require fewer than 20 lines of probabilistic code each.

7.4 AI Jan 15, 2014

Computational Logic Foundations of KGP Agents

Antonis Kakas, Paolo Mancarella, Fariba Sadri et al.

This paper presents the computational logic foundations of a model of agency called the KGP (Knowledge, Goals and Plan) model. This model allows the specification of heterogeneous agents that can interact with each other, and can exhibit both proactive and reactive behaviour, allowing them to function in dynamic environments by adjusting their goals and plans when changes happen in such environments. KGP provides a highly modular agent architecture that integrates a collection of reasoning and physical capabilities, synthesised within transitions that update the agent's state in response to reasoning, sensing and acting. Transitions are orchestrated by cycle theories that specify the order in which transitions are executed while taking into account the dynamic context and agent preferences, as well as selection operators for providing inputs to transitions.

5.6 AI Dec 8, 2013

CLIC: A Framework for Distributed, On-Demand, Human-Machine Cognitive Systems

N. Mavridis, S. Konstantopoulos, I. Vetsikas et al.

Traditional Artificial Cognitive Systems (for example, intelligent robots) share a number of limitations. First, they are usually made up only of machine components; humans are only playing the role of user or supervisor. And yet, there are tasks in which the current state of the art of AI has much worse performance or is more expensive than humans: thus, it would be highly beneficial to have a systematic way of creating systems with both human and machine components, possibly with remote non-expert humans providing short-duration real-time services. Second, their components are often dedicated to only one system, and underutilized for a big part of their lifetime. Third, there is no inherent support for robust operation, and if a new better component becomes available, one cannot easily replace the old component. Fourth, they are viewed as a resource to be developed and owned, not as a utility. Thus, we are presenting CLIC: a framework for constructing cognitive systems that overcome the above limitations. The architecture of CLIC provides specific mechanisms for creating and operating cognitive systems that fulfill a set of desiderata: First, that are distributed yet situated, interacting with the physical world through sensing and actuation services, and that are also combining human as well as machine services. Second, that are made up of components that are time-shared and re-usable. Third, that provide increased robustness through self-repair. Fourth, that are constructed and reconstructed on the fly, with components that dynamically enter and exit the system during operation, on the basis of availability, pricing, and need. Importantly, fifth, the cognitive systems created and operated by CLIC do not need to be owned and can be provided on demand, as a utility; thus transforming human-machine situated intelligence to a service, and opening up many interesting opportunities.