LG Jun 4
Maximising the Set-Piece Return: Optimising Football Corner Tactics with Graph Reinforcement Learning
Sean Groom, Michael Groom, Francisco Belo et al.
Machine learning is increasingly employed for the evaluation of football tactics. However, existing approaches focus on characterising historical actions or analyst-specified counterfactual scenarios. In this work, we seek to go beyond the imitation of historically observed patterns towards discovering new generalisable player configurations and strategies. To tackle this, we focus on optimising corner kick routines, and formulate a decision-making problem in which a central policy makes adjustments to attacking player positions and velocities to maximise first contact shot probability. Unlike classic optimisation that solves for isolated setups, we contribute a reinforcement learning architecture operating on graph-structured data that yields a general policy for adjusting arbitrary starting player positions. Evaluated on over 3,000 Premier League corners, our approach strongly outperforms baseline optimisation techniques under matched inference budgets. Our results suggest that graph reinforcement learning can shift set-piece analysis from historical evaluation and imitation towards reward-driven tactical discovery.
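The decision problem described in this abstract can be written compactly. This is only a sketch: the notation is assumed for illustration and does not come from the paper. Here $s_t$ stands for the graph of player positions and velocities after $t$ adjustments, $a_t$ for the adjustment to the attacking players chosen by a policy $\pi_\theta$, $f$ for a hypothetical update rule that applies the adjustment, $\mathcal{D}$ for the set of observed starting configurations, and $\hat{p}_{\mathrm{shot}}$ for a model of first-contact shot probability.

```latex
\max_{\theta} \; \mathbb{E}_{s_0 \sim \mathcal{D}}
  \left[ \hat{p}_{\mathrm{shot}}(s_T) \right]
\quad \text{subject to} \quad
a_t \sim \pi_\theta(\,\cdot \mid s_t), \qquad
s_{t+1} = f(s_t, a_t), \qquad t = 0, \dots, T-1 .
```

Taking the expectation over starting configurations, rather than solving for one fixed $s_0$, is what separates a general policy from the isolated per-setup optimisation the abstract contrasts it with.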
LG Jan 2
A Machine Learning Framework for Off Ball Defensive Role and Performance Evaluation in Football
Sean Groom, Shuo Wang, Francisco Belo et al.
Evaluating off-ball defensive performance in football is challenging, as traditional metrics do not capture the nuanced coordinated movements that limit opponent action selection and success probabilities. Although widely used possession value models excel at appraising on-ball actions, their application to defense remains limited. Existing counterfactual methods, such as ghosting models, help extend these analyses but often rely on simulating "average" behavior that lacks tactical context. To address this, we introduce a covariate-dependent Hidden Markov Model (CDHMM) tailored to corner kicks, a highly structured aspect of football games. Our label-free model infers time-resolved man-marking and zonal assignments directly from player tracking data. We leverage these assignments to propose a novel framework for defensive credit attribution and a role-conditioned ghosting method for counterfactual analysis of off-ball defensive performance. We show how these contributions provide an interpretable evaluation of defensive contributions against context-aware baselines.
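To make the idea of inferring time-resolved assignments concrete, here is a minimal sketch of Viterbi decoding on a plain (not covariate-dependent) hidden Markov model. It is not the paper's CDHMM: the three roles, the Gaussian distance-based emission, the `stay` probability, and the toy distances are all assumptions chosen for illustration.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Return the most likely hidden-state path for a discrete HMM.

    log_init[k]     : log prior of state k
    log_trans[j][k] : log probability of moving from state j to state k
    log_emit[t][k]  : log likelihood of frame t's observation under state k
    """
    n_states = len(log_init)
    score = [log_init[k] + log_emit[0][k] for k in range(n_states)]
    back = []  # back[t-1][k] = best predecessor of state k at frame t
    for t in range(1, len(log_emit)):
        prev, score, ptr = score, [], []
        for k in range(n_states):
            best_j = max(range(n_states), key=lambda j: prev[j] + log_trans[j][k])
            ptr.append(best_j)
            score.append(prev[best_j] + log_trans[best_j][k] + log_emit[t][k])
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def log_gauss(distance, sigma=1.5):
    """Unnormalised Gaussian log-likelihood of a distance (assumed emission)."""
    return -0.5 * (distance / sigma) ** 2


# Hypothetical roles for one defender: mark attacker A, mark attacker B, or hold a zone.
roles = ["mark_A", "mark_B", "zonal"]

# Toy per-frame distances (metres) from the defender to A, to B, and to a zone anchor.
distances = [
    (1.0, 6.0, 4.0),
    (1.2, 5.5, 4.2),
    (4.0, 2.0, 4.5),
    (5.0, 1.0, 5.0),
    (5.5, 0.8, 5.2),
]
log_emit = [[log_gauss(d) for d in row] for row in distances]

stay = 0.9  # assumed probability of keeping the same role between frames
log_trans = [
    [math.log(stay if j == k else (1 - stay) / 2) for k in range(3)]
    for j in range(3)
]
log_init = [math.log(1 / 3)] * 3

path = viterbi(log_init, log_trans, log_emit)
labels = [roles[s] for s in path]
print(labels)
```

On this toy input the decoded path switches from `mark_A` to `mark_B` at the third frame. A covariate-dependent model would additionally let the transition or emission parameters vary with context, which this sketch does not attempt.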
IR Nov 8, 2025
Ontology Learning and Knowledge Graph Construction: A Comparison of Approaches and Their Impact on RAG Performance
Tiago da Cruz, Bernardo Tavares, Francisco Belo
Retrieval-Augmented Generation (RAG) systems combine Large Language Models (LLMs) with external knowledge, and their performance depends heavily on how that knowledge is represented. This study investigates how different Knowledge Graph (KG) construction strategies influence RAG performance. We compare a variety of approaches: standard vector-based RAG, GraphRAG, and retrieval over KGs built from ontologies derived either from relational databases or textual corpora. Results show that ontology-guided KGs incorporating chunk information perform competitively with state-of-the-art frameworks, substantially outperforming vector retrieval baselines. Moreover, the findings reveal that ontology-guided KGs built from relational databases perform comparably to those built with ontologies extracted from text, while offering two advantages: they require only a one-time ontology learning process, substantially reducing LLM usage costs, and they avoid the complexity of ontology merging inherent to text-based approaches.
CL Sep 26, 2025
CRACQ: A Multi-Dimensional Approach To Automated Document Assessment
Ishak Soltani, Francisco Belo, Bernardo Tavares
This paper presents CRACQ, a multi-dimensional evaluation framework tailored to evaluate documents across five specific traits: Coherence, Rigor, Appropriateness, Completeness, and Quality. Building on insights from trait-based Automated Essay Scoring (AES), CRACQ expands its focus beyond essays to encompass diverse forms of machine-generated text, providing a rubric-driven and interpretable methodology for automated evaluation. Unlike single-score approaches, CRACQ integrates linguistic, semantic, and structural signals into a cumulative assessment, enabling both holistic and trait-level analysis. Trained on 500 synthetic grant proposals, CRACQ was benchmarked against an LLM-as-a-judge and further tested on both strong and weak real applications. Preliminary results indicate that CRACQ produces more stable and interpretable trait-level judgments than direct LLM evaluation, though challenges in reliability and domain scope remain.
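The combination of trait-level and holistic analysis can be illustrated with a small sketch. The abstract does not state how CRACQ combines its signals, so the weighted mean below, the function name `holistic_score`, and the example scores are all hypothetical; only the five trait names come from the abstract.

```python
# The five traits named in the abstract.
TRAITS = ("coherence", "rigor", "appropriateness", "completeness", "quality")


def holistic_score(trait_scores, weights=None):
    """Combine per-trait scores into one number via a weighted mean.

    This aggregation rule is an assumption for illustration, not CRACQ's
    actual method. Equal weights are used when none are given.
    """
    missing = [t for t in TRAITS if t not in trait_scores]
    if missing:
        raise ValueError(f"missing trait scores: {missing}")
    if weights is None:
        weights = {t: 1.0 for t in TRAITS}
    total_weight = sum(weights[t] for t in TRAITS)
    return sum(weights[t] * trait_scores[t] for t in TRAITS) / total_weight


# Made-up trait-level scores on a 1-5 scale for a single document.
example = {
    "coherence": 4.0,
    "rigor": 3.0,
    "appropriateness": 5.0,
    "completeness": 2.0,
    "quality": 4.0,
}
overall = holistic_score(example)
print(example, overall)
```

Keeping the per-trait dictionary alongside the aggregate is what allows a reader to see which trait drove a low overall score, in this example completeness.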