Domenico Ursino

h-index: 30

3 papers

2,796 citations

3 Papers

16.2 CL Jul 16

Latent Trajectory Discrimination for AI-Generated Text Detection

Gianluca Bonifazi, Christopher Buratti, Michele Marchetti et al.

Most existing approaches to AI-Generated Text Detection (AIGTD) treat documents as static objects and base their decisions on aggregate statistics or globally compressed embeddings. However, this perspective overlooks the inherently dynamic nature of autoregressive generation, where content evolves progressively through the latent space. In this paper, we reformulate AIGTD as the problem of distinguishing between latent generation trajectories. Instead of relying on static representations, we model how textual representations evolve across the sequence. To this end, we propose Geometric Trajectory and Contrastive Learning (GTCL), a framework that segments the document into ordered local units, encodes each unit in an embedding space, and constructs a structured, sequence-level representation. GTCL then applies contrastive learning to these trajectories to learn geometric regularities associated with autoregressive generation. Evaluations on three benchmarks against several approaches show that GTCL consistently outperforms detection baselines, which suggests that explicitly modeling sequential dynamics provides robust discriminative signals across models and domains. These results suggest that modeling trajectory differences could improve detection and open up a dynamic direction that has been underexplored in previous AIGTD literature.
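
The trajectory idea in this abstract (segment the document into ordered units, embed each unit, then look at how the embeddings move) can be illustrated with a minimal sketch. This is not the GTCL implementation: the hashed bag-of-words `embed` function is a toy stand-in for a real encoder, the names `segment`, `trajectory`, and `step_lengths` are hypothetical, and the contrastive learning stage is omitted entirely.

```python
import hashlib
import math


def segment(text, unit_size=5):
    """Split text into ordered local units of up to `unit_size` words."""
    words = text.split()
    return [words[i:i + unit_size] for i in range(0, len(words), unit_size)]


def embed(unit, dim=8):
    """Toy stand-in encoder (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for word in unit:
        digest = hashlib.md5(word.lower().encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def trajectory(text, unit_size=5, dim=8):
    """Ordered sequence of unit embeddings for one document."""
    return [embed(unit, dim) for unit in segment(text, unit_size)]


def step_lengths(traj):
    """Euclidean distance between consecutive trajectory points."""
    return [math.dist(a, b) for a, b in zip(traj, traj[1:])]
```

Under these assumptions, a sequence-level model would consume the ordered list returned by `trajectory`; `step_lengths` is just one simple geometric summary of how far the representation moves from one unit to the next.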

20.9 CL Jun 29

Efficient Retrieval-Augmented Generation via Token Co-occurrence Graphs

Gianluca Bonifazi, Christopher Buratti, Michele Marchetti et al.

Retrieval-Augmented Generation (RAG) mitigates hallucinations in Large Language Models (LLMs) by grounding the generation process in external knowledge. However, standard RAG approaches struggle with multi-hop reasoning. While recent graph-based RAG methods improve the retrieval of interconnected chunks, they often rely on computationally expensive and error-prone LLM-based extraction pipelines. To address these issues, we propose TIGRAG (Token-Induced GraphRAG), an efficient graph-augmented RAG framework based on a token co-occurrence Knowledge Graph. TIGRAG directly models topological relationships between tokens using sliding-window co-occurrence statistics, thus enabling scalable graph construction. During inference, it combines graph-based semantic expansion and neural reranking to retrieve interconnected evidence for multi-hop reasoning. Specifically, it introduces an iterative entity-driven retrieval strategy that progressively expands the query using bridging entities extracted from previously retrieved contexts. We evaluate TIGRAG on three widely adopted multi-hop Question Answering (QA) benchmarks. Experimental results show that our framework consistently outperforms dense retrieval and graph-based RAG methods in both retrieval and downstream QA tasks, while substantially reducing indexing time, inference latency, and prompt footprint.
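
The sliding-window co-occurrence graph mentioned in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not TIGRAG itself: the function names `cooccurrence_graph` and `expand`, the whitespace tokenisation, and the raw-count edge weights are all hypothetical choices, and the neural reranking and iterative entity-driven retrieval steps are left out.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=3):
    """Weighted undirected graph over tokens.

    Edge weight = number of times two distinct tokens fall inside the same
    sliding window of `window` consecutive positions (assumed weighting).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, left in enumerate(tokens):
        # Pair the token with the next (window - 1) tokens.
        for right in tokens[i + 1:i + window]:
            if left != right:
                graph[left][right] += 1
                graph[right][left] += 1
    return graph


def expand(graph, query_tokens, top_k=2):
    """Return the top_k graph neighbours most strongly tied to the query."""
    scores = defaultdict(int)
    for tok in query_tokens:
        for nbr, weight in graph.get(tok, {}).items():
            if nbr not in query_tokens:
                scores[nbr] += weight
    # Sort by descending weight, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
```

Because the graph is built from counts alone, construction needs a single pass over the token stream and no LLM calls, which is the kind of cost saving the abstract attributes to this design. For example, `expand(cooccurrence_graph("paris is the capital of france".split(), window=2), ["paris"], top_k=1)` yields the single adjacent token `is`.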

1.2 DB Jul 10, 2014

XML Matchers: approaches and challenges

Santa Agreste, Pasquale De Meo, Emilio Ferrara et al.

Schema Matching, i.e., the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in the Database and Artificial Intelligence research areas for many years. In the past, it was investigated mainly for classical database models (e.g., E/R schemas, relational databases, etc.). However, in recent years, the widespread adoption of XML across disparate application fields has pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them to DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs affect the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We regard our XML Matcher Template as a baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.