Javier Perez-Robles

2papers

2 Papers

9.3AIMay 7

From Token Lists to Graph Motifs: Weisfeiler-Lehman Analysis of Sparse Autoencoder Features

Ruben Fernandez-Boullon, Pablo Magariños-Docampo, Javier Perez-Robles

Sparse autoencoders (SAEs) have become central to mechanistic interpretability, decomposing transformer activations into monosemantic features. Yet existing analyses characterise features almost exclusively through top-activating token lists or decoder weight vectors, leaving the higher-order co-occurrence structure shared across features largely unexamined. We introduce a graph-structured representation in which each SAE feature is modelled as a token co-occurrence graph: nodes are the tokens most frequent near strong activations, and edges connect pairs that co-occur within local context windows. A custom WL-style, frequency-binned graph kernel then provides a similarity measure over this structural space. Applied as a proof of concept to features from a large SAE trained on GPT-2 Small and probed with a synthetic mixed-domain corpus, our clustering recovers heuristic motif families (punctuation-heavy patterns, language and script clusters, and code-like templates) that are not recovered by clustering on decoder cosine similarity. A token-histogram baseline achieves higher overall purity, so the contribution of the graph view is complementary rather than dominant: it surfaces structural relationships that token-frequency and decoder-weight views alone do not capture. Cluster assignments are stable across graph-construction hyperparameters and random seeds.

3.5ROMar 15

A Modular Architecture Design for Autonomous Driving Racing in Controlled Environments

Brais Fontan-Costas, M. Diaz-Cacho, Ruben Fernandez-Boullon et al.

This paper presents a modular autonomous driving architecture for Formula Student Driverless competition vehicles operating in closed-circuit environments. The perception module employs YOLOv11 for real-time traffic cone detection, achieving 0.93 mAP@0.5 on the FSOCO dataset, combined with neural stereo depth estimation from a ZED 2i camera for 3D cone localization with sub-0.5 m median error at distances up to 7 m. State estimation fuses RTK-GNSS positioning and IMU measurements through an Extended Kalman Filter (EKF) based on a kinematic bicycle model, achieving centimeter-level localization accuracy with a 12 cm improvement over raw GNSS. Path planning computes the racing line via cubic spline interpolation on ordered track boundaries and assigns speed profiles constrained by curvature and vehicle dynamics. A regulated pure pursuit controller tracks the planned trajectory with a dynamic lookahead parameterized by speed error. The complete pipeline is implemented as a modular ROS 2 architecture on an NVIDIA Jetson Orin NX platform, with each subsystem deployed as independent nodes communicating through a dual-computer configuration. Experimental validation combines real-world sensor evaluation with simulation-based end-to-end testing, where realistic sensor error distributions are injected to assess system-level performance under representative conditions.