SY · May 19, 2016
A benchmark for data-based office modeling: challenges related to CO$_2$ dynamics
Riccardo Sven Risuleo, Marco Molinari, Giulio Bottegal et al.
This paper describes a benchmark consisting of a set of synthetic measurements relative to an office environment simulated with the software IDA-ICE. The simulated environment reproduces a laboratory at the KTH-EES Smart Building, equipped with a building management system. The data set contains records collected over a period of several days. The signals correspond to CO$_2$ concentration, mechanical ventilation airflows, air infiltrations and occupancy. Information on door and window opening is also available. This benchmark is intended for testing data-based modeling techniques. The ultimate goal is the development of models to improve the forecast and control of environmental variables. Among the numerous challenges related to this framework, we point out the problem of occupancy estimation using information on CO$_2$ concentration. This can be seen as a blind identification problem. For benchmarking purposes, we present two different identification approaches: a baseline overparametrization method and a kernel-based method.
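To make the link between CO$_2$ concentration and occupancy concrete, here is a minimal sketch based on a textbook well-mixed single-zone mass balance. It is not the benchmark's IDA-ICE model nor either identification method from the paper. The room volume, outdoor concentration, per-person generation rate, and time step are illustrative assumptions, and the inversion assumes these parameters are known, which is exactly what the blind identification setting does not grant.

```python
def simulate_co2(occupancy, airflow, volume=60.0, c_out=400.0, gen=5e-6, dt=60.0):
    """Forward-Euler simulation of a well-mixed zone (all defaults are assumed values).

    occupancy: people present at each step
    airflow:   ventilation airflow at each step [m^3/s]
    volume:    room volume [m^3]; c_out: outdoor CO2 [ppm]
    gen:       CO2 generated per person [m^3/s]; dt: step length [s]
    """
    c = [c_out]
    for n, q in zip(occupancy, airflow):
        # dc/dt = (Q/V) * (c_out - c) + g * n * 1e6 / V, with c in ppm
        dc = (q / volume) * (c_out - c[-1]) + gen * n * 1e6 / volume
        c.append(c[-1] + dt * dc)
    return c


def estimate_occupancy(co2, airflow, volume=60.0, c_out=400.0, gen=5e-6, dt=60.0):
    """Invert the same balance for occupancy, assuming known parameters."""
    estimates = []
    for k, q in enumerate(airflow):
        dc = (co2[k + 1] - co2[k]) / dt
        source = dc - (q / volume) * (c_out - co2[k])  # ppm/s attributed to people
        estimates.append(source * volume / (gen * 1e6))
    return estimates


# Synthetic example: empty room, two occupants for 30 minutes, empty again.
occupancy = [0] * 10 + [2] * 30 + [0] * 10
airflow = [0.02] * len(occupancy)
co2 = simulate_co2(occupancy, airflow)
recovered = estimate_occupancy(co2, airflow)
```

With noise-free data and known parameters the inversion is exact; the difficulty highlighted in the abstract arises because the dynamics and the occupancy input are both unknown.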
SY · Apr 16
Generalizability of Learning-based Occupancy Detection in Residential Buildings
Mahsa Farjadnia, Katayoun Eshkofti, Albin Apell et al.
This paper investigates non-intrusive occupancy detection methods for residential buildings using environmental sensor data from the KTH Live-In Lab in Stockholm, Sweden. Three machine learning approaches, namely logistic regression (LR), support vector machines (SVM), and a long short-term memory (LSTM) network enhanced with an attention mechanism, are evaluated in terms of predictive performance and computational complexity. The analysis considers the trade-off between sensor availability (investment cost) and prediction accuracy in real applications, as well as the models' cross-apartment generalizability. Hyperparameters for both the SVM and LSTM models are optimized using Bayesian optimization. All three models are evaluated on data collected from apartments not used during training, and on data generated from a calibrated digital model of the testbed. Results show that all models achieve comparable performance on the same-apartment test data (accuracy of approximately 0.83, F1 score of approximately 0.86). When assessed on cross-apartment data, the LSTM model demonstrates the strongest generalization capability (accuracy of 0.84, F1 score of 0.85), while LR provides a competitive, low-complexity alternative for applications that do not require cross-apartment generalization.
CL · Oct 28, 2024
Group-SAE: Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups
Davide Ghilardi, Federico Belotti, Marco Molinari et al.
Sparse Autoencoders (SAEs) have recently been employed as a promising unsupervised approach for understanding the representations of layers of Large Language Models (LLMs). However, with the growth in model size and complexity, training SAEs is computationally intensive, as typically one SAE is trained for each model layer. To address this limitation, we propose \textit{Group-SAE}, a novel strategy to train SAEs. Our method considers the similarity of the residual stream representations between contiguous layers to group similar layers and train a single SAE per group. To balance the trade-off between efficiency and performance, we further introduce \textit{AMAD} (Average Maximum Angular Distance), an empirical metric that guides the selection of an optimal number of groups based on representational similarity across layers. Experiments on models from the Pythia family show that our approach significantly accelerates training with minimal impact on reconstruction quality and comparable downstream task performance and interpretability over baseline SAEs trained layer by layer. This method provides an efficient and scalable strategy for training SAEs in modern LLMs.
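The idea of grouping contiguous layers by angular similarity can be illustrated with a small sketch. This is a simple greedy stand-in, not the paper's grouping procedure or its AMAD metric: it assumes one summary vector per layer (for example a mean residual-stream activation), measures the normalized angular distance between adjacent layers, and cuts at the largest gaps. The function names and the toy vectors are hypothetical.

```python
import math


def angular_distance(u, v):
    """Angle between two vectors, normalized to [0, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.acos(cos) / math.pi


def group_contiguous_layers(layer_vectors, num_groups):
    """Split layers 0..L-1 into contiguous groups at the largest adjacent gaps."""
    gaps = [
        angular_distance(layer_vectors[i], layer_vectors[i + 1])
        for i in range(len(layer_vectors) - 1)
    ]
    # The (num_groups - 1) largest gaps become group boundaries.
    largest = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)
    cut_after = sorted(largest[: num_groups - 1])
    groups, start = [], 0
    for i in cut_after:
        groups.append(list(range(start, i + 1)))
        start = i + 1
    groups.append(list(range(start, len(layer_vectors))))
    return groups


# Toy per-layer summary vectors (hypothetical, 2-D for readability).
layers = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99], [-1.0, 0.05]]
groups = group_contiguous_layers(layers, 3)
```

Under this scheme one SAE would then be trained per group rather than per layer; how the number of groups is chosen is the question the abstract's AMAD metric addresses.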
LG · Sep 29, 2025
Emergent World Representations in OpenVLA
Marco Molinari, Leonardo Nevali, Saharsha Navani et al.
Vision Language Action models (VLAs) trained with policy-based reinforcement learning (RL) encode complex behaviors without explicitly modeling environmental dynamics. However, it remains unclear whether VLAs implicitly learn world models, a hallmark of model-based RL. We propose an experimental methodology using embedding arithmetic on state representations to probe whether OpenVLA, the current state of the art in VLAs, contains latent knowledge of state transitions. Specifically, we measure the difference between embeddings of sequential environment states and test whether this transition vector is recoverable from intermediate model activations. Using linear and nonlinear probes trained on activations across layers, we find statistically significant predictive ability on state transitions exceeding baselines (embeddings), indicating that OpenVLA encodes an internal world model (as opposed to the probes learning the state transitions). We investigate the predictive ability of an earlier checkpoint of OpenVLA, and uncover hints that the world model emerges as training progresses. Finally, we outline a pipeline leveraging Sparse Autoencoders (SAEs) to analyze OpenVLA's world model.
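The probing setup described above, predicting the transition vector (next-state embedding minus current-state embedding) from activations, can be sketched with a least-squares linear probe. Everything below is synthetic: the activations, embeddings, and dimensions are made-up stand-ins, and the data is constructed so that the transition is linearly decodable by design. The sketch shows the mechanics of fitting and scoring a probe and says nothing about OpenVLA itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_act, d_emb = 200, 16, 4  # hypothetical sample count and dimensions

# Synthetic stand-ins for model activations and state embeddings.
activations = rng.normal(size=(n, d_act))
true_map = rng.normal(size=(d_act, d_emb))
emb_t = rng.normal(size=(n, d_emb))
emb_next = emb_t + activations @ true_map + 0.01 * rng.normal(size=(n, d_emb))

# Embedding arithmetic: the probe target is the transition vector.
transition = emb_next - emb_t

# Fit a linear probe on a training split, evaluate on held-out samples.
train, test = slice(0, 150), slice(150, n)
W, *_ = np.linalg.lstsq(activations[train], transition[train], rcond=None)
pred = activations[test] @ W

# Coefficient of determination, pooled over embedding dimensions.
ss_res = ((transition[test] - pred) ** 2).sum()
ss_tot = ((transition[test] - transition[test].mean(axis=0)) ** 2).sum()
r2 = 1.0 - ss_res / ss_tot
```

In a real study the held-out score would be compared against a baseline, such as a probe trained on the input embeddings alone, to separate what the model encodes from what the probe can learn by itself.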
LG · May 18, 2025
Fixed Point Explainability
Emanuele La Malfa, Jon Vadillo, Marco Molinari et al. · oxford
This paper introduces a formal notion of fixed point explanations, inspired by the "why regress" principle, to assess, through recursive applications, the stability of the interplay between a model and its explainer. Fixed point explanations satisfy properties like minimality, stability, and faithfulness, revealing hidden model behaviours and explanatory weaknesses. We define convergence conditions for several classes of explainers, from feature-based to mechanistic tools like Sparse AutoEncoders, and we report quantitative and qualitative results for several datasets and models, including LLMs such as Llama-3.3-70B.
CL · Dec 3, 2024
Interpretable Company Similarity with Sparse Autoencoders
Marco Molinari, Victor Shao, Luca Imeneo et al.
Determining company similarity is a vital task in finance, underpinning risk management, hedging, and portfolio diversification. Practitioners often rely on sector and industry classifications such as SIC and GICS codes to gauge similarity, the former being used by the U.S. Securities and Exchange Commission (SEC), and the latter widely used by the investment community. Since these classifications lack granularity and need regular updating, using clusters of embeddings of company descriptions has been proposed as a potential alternative, but the lack of interpretability in token embeddings poses a significant barrier to adoption in high-stakes contexts. Sparse Autoencoders (SAEs) have shown promise in enhancing the interpretability of Large Language Models (LLMs) by decomposing LLM activations into interpretable features. Moreover, SAEs capture an LLM's internal representation of a company description, as opposed to semantic similarity alone, as is the case with embeddings. We apply SAEs to company descriptions, and obtain meaningful clusters of equities. We benchmark SAE features against SIC codes, industry codes, and embeddings. Our results demonstrate that SAE features surpass sector classifications and embeddings in capturing fundamental company characteristics. This is evidenced by their superior performance in correlating logged monthly returns, a proxy for similarity, and generating higher Sharpe ratios in cointegration trading strategies, which underscores deeper fundamental similarities among companies. Finally, we verify the interpretability of our clusters, and demonstrate that sparse features form simple and interpretable explanations for our clusters.