Claudio Altafini

LG
h-index: 29
8 papers
4 citations
Novelty: 33%
AI Score: 46

8 Papers

LG May 25
Analogies between Transformer Layers and Power Method

Chenglong Li, Claudio Altafini

In this paper we show that there is an analogy between the operations occurring in a layer of a transformer (projections and layer normalizations, disregarding the feedforward neural network) and a step of the power method. Consistent with this analogy, we show that when passing through a layer the tokens tend to be tilted towards the principal eigenvector of a matrix which is the product of the output and value weight matrices of that layer. In the special case of a transformer with shared weights (i.e., in which all layers have identical weights), the alignment with this principal eigenvector is particularly evident empirically, and can also be shown analytically. The analogy also suggests a method to steer the output of the transformer towards an arbitrary desired direction in token space.
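As a rough illustration of the power method that the abstract refers to, the sketch below repeatedly multiplies a vector by a matrix and renormalizes it. This is the textbook algorithm, not the construction used in the paper. The matrix `M` is a hypothetical stand-in for the product of the output and value weight matrices, and the normalization step only loosely plays the role of layer normalization.

```python
import math

def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def normalize(v):
    """Rescale v to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def power_method(M, v0, steps=100):
    """Textbook power iteration: multiply by M, then renormalize."""
    v = normalize(v0)
    for _ in range(steps):
        v = normalize(mat_vec(M, v))
    return v

# Hypothetical 2x2 example (an assumption, not taken from the paper).
# Eigenvalues are 3 and 1; the principal eigenvector is (1, 1) / sqrt(2).
M = [[2.0, 1.0], [1.0, 2.0]]
v = power_method(M, [1.0, 0.0])
print(v)
```

Each iteration shrinks the component along the non-principal eigenvector by the ratio of the eigenvalues (here 1/3), which is why the vector tilts towards the principal direction.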

SY Apr 16
Minimal Input Cardinality Disturbance Decoupling of Coupled Oscillators via Output Feedback with Application to Power Networks

Luca Claude Gino Lebon, Johan Lindberg, Claudio Altafini

In this paper, we identify the smallest set of control input nodes and an associated output feedback law that achieves complete disturbance decoupling for a class of coupled oscillator networks. The focus is specifically on systems linearized around a stable phase-locked synchronized state. The proposed theoretical framework is applied to the linearized swing dynamics of power grids operating near synchronization. In this context, the disturbance decoupling problem corresponds to isolating subsets of nodes from exogenous disturbances by means of batteries that can both add and withdraw active power. Numerical simulations carried out on the IEEE New England 39-bus system show that the proposed methodology not only yields a minimal actuator placement ensuring effective disturbance rejection, but also preserves the internal stability of the closed-loop system.

OC Mar 15
Geometric Control Theory Over Networks: Minimal Node Cardinality Disturbance Decoupling Problems

Luca Claude Gino Lebon, Claudio Altafini

In this paper we show how to formulate and solve disturbance decoupling problems over networks while choosing a minimal number of input and output nodes. Feedback laws that isolate and eliminate the impact of disturbance nodes on specific target nodes to be protected are provided using state, output, and dynamical feedback. To this end, we leverage the fact that, when reformulated in terms of sets of nodes rather than subspaces, the controlled and conditional invariance properties admit a simple graphical interpretation. For state and dynamical feedback, the minimal input and output cardinality solutions can be computed exactly in polynomial time via min-cut/max-flow algorithms.

SOC-PH Apr 14
Signed DeGroot-Friedkin Dynamics with Interdependent Topics

Yangyang Luan, Muhammad Ahsan Razaq, Xiaoqun Wu et al.

This paper investigates DeGroot-Friedkin (DF) dynamics over signed influence networks with interdependent topics. We propose a multi-topic signed framework that combines repelling interpersonal interactions with cross-issue self-appraisal, examining how antagonism and topic interdependence shape the evolution of agent-level social power. When the logic matrices (for topic interdependence) of all agents share a common dominant left eigenvector, we identify structural conditions under which the original dynamics admit an exact reduction to an explicit scalar DF map. This yields a complete classification of limiting social power configurations into pluralistic, mixed, and vertex-dominant types. In all three cases, the dynamics are globally convergent, and in the first two the ordering induced by the interaction centrality is preserved. We further show local robustness under small heterogeneous perturbations of the logic matrices. We also clarify what changes when this common-eigenvector structure is lost. These results extend signed social power dynamics beyond the standard nonnegative scalar setting and shed light on the robustness and scope of centrality-based social power formation in multi-topic signed influence systems.

LG Nov 13, 2025
Gradient Flow Equations for Deep Linear Neural Networks: A Survey from a Network Perspective

Joel Wendin, Claudio Altafini

The paper surveys recent progress in understanding the dynamics and loss landscape of the gradient flow equations associated with deep linear neural networks, i.e., the gradient descent training dynamics (in the limit when the step size goes to 0) of deep neural networks without activation functions and subject to quadratic loss functions. When formulated in terms of the adjacency matrix of the neural network, as we do in the paper, these gradient flow equations form a class of converging matrix ODEs which is nilpotent, polynomial, isospectral, and with conservation laws. The loss landscape is described in detail. It is characterized by infinitely many global minima and saddle points, both strict and nonstrict, but lacks local minima and maxima. The loss function itself is a positive semidefinite Lyapunov function for the gradient flow, and its level sets are unbounded invariant sets of critical points, with critical values that correspond to the number of singular values of the input-output data learnt by the gradient along a certain trajectory. The adjacency matrix representation we use in the paper makes it possible to highlight the existence of a quotient space structure in which each critical value of the loss function is represented only once, while all other critical points with the same critical value belong to the fiber associated with the quotient space. It also makes it easy to determine stable and unstable submanifolds at the saddle points, even when the Hessian fails to identify them.

SI Apr 25
Quantifying opinion homophily in online social networks: A bounded confidence perspective

Yangyang Luan, Camilla Ancona, Carmela Bernardo et al.

The concept of homophily is pervasive in online social media. While many empirical studies have relied on external sociodemographic traits to investigate it, significantly less is known about homophily at the cognitive level, that is, at the level of shared opinions or values. For such "value homophily", in this paper we study interval-based patterns of opinion homophily from a bounded confidence perspective. We consider three heterogeneous datasets from Reddit and Twitter covering polarizing issues, with user opinions quantified via sentiment analysis and fact-checking, and analyze the interaction networks formed by weaker (reply-based) and stronger (follow-based) social ties. Our findings show that users' interaction neighborhoods are significantly more concentrated in opinion space than expected by chance, with tie strength and issue polarization further amplifying this effect. Moreover, users often exhibit asymmetric tolerance ranges, with asymmetry typically directed toward locally mainstream positions rather than more radical or opposing ones. These findings support a bounded confidence interpretation of online value homophily.

LG Nov 14, 2025
Multistability of Self-Attention Dynamics in Transformers

Claudio Altafini

In machine learning, a self-attention dynamics is a continuous-time multiagent-like model of the attention mechanisms of transformers. In this paper we show that such dynamics is related to a multiagent version of the Oja flow, a dynamical system that computes the principal eigenvector of a matrix which, for transformers, corresponds to the value matrix. We classify the equilibria of the "single-head" self-attention system into four classes: consensus, bipartite consensus, clustering and polygonal equilibria. Multiple asymptotically stable equilibria from the first three classes often coexist in the self-attention dynamics. Interestingly, equilibria from the first two classes are always aligned with the eigenvectors of the value matrix, often but not exclusively with the principal eigenvector.
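For readers unfamiliar with the Oja flow, the sketch below integrates the classical single-vector version, dx/dt = A x - (x^T A x) x, with a forward Euler scheme. It is not the multiagent version studied in the paper. The symmetric matrix `A`, the step size `dt`, and the number of steps are illustrative assumptions, with `A` standing in for the value matrix.

```python
import math

def oja_step(A, x, dt):
    """One forward Euler step of the Oja flow dx/dt = A x - (x^T A x) x."""
    Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    q = sum(xi * axi for xi, axi in zip(x, Ax))  # Rayleigh-type term x^T A x
    return [xi + dt * (axi - q * xi) for xi, axi in zip(x, Ax)]

def oja_flow(A, x0, dt=0.01, steps=5000):
    """Integrate the flow from x0; dt and steps are illustrative choices."""
    x = list(x0)
    for _ in range(steps):
        x = oja_step(A, x, dt)
    return x

# Hypothetical symmetric matrix (an assumption, not from the paper).
# Eigenvalues are 3 and 1; the principal eigenvector is (1, 1) / sqrt(2).
A = [[2.0, 1.0], [1.0, 2.0]]
x = oja_flow(A, [1.0, 0.0])
print(x)
```

For a symmetric matrix with a positive, simple largest eigenvalue, the unit sphere is invariant and attracting, and trajectories not orthogonal to the principal eigenvector converge to it up to sign.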

LG Oct 6, 2025
Computing frustration and near-monotonicity in deep neural networks

Joel Wendin, Erik G. Larsson, Claudio Altafini

For the signed graph associated with a deep neural network, one can compute the frustration level, i.e., test how close or distant the graph is to structural balance. For all the pretrained deep convolutional neural networks we consider, we find that the frustration is always lower than expected from null models. From a statistical physics point of view, and in particular in reference to an Ising spin glass model, the reduced frustration indicates that the amount of disorder encoded in the network is less than in the null models. From a functional point of view, low frustration (i.e., proximity to structural balance) means that the function representing the network behaves near-monotonically, i.e., more similarly to a monotone function than in the null models. Evidence of near-monotonic behavior along the partial order determined by frustration is observed for all networks we consider. This confirms that the class of deep convolutional neural networks tends to have a more ordered behavior than expected from null models, and suggests a novel form of implicit regularization.
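To make the notion of frustration concrete, the sketch below computes it by brute force for a tiny signed graph: the minimum, over all assignments of +1/-1 "spins" to the nodes, of the number of edges whose sign disagrees with the product of their endpoint spins. This exhaustive search is only feasible for a handful of nodes and is not the method used in the paper. The function name `frustration` and the edge-list format are assumptions made for illustration.

```python
from itertools import product

def frustration(n, edges):
    """Brute-force frustration of a signed graph.

    n     -- number of nodes, labeled 0..n-1
    edges -- list of (i, j, sign) tuples with sign in {+1, -1}
    Returns the minimum number of edges violated over all spin assignments.
    """
    best = len(edges)
    for spins in product((1, -1), repeat=n):
        # An edge is violated when sign * s_i * s_j is negative.
        violated = sum(1 for i, j, s in edges if s * spins[i] * spins[j] < 0)
        best = min(best, violated)
    return best

# Triangle with one negative edge: not structurally balanced.
unbalanced = [(0, 1, 1), (1, 2, 1), (0, 2, -1)]
# Triangle with two negative edges: structurally balanced.
balanced = [(0, 1, -1), (1, 2, -1), (0, 2, 1)]
print(frustration(3, unbalanced), frustration(3, balanced))
```

A frustration of zero means the graph is structurally balanced; larger values indicate a greater distance from balance.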