Thien Le

h-index17

7papers

76citations

Novelty49%

AI Score49

Ranked #47,672 of 201,326 authors (top 24%)#10,872 in LG (top 26%)

7 Papers

LGJun 2

Contrastive Neural Algorithmic Reasoning for Graph Coloring

Thien Le, Tianyu Zhao, Melanie Weber

Graph coloring seeks to assigns colors to a graph's nodes so that adjacent nodes receive different colors, using as few colors as possible. Here, we study approximate $k$-coloring, where the goal is to use at most $k$ colors while minimizing the number of monochromatic edges. This problem is central to graph theory and has applications in areas such as scheduling and resource allocation. Recent unsupervised GNN approaches optimize each instance directly, precluding generalization across graph sizes and distributions. We instead propose a contrastive learning framework that learns transferable coloring geometry where the embeddings of same-color nodes align, while adjacent nodes' representations are pushed toward distinct directions. We analyze the resulting population objective over bounded-size graphs. For unit-norm embeddings, we show that its optima have a line-prototype structure: Representations of nodes of the same color collapse to a shared one-dimensional subspace, and edges connect orthogonal subspaces. This geometry yields stationarity conditions in the supervised setting and is preserved by projected subgradient dynamics under a balanced-coloring assumption. In an unnormalized variant, gradient descent has a max-margin bias governed by a quotient-graph hard-margin problem. Experiments on synthetic and real-world graphs show that contrastive GNN encoders generalize effectively and produce low-conflict colorings, matching and sometimes improving on greedy approaches.

LGJun 7, 2023

Limits, approximation and size transferability for GNNs on sparse graphs via graphops

Thien Le, Stefanie Jegelka

Can graph neural networks generalize to graphs that are different from the graphs they were trained on, e.g., in size? In this work, we study this question from a theoretical perspective. While recent work established such transferability and approximation results via graph limits, e.g., via graphons, these only apply non-trivially to dense graphs. To include frequently encountered sparse graphs such as bounded-degree or power law graphs, we take a perspective of taking limits of operators derived from graphs, such as the aggregation operation that makes up GNNs. This leads to the recently introduced limit notion of graphops (Backhausz and Szegedy, 2022). We demonstrate how the operator perspective allows us to develop quantitative bounds on the distance between a finite GNN and its limit on an infinite graph, as well as the distance between the GNN on graphs of different sizes that share structural properties, under a regularity assumption verified for various graph sequences. Our results hold for dense and sparse graphs, and various notions of graph limits.

LGNov 17, 2023

A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs

Thien Le, Luana Ruiz, Stefanie Jegelka

Large-scale graph machine learning is challenging as the complexity of learning models scales with the graph size. Subsampling the graph is a viable alternative, but sampling on graphs is nontrivial as graphs are non-Euclidean. Existing graph sampling techniques require not only computing the spectra of large matrices but also repeating these computations when the graph changes, e.g., grows. In this paper, we introduce a signal sampling theory for a type of graph limit -- the graphon. We prove a Poincaré inequality for graphon signals and show that complements of node subsets satisfying this inequality are unique sampling sets for Paley-Wiener spaces of graphon signals. Exploiting connections with spectral clustering and Gaussian elimination, we prove that such sampling sets are consistent in the sense that unique sampling sets on a convergent graph sequence converge to unique sampling sets on the graphon. We then propose a related graphon signal sampling algorithm for large graphs, and demonstrate its good empirical performance on graph machine learning tasks.

LGMay 19

Towards Distillation Guarantees under Algorithmic Alignment for Combinatorial Optimization

Thien Le, Melanie Weber

Distillation transfers knowledge from a large model trained on broad data to a smaller, more efficient model suitable for deployment. In structured prediction settings, prior knowledge about the task can guide the choice of a target architecture that is algorithmically aligned with the underlying problem. Building on recent learning-theoretic analyses of decision-tree (DT) distillation (Boix-Adsera, 2024), we study when distillation succeeds for combinatorial optimization tasks. We focus on the case where the target model is a graph neural network whose architecture is aligned with a dynamic programming (DP) algorithm for the task. Assuming that the source model is sufficiently rich, formalized through the linear representation hypothesis (LRH) (Elhage et al., 2022; Park et al., 2024), we show that the distillation problem can be solved efficiently in the complexity parameters of the DP transition function, represented as a DT. Our results provide a rigorous sufficient condition for successful distillation in the flavour of algorithmic alignment.

HCMar 30

Improving motor imagery decoding methods for an EEG-based mobile brain-computer interface in the context of the 2024 Cybathlon

Isabel Whiteley Tscherniak, Niels Christopher Thiemann, Ana McWhinnie-Fernández et al.

Motivated by the Cybathlon 2024 competition, we developed a modular, online EEG-based brain-computer interface to address these challenges, increasing accessibility for individuals with severe mobility impairments. Our system uses three mental and motor imagery classes to control up to five control signals. The pipeline consists of four modules: data acquisition, preprocessing, classification, and the transfer function to map classification output to control dimensions. We use three diagonalized structured state-space sequence layers as a deep learning classifier. We developed a training game for our pilot where the mental tasks control the game during quick-time events. We implemented a mobile web application for live user feedback. The components were designed with a human-centred approach in collaboration with the tetraplegic user. We achieve up to 84% classification accuracy in offline analysis using an S4D-layer-based model. In a competition setting, our pilot successfully completed one task; we attribute the reduced performance in this context primarily to factors such as stress and the challenging competition environment. Following the Cybathlon, we further validated our pipeline with the original pilot and an additional participant, achieving a success rate of 73% in real-time gameplay. We also compare our model to the EEGEncoder, which is slower in training but has a higher performance. The S4D model outperforms the reference machine learning models. We provide insights into developing a framework for portable BCIs, bridging the gap between the laboratory and daily life. Specifically, our framework integrates modular design, real-time data processing, user-centred feedback, and low-cost hardware to deliver an accessible and adaptable BCI solution, addressing critical gaps in current BCI applications.

LGJan 3, 2024

On the hardness of learning under symmetries

Bobak T. Kiani, Thien Le, Hannah Lawrence et al.

We study the problem of learning equivariant neural networks via gradient descent. The incorporation of known symmetries ("equivariance") into neural nets has empirically improved the performance of learning pipelines, in domains ranging from biology to computer vision. However, a rich yet separate line of learning theoretic research has demonstrated that actually learning shallow, fully-connected (i.e. non-symmetric) networks has exponential complexity in the correlational statistical query (CSQ) model, a framework encompassing gradient descent. In this work, we ask: are known problem symmetries sufficient to alleviate the fundamental hardness of learning neural nets with gradient descent? We answer this question in the negative. In particular, we give lower bounds for shallow graph neural networks, convolutional networks, invariant polynomials, and frame-averaged networks for permutation subgroups, which all scale either superpolynomially or exponentially in the relevant input dimension. Therefore, in spite of the significant inductive bias imparted via symmetry, actually learning the complete classes of functions represented by equivariant neural networks via gradient descent remains hard.

LGJan 28, 2022

Training invariances and the low-rank phenomenon: beyond linear networks

Thien Le, Stefanie Jegelka

The implicit bias induced by the training of neural networks has become a topic of rigorous study. In the limit of gradient flow and gradient descent with appropriate step size, it has been shown that when one trains a deep linear network with logistic or exponential loss on linearly separable data, the weights converge to rank-1 matrices. In this paper, we extend this theoretical result to the last few linear layers of the much wider class of nonlinear ReLU-activated feedforward networks containing fully-connected layers and skip connections. Similar to the linear case, the proof relies on specific local training invariances, sometimes referred to as alignment, which we show to hold for submatrices where neurons are stably-activated in all training examples, and it reflects empirical results in the literature. We also show this is not true in general for the full matrix of ReLU fully-connected layers. Our proof relies on a specific decomposition of the network into a multilinear function and another ReLU network whose weights are constant under a certain parameter directional convergence.